dhgrs / chainer-VQ-VAE

A Chainer implementation of VQ-VAE.

vq loss raises #7

Closed barby1138 closed 5 years ago

barby1138 commented 6 years ago

[plot: loss3]

Hi, I'm trying to play with VQ-VAE and I see its loss rising. Can you suggest anything? I don't observe such behaviour in your diagrams.

dhgrs commented 6 years ago

Hi,

Does the loss keep rising? Loss2 and loss3 are regularisation terms, so they sometimes rise. Also, the y-axis in my results is on a very large scale; they show the same behaviour.

Please keep training for about 100k iterations and check the generated results.
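For reference, a reasonable reading (an assumption on my part, but consistent with the beta = 0.25 in the params posted later in this thread) is that the three reported losses are the three terms of the VQ-VAE objective from van den Oord et al. (2017):

$$
\mathcal{L} = \underbrace{-\log p(x \mid z_q(x))}_{\text{loss1: reconstruction}}
\;+\; \underbrace{\lVert \mathrm{sg}[z_e(x)] - e \rVert_2^2}_{\text{loss2: codebook}}
\;+\; \underbrace{\beta\,\lVert z_e(x) - \mathrm{sg}[e] \rVert_2^2}_{\text{loss3: commitment}}
$$

where sg[·] is the stop-gradient operator. Only loss1 measures reconstruction quality; loss2 and loss3 just pull the encoder outputs and the codebook together, so transient increases are normal.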

barby1138 commented 6 years ago

Thanks for the reply - it seems to have stabilized. Also, while generating from the 60k model I still hear noise - is that OK?

[plots: loss2, loss3]

dhgrs commented 6 years ago

How about loss1? If loss1 is lower than about 2.2 (8-bit = 256-level quantization), the model generates audible results in my experience.
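For context on that 2.2 figure (assuming loss1 is the softmax cross-entropy over the 256 mu-law classes, measured in nats; the repo may report a different base), a uniform prediction scores ln 256, which also matches the 4~5 starting values mentioned later in this thread:

```python
import math

# worst case: cross-entropy of a uniform prediction over 256 classes (nats)
print(math.log(256))  # 5.545...
# loss1 of about 2.2 nats corresponds to roughly e**2.2 ~= 9 effective classes
print(math.exp(2.2))  # 9.02...
```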

barby1138 commented 6 years ago

loss1 seems OK:

[plot: loss1]

Some params of interest:

batchsize = 4
lr = 2e-4
dataset_type = 'VCTK'
sr = 16000
quantize = 256
length = 7680

dhgrs commented 6 years ago

That seems wrong - loss1 is too small. Please tell me which commit you are using, your environment, and your training parameters.

barby1138 commented 6 years ago

It's the latest source: commit 77be4eb on Aug 22, from the audio branch.

pip list output (Package Version):
absl-py 0.2.2
alabaster 0.7.10
anaconda-client 1.6.14
anaconda-navigator 1.8.2
anaconda-project 0.8.0
asn1crypto 0.22.0
astor 0.6.2
astroid 1.5.3
astropy 2.0.2
audioread 2.1.5
Babel 2.5.0
backports-abc 0.5
backports.functools-lru-cache 1.4
backports.shutil-get-terminal-size 1.0.0
backports.ssl-match-hostname 3.5.0.1
backports.weakref 1.0.post1
beautifulsoup4 4.6.0
bitarray 0.8.1
bkcharts 0.2
blaze 0.11.3
bleach 1.5.0
bokeh 0.12.10
boto 2.48.0
boto3 1.7.4
botocore 1.10.4
Bottleneck 1.2.1
bz2file 0.98
cdecimal 2.3
certifi 2018.1.18
cffi 1.10.0
chainer 5.0.0rc1
chardet 3.0.4
click 6.7
cloudpickle 0.4.0
clyent 1.2.2
colorama 0.3.9
conda 4.4.6
conda-build 3.0.27
conda-verify 2.0.0
configparser 3.5.0
contextlib2 0.5.5
cryptography 2.0.3
cupy-cuda90 5.0.0rc1
cycler 0.10.0
Cython 0.26.1
cytoolz 0.8.2
dask 0.15.3
datashape 0.5.4
decorator 4.1.2
deepspeech-gpu 0.1.1
distributed 1.19.1
dm-sonnet 1.14
docutils 0.14
entrypoints 0.2.3
enum34 1.1.6
et-xmlfile 1.0.1
fastcache 1.0.2
fastrlock 0.3
filelock 2.0.12
fire 0.1.3
Flask 0.12.2
Flask-Cors 3.0.3
funcsigs 1.0.2
functools32 3.2.3.post2
future 0.16.0
futures 3.2.0
gast 0.2.0
gensim 3.4.0
gevent 1.2.2
glob2 0.5
gmpy2 2.0.8
greenlet 0.4.12
grin 1.2.1
grpcio 1.11.0
h5py 2.7.0
heapdict 1.0.0
html5lib 0.9999999
idna 2.6
imageio 2.2.0
imagesize 0.7.1
intervaltree 2.1.0
ipaddress 1.0.18
ipykernel 4.6.1
ipython 5.4.1
ipython-genutils 0.2.0
ipywidgets 7.0.0
isort 4.2.15
itsdangerous 0.24
jdcal 1.3
jedi 0.10.2
Jinja2 2.9.6
jmespath 0.9.3
joblib 0.11
jsonschema 2.6.0
jupyter-client 5.1.0
jupyter-console 5.2.0
jupyter-core 4.3.0
jupyterlab 0.27.0
jupyterlab-launcher 0.4.0
lazy-object-proxy 1.3.1
librosa 0.5.1
llvmlite 0.20.0
locket 0.2.0
lxml 4.1.0
magenta-gpu 0.3.5
Markdown 2.6.11
MarkupSafe 1.0
matplotlib 2.1.0
mccabe 0.6.1
mido 1.2.6
mir-eval 0.4
mistune 0.7.4
mock 2.0.0
mpmath 0.19
msgpack-python 0.4.8
multipledispatch 0.4.9
navigator-updater 0.1.0
nbconvert 5.3.1
nbformat 4.4.0
networkx 2.0
nltk 3.2.4
nose 1.3.7
notebook 5.0.0
numba 0.35.0+10.g143f70e90.dirty
numexpr 2.6.2
numpy 1.14.5
numpydoc 0.7.0
odo 0.5.1
olefile 0.44
openpyxl 2.4.8
packaging 16.8
pandas 0.20.3
pandocfilters 1.4.2
partd 0.3.8
path.py 10.3.1
pathlib 1.0.1
pathlib2 2.3.0
patsy 0.4.1
pbr 4.0.4
pep8 1.7.0
pexpect 4.2.1
pickleshare 0.7.4
Pillow 4.2.1
pip 18.0
pkginfo 1.4.1
ply 3.10
pretty-midi 0.2.8
progressbar 2.5
prompt-toolkit 1.0.15
protobuf 3.6.0
psutil 5.4.0
ptyprocess 0.5.2
py 1.4.34
pyarrow 0.9.0
pycairo 1.13.3
pycodestyle 2.3.1
pycosat 0.6.3
pycparser 2.18
pycrypto 2.6.1
pycurl 7.43.0
pydub 0.22.1
pyflakes 1.6.0
Pygments 2.2.0
pylint 1.7.4
pyodbc 4.0.17
pyOpenSSL 17.2.0
pyparsing 2.2.0
PySocks 1.6.7
pytest 3.2.1
python-dateutil 2.6.1
python-rtmidi 1.1.0
pytz 2017.2
PyWavelets 0.5.2
pyworld 0.2.5
PyYAML 3.12
pyzmq 16.0.2
QtAwesome 0.4.4
qtconsole 4.3.1
QtPy 1.3.1
requests 2.18.4
resampy 0.2.0
rope 0.10.5
ruamel-yaml 0.11.14
s3transfer 0.1.13
scandir 1.6
scikit-image 0.13.0
scikit-learn 0.19.1
scipy 0.19.1
seaborn 0.8
setuptools 40.4.3
simplegeneric 0.8.1
singledispatch 3.4.0.3
six 1.11.0
smart-open 1.5.7
snowballstemmer 1.2.1
sortedcollections 0.5.3
sortedcontainers 1.5.7
SoundFile 0.10.2
Sphinx 1.6.3
sphinxcontrib-websupport 1.0.1
spyder 3.2.4
SQLAlchemy 1.1.13
statsmodels 0.8.0
subprocess32 3.2.7
svgwrite 1.1.6
sympy 1.1.1
tables 3.4.2
tabulate 0.8.2
tb-nightly 1.5.0a20180106
tblib 1.3.2
tensorboard 1.8.0
tensorflow-gpu 1.8.0
tensorflow-hub 0.1.0
tensorpack 0.8.6
termcolor 1.1.0
terminado 0.6
testpath 0.3.1
tf 1.0.0
tf-nightly 1.6.0.dev20180105
toolz 0.8.2
torch 0.4.1
tornado 4.5.2
tqdm 4.23.4
traitlets 4.3.2
typing 3.6.2
unicodecsv 0.14.1
urllib3 1.22
wcwidth 0.1.7
webencodings 0.5.1
Werkzeug 0.14.1
wheel 0.31.1
widgetsnbextension 3.0.2
wrapt 1.10.11
xlrd 1.1.0
XlsxWriter 1.0.2
xlwt 1.3.0
zict 0.1.3

params:

# parameters of training
batchsize = 4
lr = 2e-4
ema_mu = 0.9999
trigger = (250000, 'iteration')
evaluate_interval = (1, 'epoch')
snapshot_interval = (10000, 'iteration')
report_interval = (100, 'iteration')

# parameters of dataset
root = '../VCTK-Corpus'
dataset_type = 'VCTK'
split_seed = 71

# parameters of preprocessing
sr = 16000
res_type = 'kaiser_fast'
top_db = 20
input_dim = 256
quantize = 256
length = 7680
use_logistic = False

# parameters of VQ
d = 512
k = 512

# parameters of Decoder(WaveNet)
n_loop = 3
n_layer = 10
filter_size = 2
input_dim = input_dim
residual_channels = 512
dilated_channels = 512
skip_channels = 256
quantize = quantize
use_logistic = use_logistic
n_mixture = 10 * 3
log_scale_min = -40
global_condition_dim = 128
local_condition_dim = 512
dropout_zero_rate = 0

# parameters of losses
beta = 0.25

# parameters of generating
use_ema = True
apply_dropout = False

barby1138 commented 6 years ago

BTW, I use Python 2 - is that critical?

dhgrs commented 6 years ago

Ummm... strange. I verified it works with Python 3.5.2 and Chainer 4.0.0b3, and I don't know whether my code works in other environments. So can you try with the same environment first? FYI: I use Docker to set up the environment.

An 8-bit (256-level) quantized WaveNet's loss (loss1 in this repo) is about 4~5 at the very beginning of training and goes down to 2~2.5. Something is wrong if the loss is a value as small as yours.

# I edited to fix a markdown syntax error
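It is only a guess that this is what went wrong here, but the classic way Python 2 silently breaks code written for Python 3 is integer division, which can shrink a normalisation factor or a loss average without raising any error:

```python
# Python 2: `/` between ints is floor division; Python 3: true division.
print(1 / 2)     # Python 2 -> 0,  Python 3 -> 0.5
print(100 / 64)  # Python 2 -> 1,  Python 3 -> 1.5625
```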

dhgrs commented 6 years ago

Also, if you have modified any files other than params.py, please tell me.

barby1138 commented 6 years ago

Hey, I've restarted with Python 3.6 and loss1 seems OK now, ~5.

dhgrs commented 6 years ago

That's good! FYI: if your GPU usage is low, please set the -p and -f options, e.g. python3 train.py -g 0 -f 64 -p 2. Details are in my README.
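A plausible (unconfirmed) mapping for those flags is Chainer's MultiprocessIterator, whose n_prefetch and n_processes arguments control exactly this kind of data-loading throughput; a sketch:

```python
from chainer import iterators

# Sketch only: assumes -f -> n_prefetch and -p -> n_processes, and
# `dataset` is the training dataset built in train.py; check train.py
# and the README for the actual wiring.
train_iter = iterators.MultiprocessIterator(
    dataset, batch_size=4, n_processes=2, n_prefetch=64)
```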

barby1138 commented 6 years ago

Thanks for the help!

partha2409 commented 5 years ago

Hi. What is the y-axis range of loss2 and loss3?

dhgrs commented 5 years ago

@partha2409 The y-axis range is set automatically by Chainer, and loss2 and loss3 are very large in the early phase, so the y-axis covers a very large range.
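If (as is likely for a Chainer training loop, though not confirmed here) the plots come from the PlotReport trainer extension, the axis limits are simply matplotlib's autoscaling, so one early spike stretches the whole y-range:

```python
from chainer.training import extensions

# Sketch: `trainer` is the usual chainer.training.Trainer object.
# PlotReport leaves axis limits to matplotlib, hence the huge y-range
# when loss2/loss3 spike early in training.
trainer.extend(extensions.PlotReport(
    ['main/loss2', 'main/loss3'], x_key='iteration',
    file_name='vq_losses.png'))
```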

partha2409 commented 5 years ago

Hi @dhgrs, I am implementing VQ-VAE in PyTorch using your implementation as a reference. For me the reconstruction loss looks fine, but loss2 and loss3 are very close to zero right from the initial iterations. That is similar to the graphs attached by @barby1138, but I notice that in your graphs loss2 starts at around 48. Is that fine, or do you think I am making mistakes with loss2 and loss3?

dhgrs commented 5 years ago

Hi @partha2409, thanks for your interest!

I don't think you have made a mistake. Loss2 and loss3 are regularisation terms, so their values are very unstable in the initial iterations.
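A minimal Chainer-style sketch of those two terms (illustrative, not the repo's exact code) also shows why the starting magnitude is implementation-dependent: both terms are the same squared distance between the encoder output and its matched codebook vectors, differing only in where the gradient is stopped, so their initial size just reflects how far the randomly initialised codebook sits from the encoder outputs.

```python
import chainer.functions as F

def vq_regularisers(z_e, e, beta=0.25):
    """loss2/loss3 of the VQ-VAE objective (sketch).

    z_e: encoder output (chainer.Variable)
    e:   matched codebook vectors, same shape as z_e
    """
    # loss2 (codebook loss): encoder side detached via .array,
    # so the gradient only updates the codebook.
    loss2 = F.mean_squared_error(z_e.array, e)
    # loss3 (commitment loss): codebook side detached,
    # so the gradient only updates the encoder.
    loss3 = beta * F.mean_squared_error(z_e, e.array)
    return loss2, loss3
```

So a near-zero start is not by itself a bug; it only means the codebook was initialised close to (or directly from) the encoder outputs.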