Problems creating new datafiles #28

yajiemiao / pdnn

PDNN: A Python Toolkit for Deep Learning. http://www.cs.cmu.edu/~ymiao/pdnntk.html

Apache License 2.0


TRAIN DNN

python $pdnndir/cmds/run_DNN.py --train-data "filename.pkl.gz" --valid-data "filename.pkl.gz" --nnet-spec "4:5:3" --wdir ./ --param-output-file dnn.mdl --cfg-output-file dnn.cfg

I get the following output:

[2015-11-10 13:20:47.589817] > ... building the model
[2015-11-10 13:20:47.603441] > ... getting the finetuning functions
[2015-11-10 13:20:48.612798] > ... finetuning the model
/usr/lib/python2.7/dist-packages/numpy/core/_methods.py:55: RuntimeWarning: Mean of empty slice.
  warnings.warn("Mean of empty slice.", RuntimeWarning)
[2015-11-10 13:20:48.614276] > epoch 1, training error nan (%)
[2015-11-10 13:20:48.615054] > epoch 1, lrate 0.080000, validation error nan (%)
[2015-11-10 13:20:48.619409] > epoch 2, training error nan (%)
[2015-11-10 13:20:48.619491] > epoch 2, lrate 0.080000, validation error nan (%)
[2015-11-10 13:20:48.622980] > epoch 3, training error nan (%)
[2015-11-10 13:20:48.623059] > epoch 3, lrate 0.080000, validation error nan (%)
[2015-11-10 13:20:48.626443] > epoch 4, training error nan (%)

and nothing ever changes. I actually see this behavior with many different datasets, but I reproduced it here with this simple example for clarity. Any idea what the problem is? I get it on Mac OS X 10.10 with Python 2.7.10 and on Linux SMP Debian 3.16.7 with Python 2.7.9, so it should not depend on the local Python installation. Any help is more than welcome. Thanks! Fabio

This issue is related to this line of code: https://github.com/yajiemiao/pdnn/blob/master/learning/sgd.py#L71.

batch_size = 256 is much larger than the training set, which has only 3 examples. As a result, train_sets.cur_frame_num / batch_size = 0 (integer division under Python 2), so train_error = []. numpy.mean([]) then emits the warning you see and returns nan.
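The chain of events can be reproduced outside PDNN. The snippet below is only a minimal sketch: the variable names mirror the ones mentioned above, but the loop is a hypothetical stand-in for the real training loop in sgd.py, not a copy of it.

```python
import warnings

import numpy

batch_size = 256     # batch size reported above
cur_frame_num = 3    # size of the toy training set

# Python 2's `/` on two ints floors the result; `//` does the same
# in both Python 2 and Python 3, so it is used here for portability.
num_batches = cur_frame_num // batch_size

# Hypothetical stand-in for the per-batch loop: one error value per batch.
# With zero batches the list stays empty.
train_error = [0.5 for _ in range(num_batches)]

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    mean_error = numpy.mean(train_error)  # mean of an empty list is nan

print(num_batches, train_error, mean_error)
for w in caught:
    print(w.category.__name__, w.message)
```

The nan produced here is what shows up as "training error nan (%)" in the log.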

In one sentence: the boundary condition is not handled correctly.

I fixed this issue in my pull request, which changes only a few lines of code.
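The pull request itself is not reproduced here. As a rough illustration of one way such a boundary condition could be handled, the sketch below iterates over the data so that a final short batch is still processed. The function name iter_batches is hypothetical and is not part of PDNN.

```python
def iter_batches(num_frames, batch_size):
    """Yield (start, end) index pairs covering every frame.

    The last pair may span fewer than batch_size frames, so a data set
    smaller than one batch still yields exactly one (short) batch.
    Hypothetical helper for illustration only.
    """
    for start in range(0, num_frames, batch_size):
        yield start, min(start + batch_size, num_frames)


# A 3-frame training set with batch_size = 256 now gives one batch.
small = list(iter_batches(3, 256))

# A larger set gets its full batches plus a short final one.
large = list(iter_batches(600, 256))

print(small)
print(large)
```

With at least one batch per epoch, the list of per-batch errors is no longer empty and the mean is well defined.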

Below is the output of your script after the fix. I added one extra option, --lrate "C:0.1:10", to stop it from running indefinitely.

[2015-12-12 10:42:00.854358] > ... building the model
[2015-12-12 10:42:00.864003] > ... getting the finetuning functions
[2015-12-12 10:42:02.142837] > ... finetuning the model
[2015-12-12 10:42:02.145008] > epoch 1, training error 66.666667 (%)
[2015-12-12 10:42:02.146348] > epoch 1, lrate 0.100000, validation error 33.333333 (%)
[2015-12-12 10:42:02.148447] > epoch 2, training error 33.333333 (%)
[2015-12-12 10:42:02.148744] > epoch 2, lrate 0.100000, validation error 33.333333 (%)
[2015-12-12 10:42:02.149959] > epoch 3, training error 33.333333 (%)
[2015-12-12 10:42:02.150215] > epoch 3, lrate 0.100000, validation error 33.333333 (%)
[2015-12-12 10:42:02.151403] > epoch 4, training error 33.333333 (%)
[2015-12-12 10:42:02.151596] > epoch 4, lrate 0.100000, validation error 33.333333 (%)
[2015-12-12 10:42:02.152745] > epoch 5, training error 33.333333 (%)
[2015-12-12 10:42:02.152934] > epoch 5, lrate 0.100000, validation error 33.333333 (%)
[2015-12-12 10:42:02.154048] > epoch 6, training error 33.333333 (%)
[2015-12-12 10:42:02.154237] > epoch 6, lrate 0.100000, validation error 33.333333 (%)
[2015-12-12 10:42:02.155377] > epoch 7, training error 33.333333 (%)
[2015-12-12 10:42:02.155566] > epoch 7, lrate 0.100000, validation error 33.333333 (%)
[2015-12-12 10:42:02.156708] > epoch 8, training error 33.333333 (%)
[2015-12-12 10:42:02.156894] > epoch 8, lrate 0.100000, validation error 0.000000 (%)
[2015-12-12 10:42:02.158023] > epoch 9, training error 0.000000 (%)
[2015-12-12 10:42:02.158214] > epoch 9, lrate 0.100000, validation error 0.000000 (%)
[2015-12-12 10:42:02.159442] > epoch 10, training error 0.000000 (%)
[2015-12-12 10:42:02.159636] > epoch 10, lrate 0.100000, validation error 0.000000 (%)
[2015-12-12 10:42:02.161165] > ... the final PDNN model parameter is dnn.mdl
[2015-12-12 10:42:02.161569] > ... the final PDNN model config is dnn.cfg

Hope it helps.
