yajiemiao / pdnn

PDNN: A Python Toolkit for Deep Learning. http://www.cs.cmu.edu/~ymiao/pdnntk.html
Apache License 2.0

Accuracy problem #15

hemmingstein closed this issue 9 years ago

hemmingstein commented 9 years ago

Hello, I'm trying to use pdnn for CNNs (a standard call, exactly as described in the documentation). The program gets as far as initialising the model and the finetuning functions, but then I get the following error:

Traceback (most recent call last):
  File ".pdnn/cmds/run_CNN.py", line 88, in <module>
    batch_size=cfg.batch_size)
  File "pdnn/models/cnn.py", line 140, in build_finetune_functions
    (index + 1) * batch_size]})
  File "/usr/local/lib/python2.7/dist-packages/Theano-0.7.0-py2.7.egg/theano/compile/function.py", line 266, in function
    profile=profile)
  File "/usr/local/lib/python2.7/dist-packages/Theano-0.7.0-py2.7.egg/theano/compile/pfunc.py", line 511, in pfunc
    on_unused_input=on_unused_input)
  File "/usr/local/lib/python2.7/dist-packages/Theano-0.7.0-py2.7.egg/theano/compile/function_module.py", line 1466, in orig_function
    defaults)
  File "/usr/local/lib/python2.7/dist-packages/Theano-0.7.0-py2.7.egg/theano/compile/function_module.py", line 1339, in create
    defaults, self.unpack_single, self.return_none, self)
  File "/usr/local/lib/python2.7/dist-packages/Theano-0.7.0-py2.7.egg/theano/compile/function_module.py", line 338, in __init__
    c.value = value
  File "/usr/local/lib/python2.7/dist-packages/Theano-0.7.0-py2.7.egg/theano/gof/link.py", line 345, in __set__
    self.storage[0] = self.type.filter(value, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/Theano-0.7.0-py2.7.egg/theano/tensor/type.py", line 164, in filter
    raise TypeError(err_msg, data)
TypeError: ('TensorType(float32, scalar) cannot store accurately value 0.0001, it would be represented as 9.99999974738e-05. If you do not mind this precision loss, you can: 1) explicitly convert your data to a numpy array of dtype float32, or 2) set "allow_input_downcast=True" when calling "function".', 0.0001, 'Container name "learning_rate"')

It's a bit cryptic to me, so I have no idea what to do (although I understand that the problem is that values cannot be stored the right way). Can you help me please?

MaigoAkisame commented 9 years ago

It seems that you're not storing the value 0.0001 in a numpy array of dtype float32.

You may need to check your input data files (or the code that generates them) to find where this 0.0001 is.
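To see why Theano complains, it can help to round-trip the value through float32 by hand. The snippet below is a minimal sketch using only the standard library; `to_float32` is a hypothetical helper written for this illustration, not part of pdnn or Theano.

```python
import struct

def to_float32(x):
    """Round a Python float (float64) to the nearest float32 and back."""
    return struct.unpack("f", struct.pack("f", x))[0]

lrate = 0.0001
stored = to_float32(lrate)

print(repr(stored))     # the float32 neighbour of 0.0001
print(stored == lrate)  # False: the round trip changes the value
```

Because the round trip does not return the original number, a strict float32 container refuses to accept the float64 input silently.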

hemmingstein commented 9 years ago

Thanks for your response! The funny thing is that there is no 0.0001 anywhere in my input data, at least before I use the pfile script to convert the data to pfile. But pfile shouldn't change the values, so that's somehow strange.

MaigoAkisame commented 9 years ago

Oh I see! In the CNN code the default learning rate is set to 0.0001, and that is what's causing the error.

My experience is that a learning rate of 0.0001 may be too small; so I suggest that you set it to a larger value (say 0.1) instead of using the default value.

ghost commented 9 years ago

In models/cnn.py, changing "theano.Param(learning_rate, default = 0.0001)" to "theano.Param(learning_rate, default = 0.1)" should solve this problem. This value doesn't matter anyway, since it will be overwritten by the learning rate specified by the user or by the default config.

hemmingstein commented 9 years ago

Hello again, I tried both ways, now the problem is that 0.1 also cannot be stored accurately:

"TypeError: ('TensorType(float32, scalar) cannot store accurately value 0.1, it would be represented as 0.10000000149. If you do not mind this precision loss, you can: 1) explicitly convert your data to a numpy array of dtype float32, or 2) set "allow_input_downcast=True" when calling "function".', 0.1, 'Container name "learning_rate"')"

Maybe it's a compiler problem or something. Since the difference between 0.1 and 0.10000000149 does not really matter, I would follow the suggestion in the error message and set "allow_input_downcast=True", but I'm not sure where this should go. Do you have an idea?
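Both options in the error message can be located roughly as follows. Option 2, `allow_input_downcast=True`, is a keyword argument of `theano.function`, so it would presumably go into the `theano.function(...)` call that the first traceback points at (`pdnn/models/cnn.py`, line 140, in `build_finetune_functions`). Option 1 needs no change to that call: the value is converted to float32 before it is handed over. The sketch below shows only option 1 with numpy; the variable names are hypothetical and where exactly pdnn passes the value is not shown in this thread.

```python
import numpy as np

lrate = 0.1                                     # Python float, i.e. float64
lrate32 = np.asarray(lrate, dtype=np.float32)   # explicit, intentional downcast

print(lrate32.dtype)    # float32
print(float(lrate32))   # the value that will actually be stored
```

The conversion loses the same precision either way; the difference is only that the downcast is now explicit rather than implicit.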

MaigoAkisame commented 9 years ago

I suggest that you always specify the learning rate on the command line, like --lrate "C:0.1:5" (use the constant learning rate of 0.1 for 5 epochs), instead of using the default value.

If you really want to use the default value, you can change it to something like 0.125, which has the form 1/2^n and can therefore be stored exactly in float32.
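A quick way to check which candidate values survive the float32 conversion unchanged is sketched below, using only the standard library. `fits_float32` is a hypothetical helper for this illustration.

```python
import struct

def fits_float32(x):
    """Return True if x is unchanged by a float64 -> float32 -> float64 round trip."""
    return struct.unpack("f", struct.pack("f", x))[0] == x

# Decimal fractions such as 0.1 have no finite binary expansion,
# while powers of two such as 0.125 = 2**-3 do.
for value in (0.1, 0.0001, 0.125, 0.5, 2.0 ** -10):
    print(value, fits_float32(value))
```

Values for which this returns True should pass Theano's strict float32 check without any downcast.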

hemmingstein commented 9 years ago

I changed the default value to 0.125 and --lrate "C:0.1:5" to --lrate "C:0.125:5". Getting the finetuning functions seems to work now.

But now I get another error in the step "finetuning the model". I'm sorry to bother you with this, but I would really like to try out CNNs, and your tool obviously works just fine for other people. So here's the error:

"ValueError: total size of new array must be unchanged
Apply node that caused the error: Reshape{4}(Subtensor{int64:int64:}.0, TensorConstant{[256 1 28 28]})
Inputs types: [TensorType(float64, matrix), TensorType(int64, vector)]
Inputs shapes: [(256, 40), (4,)]
Inputs strides: [(320, 8), (8,)]
Inputs values: ['not shown', array([256, 1, 28, 28])]"

Or shall I make a new issue for that?
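The numbers in this second error can be checked by hand: a reshape only works when the element counts match. The sketch below uses the two shapes quoted in the message; `num_elements` is a hypothetical helper, and the reading that the network expects 1x28x28 inputs while the data has 40 features per example is an inference from the message, not something confirmed in this thread.

```python
from functools import reduce
from operator import mul

def num_elements(shape):
    """Total number of entries in an array of the given shape."""
    return reduce(mul, shape, 1)

batch_shape = (256, 40)           # "Inputs shapes" from the error message
target_shape = (256, 1, 28, 28)   # TensorConstant passed to Reshape{4}

print(num_elements(batch_shape))       # 10240
print(num_elements(target_shape))      # 200704
print(num_elements(target_shape[1:]))  # 784 values expected per example
```

Since each example carries 40 values but the reshape target needs 784, the input dimensions given to the CNN and the feature dimension of the data appear not to agree.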

ghost commented 9 years ago

Yes, I think submitting a new issue would be better. We will follow up on that thread.