pplonski / keras2cpp

This is a bunch of code to port Keras neural network model into pure C++.
MIT License

Skip dropout rate error #49

Closed. jvwilliams23 closed this issue 1 year ago

jvwilliams23 commented 2 years ago

Hi,

I have noticed a potential issue in the following code: https://github.com/pplonski/keras2cpp/blob/ce407cc06ca9886c330c1bf0e152058befcb60bb/keras_model.cc#L431-L433

Are you sure that the dropout layer can be skipped in prediction mode? According to Figure 2 of Srivastava et al. (2014), during training the weights are randomly set to 0 with probability equal to the dropout rate. In prediction mode, dropout is still applied, but as a constant factor multiplied into all weights in the layer, which disagrees with the code.

Additionally, I have noticed major differences between my Python Keras models and the keras2cpp models with dropout when using the default keras_model.cc. When the weights are multiplied by the dropout rate, the error goes away.

Reference: Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I. and Salakhutdinov, R., 2014. Dropout: A simple way to prevent neural networks from overfitting. Journal of Machine Learning Research, 15(1), pp. 1929-1958.

pplonski commented 2 years ago

@jvwilliams23 When I was testing the code, the predictions were exactly the same as in Keras. I don't think that dropout is used in prediction.

jvwilliams23 commented 2 years ago

@pplonski Interesting, I will look more into my code. Were you testing using the MNIST example?

pplonski commented 2 years ago

Yes, with MNIST data.

jvwilliams23 commented 1 year ago

Hi @pplonski, I just got around to looking into this further. It seems Keras does not use dropout in prediction (https://github.com/keras-team/keras/blob/dc95ceca57cbfada596a10a72f0cb30e1f2ed53b/keras/layers/core.py#L116). I guess this is consistent then, but it is strange that it goes against the original paper. I will close this issue.
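
For readers who land on this thread, here is a minimal C++ sketch of the two conventions discussed above. It is not code from keras2cpp or Keras. It assumes Keras uses "inverted" dropout, where the activations kept during training are scaled by 1 / (1 - rate) so that prediction can pass the input through unchanged, whereas the scheme in Srivastava et al. (2014) leaves training activations unscaled and multiplies by the retention probability (1 - rate) at test time. The function names `dropout_train`, `dropout_predict`, and `paper_predict` are hypothetical and chosen for illustration only.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Inverted dropout, training mode (illustrative, hypothetical helper):
// each activation is zeroed with probability `rate`, and the survivors are
// scaled by 1 / (1 - rate) so the expected value of each output equals its input.
std::vector<double> dropout_train(const std::vector<double>& x, double rate,
                                  std::mt19937& rng) {
    assert(rate >= 0.0 && rate < 1.0);
    std::bernoulli_distribution keep(1.0 - rate);  // true means "keep this unit"
    const double scale = 1.0 / (1.0 - rate);
    std::vector<double> y(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        y[i] = keep(rng) ? x[i] * scale : 0.0;
    }
    return y;
}

// Inverted dropout, prediction mode: the layer is the identity, which is why
// an inference-only port can skip it entirely.
std::vector<double> dropout_predict(const std::vector<double>& x) {
    return x;
}

// Test-time rule of the original paper's formulation: no scaling during
// training, so outputs are multiplied by the retention probability (1 - rate).
std::vector<double> paper_predict(const std::vector<double>& x, double rate) {
    assert(rate >= 0.0 && rate < 1.0);
    std::vector<double> y(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        y[i] = x[i] * (1.0 - rate);
    }
    return y;
}
```

Under these assumptions the two schemes agree in expectation; they differ only in whether the rescaling is applied at training time or at prediction time. Applying `paper_predict` to a model trained with inverted dropout would scale the activations a second time, so a mismatch between a Keras model and its C++ port more likely comes from somewhere other than the skipped dropout layer.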