Wrong RNN Implementation

XMaster96 commented 6 years ago

He, I really enjoy, your Videos, tutorials and Humor. But you made a Error, in one of your Tensorflow tutorials (23Time-Series-Prediction). If you want to keep the Hidden state, you have to set it to stateful to True. If it is False, the Hidden state will by reset to zero every time, you have a output. It is not willy mentioned anywhere, and it is becurs the RNN implamentasion from cuDNN and Tensorflow is based on cuDNN.

Hvass-Labs commented 6 years ago

Thanks for the compliment.

It was over 4 months ago that I did that tutorial and I can't remember the details. But as I recall, you need to reset the internal state for each batch during training, because they start at different positions in time, so you cannot keep the internal state between them as that would distort the predictions. If you instruct Keras / TensorFlow to not reset the state automatically, then you need to do it yourself before each batch.

During inference / prediction it is sometimes useful to keep the internal state without resetting it, e.g. if you want to feed a new data-point every hour or so, without having to feed many leading data-points again to "warm up" the recurrent units. But you would have to modify the tutorial code to do that. It is not really a bug in the tutorial, it is just meant to be used with a warm-up period.

XMaster96 commented 6 years ago

The Problem is that Keras views each, hour of your water data as a new batch, and resets the Hidden state. That means that your model has no real time sensitive prediction. It is in nearly any tutorial, you can find on the Internet wrong. The only one that does it right (I have found so far) is This: https://machinelearningmastery.com/understanding-stateful-lstm-recurrent-neural-networks-python-keras/

And the worst is that is nowhere mentioned in the official Documentation.

Hvass-Labs commented 6 years ago

I am not convinced that there is a problem here as I explained above.

I get these comments somewhat regularly from anonymous internet users and often they are very poorly explained and in a broken English, that something is wrong in my code. Then people eventually realize that they were actually wrong themselves and I've wasted a ton of time investigating the "alleged problem".

From Keras' manual https://keras.io/layers/recurrent/#gru

stateful: Boolean (default False). If True, the last state for each sample at index i in a batch will be used as initial state for the sample of index i in the following batch.

I agree that this is poorly explained. But as I understand, stateful==True means that the state is preserved between batches of data and must be manually reset to zero before feeding a new batch during training. As I explained above, it would be useful to have stateful==True during inference if you want to incrementally process a single data-point without having to "warm up" the recurrent unit by feeding a large batch of preceding values.

If you still believe I've made a mistake then please give some solid evidence. For example, you can modify the Notebook with how you think it should be done instead and show me that the predicted results are better that way. Then I would investigate deeper whether I really made a mistake.

Hvass-Labs / TensorFlow-Tutorials

Wrong RNN Implementation #79