jaungiers / LSTM-Neural-Network-for-Time-Series-Prediction

LSTM built using Keras Python package to predict time series steps and sequences. Includes sin wave and stock market data
GNU Affero General Public License v3.0
4.85k stars 1.96k forks

normalization issue #66

Open kerryDong11 opened 5 years ago

kerryDong11 commented 5 years ago

I have many zeros in my training data, and the function normalise_windows fails with "float division by zero". The error happens at: 'normalised_col = [((float(p) / float(window[0, col_i])) - 1) for p in window[:, col_i]]'. Since my training data contains many zeros, how can I normalise it?
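For context, the line quoted above divides every value in a window by the window's first value, so any zero in the first row raises the error. A minimal reproduction (the toy window values are illustrative, not from the reporter's data):

```python
import numpy as np

# Toy price window whose first row contains a zero, mimicking the
# normalise_windows logic from the repo: p / window[0, col_i] - 1.
window = np.array([[0.0, 10.0],
                   [1.0, 11.0],
                   [2.0, 12.0]])

def normalise_window(window):
    cols = []
    for col_i in range(window.shape[1]):
        # Raises ZeroDivisionError when window[0, col_i] == 0
        cols.append([(float(p) / float(window[0, col_i])) - 1
                     for p in window[:, col_i]])
    return np.array(cols).T

try:
    normalise_window(window)
except ZeroDivisionError as e:
    print("normalisation failed:", e)  # float division by zero
```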

guillaume-chevalier commented 5 years ago

The trick is to add epsilon to the STD as done here: https://github.com/guillaume-chevalier/seq2seq-signal-prediction/blob/25721c0cd8e1ff1d9310f95ccebde56d5b0c26a1/datasets.py#L135

Or use an if such that if the STD is of zero then it isn't added.
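Both suggestions can be sketched like this for a mean/std normalisation; the epsilon value here is an arbitrary choice, not taken from the linked file:

```python
import numpy as np

EPSILON = 1e-8  # small constant; the exact value is an assumption

def standardise(series):
    """Zero-safe standardisation: (x - mean) / (std + epsilon)."""
    mean = np.mean(series)
    std = np.std(series)
    return (series - mean) / (std + EPSILON)

def standardise_with_if(series):
    """Alternative: only divide when std is non-zero."""
    mean = np.mean(series)
    std = np.std(series)
    if std == 0:
        return series - mean  # constant series: just centre it
    return (series - mean) / std

flat = np.array([5.0, 5.0, 5.0])  # constant series, std == 0
print(standardise(flat))          # approximately [0, 0, 0], no NaN/inf
print(standardise_with_if(flat))  # exactly [0, 0, 0]
```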

kerryDong11 commented 5 years ago

> The trick is to add epsilon to the STD as done here: https://github.com/guillaume-chevalier/seq2seq-signal-prediction/blob/25721c0cd8e1ff1d9310f95ccebde56d5b0c26a1/datasets.py#L135
>
> Or use an if such that if the STD is of zero then it isn't added.

But if I add epsilon to the STD, the loss becomes very large, like 144ms/step - loss: 1139874524.9530. Do you know what is going on with the MSE calculation? Thank you so much.

guillaume-chevalier commented 5 years ago

Perhaps your values aren't real zeros but near-zero values, and normalizing them is numerically unstable. You could clip small values to an exact zero if you really need to ignore them. Dividing a real zero by a non-zero denominator yields a real zero again.
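The clipping idea can be sketched as follows; the threshold value is illustrative and would need to be chosen for the actual data:

```python
import numpy as np

def clip_small_to_zero(x, threshold=1e-6):
    """Set values whose magnitude is below `threshold` to exactly 0.0,
    so that a zero-safe normalisation maps them back to zero instead of
    blowing up tiny near-zero values into huge normalised ones."""
    x = np.asarray(x, dtype=float).copy()
    x[np.abs(x) < threshold] = 0.0
    return x

data = np.array([1e-9, 0.5, -3e-8, 2.0])
print(clip_small_to_zero(data))  # -> [0.0, 0.5, 0.0, 2.0]
```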