deshpandenu / Time-Series-Forecasting-of-Amazon-Stock-Prices-using-Neural-Networks-LSTM-and-GAN-

This project analyzes Amazon stock data using Python. Feature extraction is performed, and ARIMA and Fourier series models are built. An LSTM with multiple features is used to predict stock prices, and sentiment analysis is then performed on news and Reddit sentiment. GANs are also used to predict stock data, with Amazon data taken from an API as the generator and CNNs used as the discriminator.
399 stars 121 forks

data leakage #2

Open PietroAmin opened 4 years ago

PietroAmin commented 4 years ago

Please note that there is a possibility of data leakage. The way the data are standardized is very dangerous because it shifts future information back in time. Just try deleting (.shift(-num_historical_days)) from your scaling method and you will see how much worse the results get.

PietroAmin commented 4 years ago

The same problem appears in numerous GitHub projects that try to forecast future stock prices with GANs.

nupurdeshpande11 commented 4 years ago

The same problem appears in numerous GitHub projects that try to forecast future stock prices with GANs.

Cool... I'll check out what you are talking about. What kind of data leakage exactly? Plus, the GANs are experimental, since they haven't been used extensively for time series.

PietroAmin commented 4 years ago

Just try to visualize the data. When you calculate the moving average, min and max at time (t) and then move that information back "num_historical_days" steps, you are anticipating that information. Indeed, if you visualize it you will see the moving average always predicting the path of the real time series.
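The effect described here can be checked with a small dependency-free sketch. It assumes the scaling statistics are built the way the issue describes, as a trailing rolling statistic followed by a shift of -num_historical_days; the helper names `rolling_mean` and `shift` are hypothetical stand-ins that imitate pandas' `rolling(...).mean()` and `shift(...)`, not the repository's actual code. Changing only a future price leaves the trailing mean at time t untouched but changes the shifted one, which is the leakage.

```python
def rolling_mean(values, window):
    """Trailing mean of the last `window` values; None until the window is full."""
    out = []
    for t in range(len(values)):
        if t + 1 < window:
            out.append(None)
        else:
            out.append(sum(values[t - window + 1 : t + 1]) / window)
    return out


def shift(values, periods):
    """Imitates a pandas-style shift: row t receives the value from row t - periods."""
    n = len(values)
    out = [None] * n
    for t in range(n):
        src = t - periods
        if 0 <= src < n:
            out[t] = values[src]
    return out


num_historical_days = 3
prices = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0]

# Same series, except one price that lies in the future relative to t = 4.
bumped = prices[:]
bumped[6] = 100.0

t = 4
# Trailing statistic: uses prices[2..4] only, so the future change is invisible.
causal_before = rolling_mean(prices, num_historical_days)[t]
causal_after = rolling_mean(bumped, num_historical_days)[t]

# Shifted statistic: row t now holds the mean of prices[5..7], i.e. the future.
leaky_before = shift(rolling_mean(prices, num_historical_days), -num_historical_days)[t]
leaky_after = shift(rolling_mean(bumped, num_historical_days), -num_historical_days)[t]

print("trailing mean at t:", causal_before, causal_after)
print("shifted mean at t: ", leaky_before, leaky_after)
```

If the two shifted values differ while the two trailing values agree, the feature at time t depends on prices that were not yet known at time t.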

PietroAmin commented 4 years ago

I'm trying to construct the GAN with an LSTM as generator and a CNN as discriminator :)

nupurdeshpande11 commented 4 years ago

The MA has a weight added to it for this reason.

On Thu, Mar 26, 2020, 3:51 PM PietroAmin notifications@github.com wrote:

Just try to visualize the data. When you calculate the moving average, min and max at time (t) and then move that information back "num_historical_days" steps, you are anticipating that information. Indeed, if you visualize it you will see the moving average always predicting the path of the real time series.

PietroAmin commented 4 years ago

Linear scaling: $x' = \dfrac{x - x_{\text{mean}}}{x_{\max} - x_{\min}}$. Just give me some link where they explain why the shift is needed.
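For comparison, the linear scaling above can be computed with no shift at all by taking the mean, max and min from the trailing window only. This is a minimal sketch under stated assumptions, not the repository's method: the function name `scale_trailing`, the window including the current value, and the choice to return 0.0 when the window is flat (max equals min) are all illustrative choices.

```python
def scale_trailing(values, window):
    """Scale each value as (x - mean) / (max - min) over its trailing window.

    Only values at or before time t are used, so nothing from the future
    enters the scaled feature. Rows without a full window are None.
    """
    scaled = []
    for t in range(len(values)):
        if t + 1 < window:
            scaled.append(None)
            continue
        past = values[t - window + 1 : t + 1]
        spread = max(past) - min(past)
        mean = sum(past) / window
        # Assumption: a flat window carries no scale information, so emit 0.0
        # instead of dividing by zero.
        scaled.append(0.0 if spread == 0 else (values[t] - mean) / spread)
    return scaled


prices = [1.0, 2.0, 3.0, 4.0, 5.0]
scaled = scale_trailing(prices, window=3)
print(scaled)
```

With this form, replacing the last price by any other number leaves every earlier scaled value unchanged, which is a quick way to confirm that a scaling step is causal.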

nupurdeshpande11 commented 4 years ago

Sure... I'll have to look at my refs and let you know.

On Thu, Mar 26, 2020, 7:40 PM PietroAmin notifications@github.com wrote:

Linear scaling: $x' = \dfrac{x - x_{\text{mean}}}{x_{\max} - x_{\min}}$. Just give me some link where they explain why the shift is needed.

yanbigong2 commented 4 years ago

I'm trying to construct the GAN with an LSTM as generator and a CNN as discriminator :)

Have you finished this code? Will you open-source it?

deshpandenu commented 3 years ago

Please note that there is a possibility of data leakage. The way the data are standardized is very dangerous because it shifts future information back in time. Just try deleting (.shift(-num_historical_days)) from your scaling method and you will see how much worse the results get.

Can you please explain this problem by typing out the equation and the code you are referring to? Thank you.