carrilloric / CS230

Generative Adversarial Network for Stock Market Price Prediction

Data #1

Open personal-coding opened 4 years ago

personal-coding commented 4 years ago

Are you reversing the order of the data? Quandl returns data newest first. From what I can tell, you're not reversing it so that the newest data ends up at the last index of the dataframe. Because of that, it appears you're training on newer data and predicting older data.
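A minimal sketch of the ordering check described above, assuming a date-indexed dataframe that arrives newest first. The toy frame and the `close` column name are hypothetical and are not taken from the repository's notebooks.

```python
import pandas as pd

# Hypothetical newest-first frame, mimicking the ordering described above.
df = pd.DataFrame(
    {"close": [103.0, 102.0, 101.0, 100.0]},
    index=pd.to_datetime(
        ["2020-01-04", "2020-01-03", "2020-01-02", "2020-01-01"]
    ),
)

# Sort ascending by date so the newest observation becomes the last row.
df_sorted = df.sort_index()

# Sanity check before splitting into train/test by position.
is_chronological = df_sorted.index.is_monotonic_increasing
print(df_sorted)
print(is_chronological)
```

With the data in ascending order, a positional split (first rows for training, last rows for testing) trains on the past and predicts the future rather than the reverse.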

carrilloric commented 4 years ago

Hi Yuri, please check the end of Quandl_TimeSeries_Pre_data.ipynb, where we generate the file symbol_premium.

personal-coding commented 4 years ago

The Quandl_TimeSeries_Pre_data.ipynb looks fine, but I don't see that you're using that data in GAN_stock_m_prediction.ipynb.

I see now that you're sorting the data by timestamp in GAN_stock_m_prediction.ipynb via df = df.sort_values(by='timestamp'). However, you're using a negative shift with the rolling windows:

df = ((df - df.rolling(num_historical_days).mean().shift(-num_historical_days))
      / (df.rolling(num_historical_days).max().shift(-num_historical_days)
         - df.rolling(num_historical_days).min().shift(-num_historical_days)))

Doesn't this leak look-ahead data into the rolling window?
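To make the question concrete, here is a small sketch on toy data. In pandas, `rolling(n)` at row t covers rows t-n+1 through t, and `shift(-n)` then moves each result n rows earlier, so row t ends up with statistics computed from rows t+1 through t+n. The window size of 3, the `close` column, and the trailing normalization at the end are illustrative assumptions, not the repository's code or a confirmed fix.

```python
import numpy as np
import pandas as pd

num_historical_days = 3  # small window for illustration (assumed value)
df = pd.DataFrame({"close": np.arange(1.0, 11.0)})  # toy prices 1..10

# Negative shift: row t receives the rolling mean of rows t+1..t+n.
future_mean = df.rolling(num_historical_days).mean().shift(-num_historical_days)

# Plain rolling: row t uses rows t-n+1..t (current and past values only).
trailing_mean = df.rolling(num_historical_days).mean()

# Row 0 holds price 1.0, but its shifted mean is (2 + 3 + 4) / 3,
# which is built entirely from later rows.
print(future_mean["close"].iloc[0])

# Row 2 under the plain rolling mean is (1 + 2 + 3) / 3.
print(trailing_mean["close"].iloc[2])

# One possible trailing normalization (an assumption, not the author's fix):
# drop the negative shift so each row only sees its own window.
roll = df.rolling(num_historical_days)
normalized = (df - roll.mean()) / (roll.max() - roll.min())
print(normalized.head())
```

The negative shift also leaves the last `num_historical_days` rows as NaN, since no future window exists for them, whereas the trailing version leaves the first `num_historical_days - 1` rows as NaN.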