notadamking / RLTrader

A cryptocurrency trading environment using deep reinforcement learning and OpenAI's gym
https://discord.gg/ZZ7BGWh
GNU General Public License v3.0

MlpLstmPolicy not used in the last release #93

Closed · q-55555 closed this issue 5 years ago

q-55555 commented 5 years ago

The default policy used in the last release is MlpPolicy, whereas in the great tutorial https://towardsdatascience.com/using-reinforcement-learning-to-trade-bitcoin-for-massive-profit-b69d0e8f583b, MlpLstmPolicy appears to be an important factor in the results.
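For reference, a minimal sketch of how the two policies would be selected with the stable-baselines 2.x API (make_trading_env is a placeholder for however the environment is constructed):

```python
from stable_baselines import PPO2
from stable_baselines.common.policies import MlpPolicy, MlpLstmPolicy
from stable_baselines.common.vec_env import DummyVecEnv

# make_trading_env is a hypothetical factory returning a gym.Env
env = DummyVecEnv([lambda: make_trading_env()])

# Feed-forward policy (the default in the last release)
mlp_model = PPO2(MlpPolicy, env)

# Recurrent policy from the tutorial; recurrent policies in
# stable-baselines require n_envs to be a multiple of nminibatches
lstm_model = PPO2(MlpLstmPolicy, env, nminibatches=1)
```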

raybotha commented 5 years ago

After further research, the results were due to a bug causing look-ahead bias. Check out the Discord for more information. Someone got better results with the MlpPolicy, but it's worth comparing both again.
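The thread doesn't spell out the exact bug, but a common way look-ahead bias creeps into trading environments is normalizing with statistics computed over the full dataset. A sketch of the leak and one bias-free alternative (column name and file path are illustrative assumptions):

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

df = pd.read_csv('data/coinbase_hourly.csv')  # hypothetical data file

# Look-ahead bias: fitting the scaler on the entire series leaks
# future price extremes into every earlier observation.
scaler = MinMaxScaler()
df['scaled_close_leaky'] = scaler.fit_transform(df[['Close']])

# Bias-free alternative: at step t, scale using only data seen so far.
def scale_observation(df, t, window=50):
    past = df['Close'].iloc[max(0, t - window):t + 1]
    return (past - past.min()) / (past.max() - past.min() + 1e-8)
```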

maxmatical commented 5 years ago

It looks like the environment is still only using the last value in the dataframe, though. Is this intended, or do you plan to go back to using lookback_window_size? I created an environment based on previous versions that supports both the LSTM and MLP policies, where the MLP policy takes the previous lookback_window_size observations into account.
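A minimal sketch of that idea as an observation method (attribute names like self.df, self.current_step, and self.lookback_window_size are assumptions based on earlier versions of the environment):

```python
import numpy as np

def _next_observation(self):
    # Stack the previous lookback_window_size rows so an MLP policy
    # sees recent history instead of only the latest value.
    start = max(0, self.current_step - self.lookback_window_size + 1)
    window = self.df.iloc[start:self.current_step + 1].values
    # Left-pad with zeros until lookback_window_size rows are available
    if len(window) < self.lookback_window_size:
        pad = np.zeros((self.lookback_window_size - len(window), window.shape[1]))
        window = np.vstack([pad, window])
    return window.flatten().astype(np.float32)
```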


notadamking commented 5 years ago

The lookback_window will likely not be coming back, as it can be replaced by stable-baselines' VecFrameStack, and it is unnecessary for an LSTM model.
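For anyone who wants the same effect, a sketch of frame stacking with stable-baselines (make_trading_env is a placeholder for the environment factory):

```python
from stable_baselines.common.vec_env import DummyVecEnv, VecFrameStack

# make_trading_env is a hypothetical factory returning a gym.Env
env = DummyVecEnv([lambda: make_trading_env()])

# Stack the last 10 observations into each step's observation,
# standing in for the removed lookback_window
env = VecFrameStack(env, n_stack=10)
```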

As for the LSTM policy, you are welcome to pass policy=MlpLstmPolicy to RLTrader() to experiment with it today. Otherwise, we are currently working on improving the success of the model, so the defaults of this library are subject to change when a better default is found.
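A sketch of that usage, assuming the RLTrader constructor accepts a policy keyword as described above (the import path and train() call are assumptions based on the project layout):

```python
from stable_baselines.common.policies import MlpLstmPolicy
from lib.RLTrader import RLTrader  # import path is an assumption

# Swap in the recurrent policy instead of the default MlpPolicy
trader = RLTrader(policy=MlpLstmPolicy)
trader.train()  # training entry point assumed from the project's README
```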