notadamking / RLTrader

A cryptocurrency trading environment using deep reinforcement learning and OpenAI's gym
https://discord.gg/ZZ7BGWh
GNU General Public License v3.0

Results vary wildly between hourly and daily datasets #58

Closed: mprestonise closed this issue 5 years ago

mprestonise commented 5 years ago

Is this happening for anyone else?

The hourly dataset returns a ~1000% profit, but when I run the trained agent against the daily dataset I see a return of -15% to -20%.

It doesn't appear to be a code issue, as everything runs correctly, but I'm curious whether anyone else is seeing this variance in performance.

TheExGenesis commented 5 years ago

Check issue #28 to see if you get the same results after fixing the order of the dates

mprestonise commented 5 years ago

@TheExGenesis Regarding "check issue #28": isn't that solved by fixing line 21 of the BitcoinTradingGraph? e.g. lambda x: datetime.strptime(x, '%Y-%m-%d')
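For context on why a single '%Y-%m-%d' format may not cover both datasets, here is a minimal sketch. It assumes daily rows look like '2019-05-01' and hourly rows look like '2019-05-01 01-PM' (the hourly layout is inferred from the format string quoted later in this thread). The parse_date helper is hypothetical and not part of RLTrader.

```python
from datetime import datetime

# Layouts assumed for illustration: daily rows like '2019-05-01',
# hourly rows like '2019-05-01 01-PM'.
DAILY_FORMAT = '%Y-%m-%d'
HOURLY_FORMAT = '%Y-%m-%d %I-%p'


def parse_date(text):
    """Hypothetical helper: try the daily layout first, then the hourly one."""
    for fmt in (DAILY_FORMAT, HOURLY_FORMAT):
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            # strptime raises ValueError when the string does not match fmt.
            continue
    raise ValueError(f'unrecognised date string: {text!r}')


daily = parse_date('2019-05-01')
hourly = parse_date('2019-05-01 01-PM')
print(daily, hourly)
```

With only the daily format, the hourly string would raise a ValueError, so a visualizer that handles both datasets needs either two formats or a format chosen per dataset.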

TheExGenesis commented 5 years ago

Use this line before the date sort in train.py and test.py: df['Date'] = pd.to_datetime(df['Date'], format='%Y-%m-%d %I-%p')
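A minimal sketch of why the parse has to happen before the sort, assuming hourly timestamps shaped like '2019-05-01 01-PM' (which is what the format string above implies). The sample rows and Close values are made up for illustration and are not taken from the project's data.

```python
import pandas as pd

# Made-up hourly rows, deliberately out of order. The timestamp layout is an
# assumption inferred from the format string '%Y-%m-%d %I-%p'.
df = pd.DataFrame({
    'Date': ['2019-05-01 11-AM', '2019-05-01 01-PM', '2019-05-01 12-AM'],
    'Close': [5300.0, 5350.0, 5280.0],
})

# Sorting the raw strings is lexicographic, so '01-PM' lands before '11-AM'.
string_order = df.sort_values('Date')['Date'].tolist()

# Parsing first yields real timestamps, so the sort becomes chronological.
df['Date'] = pd.to_datetime(df['Date'], format='%Y-%m-%d %I-%p')
df = df.sort_values('Date').reset_index(drop=True)

print(string_order)
print(df)
```

If the dates are sorted as strings, the agent sees the hours of each day shuffled, which could plausibly change training and test results.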

TheExGenesis commented 5 years ago

Also, can I ask how many sessions you ran and which commit you took the code from? I'm not able to replicate the results even with the old data.

mprestonise commented 5 years ago

@TheExGenesis I cloned the repo and then made my own changes, but they didn't affect that part of the code. I only modified the graph visualizer to handle the date/time string as a day rather than the default hourly format.

I get all the results predicted by the article, but the daily dataset produces a negative return (a loss).

I don't have the hardware to run the intensive optimize.py script, so I used the pre-trained models from one of the other issues. Right now I'm working on moving the agent from being trained on a .csv file to running against a live data stream from the CryptoCompare API, as a way to "paper trade" and benchmark the actual performance of several models over the same time series (I'm currently pulling data every 30 seconds and saving it to .csv files). I'm still new to Python, though, so I haven't yet learned how to get a real-time stream into a data frame and pass it into the BitcoinTradingEnv.
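One possible shape for the polling step, as a rough sketch only. The collect_ticks function and its fetch_price argument are hypothetical names; fetch_price stands in for whatever API request is already being made, and no real CryptoCompare call is shown here. The Date and Close column names are also assumptions, since the thread does not say which columns BitcoinTradingEnv expects.

```python
import time
from collections import deque
from datetime import datetime, timezone

import pandas as pd


def collect_ticks(fetch_price, n_ticks, interval_seconds=30.0, max_rows=1000):
    """Poll fetch_price() n_ticks times and return the ticks as a DataFrame.

    fetch_price is a hypothetical callable that returns the latest price as a
    number; a real version would wrap the existing API request.
    """
    buffer = deque(maxlen=max_rows)  # keeps only the most recent max_rows rows
    for i in range(n_ticks):
        buffer.append({
            'Date': datetime.now(timezone.utc),
            'Close': float(fetch_price()),
        })
        if i < n_ticks - 1:
            time.sleep(interval_seconds)  # wait before the next poll
    return pd.DataFrame(list(buffer), columns=['Date', 'Close'])


# Stand-in price source so the sketch runs without network access.
fake_prices = iter([5300.0, 5310.5, 5295.25])
ticks = collect_ticks(lambda: next(fake_prices), n_ticks=3, interval_seconds=0)
print(ticks)
```

The resulting frame would still need the same columns and preprocessing as the training .csv before it could be handed to the environment.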

Sorry for the ramble.