notadamking / RLTrader

A cryptocurrency trading environment using deep reinforcement learning and OpenAI's gym
https://discord.gg/ZZ7BGWh
GNU General Public License v3.0

Obtaining baseline results? #80

Closed TheDoctorAI closed 5 years ago

TheDoctorAI commented 5 years ago

I usually try to obtain a baseline of what's described in the paper before moving on to the code. I went with the defaults and ended up with this:

--------------------------------------
| approxkl           | 0.0           |
| clipfrac           | 0.0           |
| explained_variance | nan           |
| fps                | 24            |
| n_updates          | 61            |
| policy_entropy     | 2.4535899e-05 |
| policy_loss        | 0.0           |
| serial_timesteps   | 976           |
| time_elapsed       | 38.1          |
| total_timesteps    | 976           |
| value_loss         | 1.258156e-09  |
--------------------------------------
--------------------------------------
| approxkl           | 5.075286e-15  |
| clipfrac           | 0.0           |
| explained_variance | nan           |
| fps                | 25            |
| n_updates          | 62            |
| policy_entropy     | 2.4869516e-05 |
| policy_loss        | 0.0           |
| serial_timesteps   | 992           |
| time_elapsed       | 38.8          |
| total_timesteps    | 992           |
| value_loss         | 8.715636e-07  |
--------------------------------------
2019-07-01 05:28:51,694 - lib.RLTrader - test - INFO - Testing model (PPO2__MlpPolicy__9)
Attribute Qt::AA_EnableHighDpiScaling must be set before QCoreApplication is created.
2019-07-01 05:28:52,066 - lib.RLTrader - test - INFO - Finished testing model (PPO2__MlpPolicy__9): $-3.69
2019-07-01 05:28:52,069 - lib.RLTrader - train - INFO - Trained 10 models

Is this normal? I see n_trials and n_epochs are both set to 1. Should these be increased, and/or should anything else be changed? My Titan is helping me get through the process faster, so I have some room to experiment. I wasn't monitoring it the entire time, but when I was, I didn't see very much CPU/GPU usage. That's a separate issue, though.
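For anyone experimenting with longer runs, here is a minimal sketch of what bumping those values could look like. The class path, method names, and keyword arguments are assumptions inferred from the logger name in the output above (lib.RLTrader - train/test), not a verified API, so adjust them to whatever the current code actually exposes:

```python
# Hypothetical sketch: RLTrader class, optimize/train/test methods, and the
# n_trials / n_epochs keyword arguments are assumptions based on the log
# output, not a confirmed interface.
from lib.RLTrader import RLTrader

trader = RLTrader()

# Search more hyperparameter configurations than the default single trial.
trader.optimize(n_trials=20)

# Train for more epochs so each PPO2 model sees more than ~1k timesteps.
trader.train(n_epochs=10)

# Evaluate the most recently trained checkpoint.
trader.test()
```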

notadamking commented 5 years ago

The article has since been updated. The profits described were the result of a sorting bug that caused look-ahead bias. This project is currently undergoing a massive overhaul to improve the agents' success and ease of use.
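For context on what a sorting bug causing look-ahead bias means in practice: if the price history is not sorted oldest-to-newest before a chronological train/test split, the agent can effectively train on data from the future of its test period, which inflates apparent profits. A minimal illustrative sketch of the safeguard, where the CSV path and the Timestamp column name are placeholders rather than the project's actual schema:

```python
import pandas as pd

# Placeholder path and column name; the real dataset schema may differ.
df = pd.read_csv('data/coinbase_hourly.csv')

# Sort oldest-to-newest so earlier rows are strictly earlier in time.
# Without this, a chronological train/test split can leak future prices
# into the training set (look-ahead bias) and inflate reported profits.
df = df.sort_values('Timestamp', ascending=True).reset_index(drop=True)

# Chronological split: train only on the past, test only on the future.
split = int(len(df) * 0.8)
train_df, test_df = df.iloc[:split], df.iloc[split:]
```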