Closed amihos closed 2 years ago
Can you try turning down the "learning_rate" from 0.00003 to 0.00001 and running again? I noticed that this happens with single-stock trading, mostly with DDPG, TD3, and SAC, and never with PPO or A2C.
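For anyone following along, the suggested change is just the one key in the notebook's SAC hyperparameter dict (a sketch; names follow the tutorial):

```python
# Suggested tweak: lower the SAC learning rate from 3e-5 to 1e-5.
SAC_PARAMS = {
    "batch_size": 128,
    "buffer_size": 100000,
    "learning_rate": 0.00001,  # was 0.00003
    "learning_starts": 100,
    "ent_coef": "auto_0.1",
}

print(SAC_PARAMS["learning_rate"])  # 1e-05
```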
Nothing changed, it looks like:
Now that you mention it, I checked the results from the other tests again, and I can confirm that the same thing happened for the DDPG test but not for TD3!
Hi, I have the same problem as well. Is there any way to solve this?
We have updated a lot of the code. This issue no longer exists.
I'm still getting the same issue, where total_trades, assets, and reward remain the same.
In FinRL_single_stock_trading.ipynb, after finishing training Model 4 (SAC), I am getting this:
Model 4: SAC

```python
agent = DRLAgent(env=env_train)
SAC_PARAMS = {
    "batch_size": 128,
    "buffer_size": 100000,
    "learning_rate": 0.00003,
    "learning_starts": 100,
    "ent_coef": "auto_0.1",
}
model_sac = agent.get_model("sac", model_kwargs=SAC_PARAMS)
```

```
{'batch_size': 128, 'buffer_size': 100000, 'learning_rate': 3e-05, 'learning_starts': 100, 'ent_coef': 'auto_0.1'}
Using cpu device
```

```python
trained_sac = agent.train_model(model=model_sac,
                                tb_log_name='sac',
                                total_timesteps=30000)
```

```
Logging to tensorboard_log/sac/sac_2
```
```
---------------------------------
| time/              |          |
|    episodes        | 4        |
|    fps             | 81       |
|    time_elapsed    | 123      |
|    total timesteps | 10064    |
| train/             |          |
|    actor_loss      | -940     |
|    critic_loss     | 1.6      |
|    ent_coef        | 0.135    |
|    ent_coef_loss   | 19       |
|    learning_rate   | 3e-05    |
|    n_updates       | 9963     |
---------------------------------
```
```
day: 2515, episode: 100
begin_total_asset: 100000.00
end_total_asset: 100000.00
total_reward: 0.00
total_cost: 0.00
total_trades: 0
```
```
---------------------------------
| time/              |          |
|    episodes        | 8        |
|    fps             | 80       |
|    time_elapsed    | 250      |
|    total timesteps | 20128    |
| train/             |          |
|    actor_loss      | -510     |
|    critic_loss     | 147      |
|    ent_coef        | 0.182    |
|    ent_coef_loss   | 15.8     |
|    learning_rate   | 3e-05    |
|    n_updates       | 20027    |
---------------------------------
```
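One way `total_trades` can sit at 0 even while training looks healthy: if I read FinRL's `StockTradingEnv` correctly, it scales the agent's continuous action in [-1, 1] by `hmax` and truncates to an integer share count, so a policy whose outputs stay below 1/hmax in magnitude never books a single share. A minimal sketch of that arithmetic, assuming the tutorial's default `hmax = 100` (both the default and the truncation behavior are assumptions worth verifying against your FinRL version):

```python
# Hypothetical reproduction of FinRL's action -> share-count conversion:
# the env computes roughly int(action * hmax), truncating toward zero.
HMAX = 100  # assumed default max shares per trade in the tutorial

def shares_from_action(action, hmax=HMAX):
    """Convert a continuous action in [-1, 1] to a (truncated) share count."""
    return int(action * hmax)

print(shares_from_action(0.004))  # 0 -> trade rounds away to nothing
print(shares_from_action(0.03))   # 3 shares
```

If the trained policy's actions are all tiny, every step rounds to zero shares, which would match the unchanged assets and reward in the logs.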