AI4Finance-Foundation / FinRL

FinRL: Financial Reinforcement Learning. 🔥
https://ai4finance.org
MIT License

I got an error with DRLEnsembleAgent.run_ensemble_strategy(). It misses timesteps_dict but it is in the code #1202

Open manu1dcb opened 7 months ago

manu1dcb commented 7 months ago

Describe the bug

I am working on this notebook and, when I run `df_summary = ensemble_agent.run_ensemble_strategy(A2C_model_kwargs, PPO_model_kwargs, DDPG_model_kwargs, TD3_model_kwargs, timesteps_dict)`, it gives me this error:

**TypeError:** DRLEnsembleAgent.run_ensemble_strategy() missing 1 required positional argument: 'timesteps_dict'

Does anybody know how I can fix it?

RGIIST commented 7 months ago

Replace the previous cell's content with:

A2C_model_kwargs = {
                    'n_steps': 5,
                    'ent_coef': 0.005,
                    'learning_rate': 0.0007
                    }

PPO_model_kwargs = {
                    "ent_coef":0.01,
                    "n_steps": 2048,
                    "learning_rate": 0.00025,
                    "batch_size": 128
                    }

DDPG_model_kwargs = {
                      #"action_noise":"ornstein_uhlenbeck",
                      "buffer_size": 10_000,
                      "learning_rate": 0.0005,
                      "batch_size": 64
                    }

SAC_model_kwargs = {
    "batch_size": 64,
    "buffer_size": 100000,
    "learning_rate": 0.0001,
    "learning_starts": 100,
    "ent_coef": "auto_0.1",
}

TD3_model_kwargs = {"batch_size": 100, "buffer_size": 1000000, "learning_rate": 0.0001}

timesteps_dict = {'a2c' : 10_000,
                 'ppo' : 10_000,
                 'ddpg' : 10_000,
                 'sac' : 10_000,
                 'td3' : 10_000
                 }

In other words, you need to add the SAC and TD3 kwargs as well, and then run:

df_summary = ensemble_agent.run_ensemble_strategy(A2C_model_kwargs,
                                                 PPO_model_kwargs,
                                                 DDPG_model_kwargs,
                                                 SAC_model_kwargs,
                                                 TD3_model_kwargs,
                                                 timesteps_dict)
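The first error can be reproduced without FinRL. The sketch below uses a hypothetical stand-in function that only mirrors the parameter list shown in the traceback later in this thread (five kwargs dicts followed by `timesteps_dict`); the helper `missing_parameters` is likewise an illustration, not part of the library. With only four kwargs dicts passed positionally, `timesteps_dict` lands in the `TD3_model_kwargs` slot and the last parameter stays unfilled.

```python
import inspect


def run_ensemble_strategy(
    self,
    A2C_model_kwargs,
    PPO_model_kwargs,
    DDPG_model_kwargs,
    SAC_model_kwargs,
    TD3_model_kwargs,
    timesteps_dict,
):
    """Hypothetical stand-in: same parameter names as in the traceback, no training."""


SIGNATURE = inspect.signature(run_ensemble_strategy)


def missing_parameters(*args):
    """Return the parameter names left unfilled by the given positional arguments."""
    bound = SIGNATURE.bind_partial(*args)
    return [name for name in SIGNATURE.parameters if name not in bound.arguments]


agent = object()  # placeholder for the `self` argument
a2c, ppo, ddpg, sac, td3, timesteps = ({} for _ in range(6))

# Original call from the issue: four kwargs dicts plus timesteps_dict.
print(missing_parameters(agent, a2c, ppo, ddpg, td3, timesteps))
# → ['timesteps_dict']

# Call with all five kwargs dicts plus timesteps_dict.
print(missing_parameters(agent, a2c, ppo, ddpg, sac, td3, timesteps))
# → []
```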

manu1dcb commented 7 months ago

Thanks Ravi! This gives me another issue:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[35], line 1
----> 1 df_summary = ensemble_agent.run_ensemble_strategy(A2C_model_kwargs,
      2                                                  TD3_model_kwargs,
      3                                                  PPO_model_kwargs,
      4                                                  DDPG_model_kwargs,
      5                                                  SAC_model_kwargs,
      6                                                  timesteps_dict)

File ~\anaconda3\envs\RLFinance2\lib\site-packages\finrl\agents\stablebaselines3\models.py:577, in DRLEnsembleAgent.run_ensemble_strategy(self, A2C_model_kwargs, PPO_model_kwargs, DDPG_model_kwargs, SAC_model_kwargs, TD3_model_kwargs, timesteps_dict)
    572 # print("training: ",len(data_split(df, start=20090000, end=test.datadate.unique()[i-rebalance_window]) ))
    573 # print("==============Model Training===========")
    574 # Train Each Model
    575 for model_name in MODELS.keys():
    576     # Train The Model
--> 577     model, sharpe_list, sharpe = self._train_window(
    578         model_name,
    579         kwargs[model_name],
    580         model_dct[model_name]["sharpe_list"],
    581         validation_start_date,
    582         validation_end_date,
    583         timesteps_dict,
    584         i,
    585         validation,
    586         turbulence_threshold,
    587     )
    588     # Save the model's sharpe ratios, and the model itself
    589     model_dct[model_name]["sharpe_list"] = sharpe_list

File ~\anaconda3\envs\RLFinance2\lib\site-packages\finrl\agents\stablebaselines3\models.py:371, in DRLEnsembleAgent._train_window(self, model_name, model_kwargs, sharpe_list, validation_start_date, validation_end_date, timesteps_dict, i, validation, turbulence_threshold)
    368     return None, sharpe_list, -1
    370 print(f"======{model_name} Training========")
--> 371 model = self.get_model(
    372     model_name, self.train_env, policy="MlpPolicy", model_kwargs=model_kwargs
    373 )
    374 model = self.train_model(
    375     model,
    376     model_name,
   (...)
    379     total_timesteps=timesteps_dict[model_name],
    380 )  # 100_000
    381 print(
    382     f"======{model_name} Validation from: ",
    383     validation_start_date,
    384     "to ",
    385     validation_end_date,
    386 )

File ~\anaconda3\envs\RLFinance2\lib\site-packages\finrl\agents\stablebaselines3\models.py:216, in DRLEnsembleAgent.get_model(model_name, env, policy, policy_kwargs, model_kwargs, seed, verbose)
    212     temp_model_kwargs["action_noise"] = NOISE[
    213         temp_model_kwargs["action_noise"]
    214     ](mean=np.zeros(n_actions), sigma=0.1 * np.ones(n_actions))
    215 print(temp_model_kwargs)
--> 216 return MODELS[model_name](
    217     policy=policy,
    218     env=env,
    219     tensorboard_log=f"{config.TENSORBOARD_LOG_DIR}/{model_name}",
    220     verbose=verbose,
    221     policy_kwargs=policy_kwargs,
    222     seed=seed,
    223     **temp_model_kwargs,
    224 )

TypeError: DDPG.__init__() got an unexpected keyword argument 'ent_coef'

What's the problem? Thanks!

RGIIST commented 7 months ago

https://colab.research.google.com/drive/1h6ZDbUi5PgYHSGKJXg7djhKUMafFvHMG#scrollTo=tiFA3pRWPQMO&uniqifier=1

This is the link to my copy of the same Jupyter file. It is running smoothly and training is going on. Have a look at the code, this might help you.
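One possible reading of the second traceback, offered as a guess rather than a confirmed diagnosis: the call in `Cell In[35]` passes the dicts in the order A2C, TD3, PPO, DDPG, SAC, while the signature printed in the same traceback expects A2C, PPO, DDPG, SAC, TD3. Positional binding would then hand `PPO_model_kwargs` (which contains `ent_coef`) to the `DDPG_model_kwargs` parameter, which matches the reported `TypeError`. The sketch below uses a hypothetical stand-in function with the same parameter names to show the effect, and how passing by keyword sidesteps it.

```python
def run_ensemble_strategy(
    self,
    A2C_model_kwargs,
    PPO_model_kwargs,
    DDPG_model_kwargs,
    SAC_model_kwargs,
    TD3_model_kwargs,
    timesteps_dict,
):
    """Hypothetical stand-in: reports which dict each algorithm slot received."""
    return {
        "a2c": A2C_model_kwargs,
        "ppo": PPO_model_kwargs,
        "ddpg": DDPG_model_kwargs,
        "sac": SAC_model_kwargs,
        "td3": TD3_model_kwargs,
    }


# Kwargs dicts as posted earlier in the thread.
a2c = {"n_steps": 5, "ent_coef": 0.005, "learning_rate": 0.0007}
ppo = {"ent_coef": 0.01, "n_steps": 2048, "learning_rate": 0.00025, "batch_size": 128}
ddpg = {"buffer_size": 10_000, "learning_rate": 0.0005, "batch_size": 64}
sac = {"batch_size": 64, "buffer_size": 100000, "learning_rate": 0.0001,
       "learning_starts": 100, "ent_coef": "auto_0.1"}
td3 = {"batch_size": 100, "buffer_size": 1000000, "learning_rate": 0.0001}
timesteps = {"a2c": 10_000, "ppo": 10_000, "ddpg": 10_000, "sac": 10_000, "td3": 10_000}

# Positional call in the order shown in Cell In[35]: the DDPG slot receives the PPO dict.
misordered = run_ensemble_strategy(None, a2c, td3, ppo, ddpg, sac, timesteps)
print("ent_coef" in misordered["ddpg"])  # → True

# Keyword call: the order no longer matters, so each slot gets its own dict.
by_keyword = run_ensemble_strategy(
    None,
    A2C_model_kwargs=a2c,
    TD3_model_kwargs=td3,
    PPO_model_kwargs=ppo,
    DDPG_model_kwargs=ddpg,
    SAC_model_kwargs=sac,
    timesteps_dict=timesteps,
)
print("ent_coef" in by_keyword["ddpg"])  # → False
```

If this reading is right, calling `ensemble_agent.run_ensemble_strategy(...)` with the same keyword names, or with the dicts in the order given by the signature, should keep `ent_coef` out of the DDPG constructor.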