Closed: Uptomylimit closed this issue 1 year ago.
You can refer to this config for a runnable MBPO example. If you have other questions, feel free to continue asking in this issue.
I am using the config "dizoo.classic_control.pendulum.config.mbrl.pendulum_sac_mbpo_config". Is this config a runnable example? I am confused about the structure of using model-based RL. I see that you have many examples of model-free RL in "./ding/example", like SAC, PPO, DQN, etc., but there is no example for MBRL. Is there an example of a model-based method?
I use the MBSACPolicy class in "./ding/policy/mbpolicy", but I get an error like this:
'EasyDict' object has no attribute 'lambda_'
This means the config is not complete. The config you point to also doesn't include the "grad_clip" setting, so I think it may be incomplete, and I added these settings myself. But I am still confused about the structure of using model-based RL: how do I use the world model, and how do I use the function "task.use(StepCollector)"?
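For reference, a minimal sketch of such config additions, assuming EasyDict-style attribute access. The field names ("lambda_", "grad_clip") come from the error above; the nesting and the values are my assumptions and should be checked against MBSACPolicy's default config:

```python
# Hypothetical config additions. The nesting under policy.learn and the
# concrete values are assumptions, not DI-engine defaults; verify them
# against MBSACPolicy's default_config in ./ding/policy/mbpolicy.
from easydict import EasyDict

cfg = EasyDict(dict(policy=dict(learn=dict())))
cfg.policy.learn.lambda_ = 0.8      # weight used by the model-based value target (assumed)
cfg.policy.learn.grad_clip = 100.0  # gradient clipping bound (assumed)
```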
You can directly execute "python3 pendulum_sac_mbpo_config.py" to run this config file. This file will call the "serial_pipeline_dyna" entry to launch the training process.
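Conceptually, a dyna-style pipeline like this alternates collecting real environment data, fitting a world model on it, generating imagined rollouts from the model, and training the policy on the mix. Here is a library-free toy of that loop for intuition only; all names and the loop structure are illustrative, not DI-engine's actual API:

```python
def dyna_style_training(env_step, model_fit, model_rollout, policy_update,
                        n_iters=3, real_steps=10, imagined_per_iter=20):
    """Toy dyna loop: collect real transitions, fit a world model,
    generate imagined transitions, and train the policy on the mix.
    Returns the sizes of the real and mixed buffers."""
    real_buffer, mixed_buffer = [], []
    for _ in range(n_iters):
        # 1. Collect real transitions with the current policy.
        real = [env_step() for _ in range(real_steps)]
        real_buffer.extend(real)
        # 2. Fit the world model on all real data gathered so far.
        model = model_fit(real_buffer)
        # 3. Roll the model out to generate imagined transitions.
        imagined = [model_rollout(model) for _ in range(imagined_per_iter)]
        # 4. Train the policy on real + imagined data.
        mixed_buffer.extend(real + imagined)
        policy_update(mixed_buffer)
    return len(real_buffer), len(mixed_buffer)
```

With stub callbacks and the defaults above, each iteration adds 10 real and 20 imagined transitions, so the call returns (30, 90):

```python
real, mixed = dyna_style_training(
    env_step=lambda: ("s", "a", 0.0, "s2"),
    model_fit=lambda buf: "model",
    model_rollout=lambda model: ("s", "a", 0.0, "s2"),
    policy_update=lambda buf: None,
)
# real == 30, mixed == 90
```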
Thanks for your answer! The code works now!
I am trying to use DI-engine to implement MBPO, but I get a bug.
The error is as follows:
Could you tell me how to use your model-based reinforcement learning methods, like MBPO? Is there an example of a model-based method?