fc524079318 opened 2 years ago
The ExaCH env hasn't been maintained in a while, but is this specific to ExaCH or a general problem? If it can be fixed by using `>=` instead of `==`, do you mind issuing a PR?
@fc524079318 can you add the configuration files you ran and the command line. Thanks!
@Jodasue I use a learner_cfg.json like:

```json
{
  "agent": "DQN-v0",
  "env": "ExaCH-v0",
  "workflow": "sync",
  "n_episodes": 10,
  "n_steps": 10,
  "model": "MLP",
  "output_dir": "./results_dir/",
  "process_per_env": 1,
  "log_level": [3, 3],
  "log_frequency": 1,
  "profile": "None"
}
```
and the command line is:

```shell
mpiexec -n 4 python start.py --workflow async
```

The start.py is like EXARL/exarl/driver/main.py.
I think it may be a general problem, since the error is in sync_learner.py. If I set n_steps to 10, I expect it to run 10 steps per episode, but it runs 11 steps before done. The done check happens before self.steps is updated. I also tried running ExaCartPoleStatic with the async workflow, and I saw self.steps increase to 11 before the episode ended.
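The behavior described above can be reproduced with a minimal standalone sketch (a hypothetical simplification for illustration, not the actual sync_learner code): because the done check reads self.steps *before* it is incremented, the loop takes one extra step.

```python
# Hypothetical simplification of the learner step loop (not the real sync_learner code).
nsteps = 10    # corresponds to "n_steps" in learner_cfg.json
steps = 0      # 0-based step counter, like self.steps
done = False
executed = 0   # how many environment steps actually ran

while not done:
    executed += 1          # take one environment step
    if steps == nsteps:    # done check sees the pre-increment value
        done = True
    steps += 1             # increment happens after the check

print(executed)  # prints 11: one extra step beyond nsteps
```

The check only fires once `steps` has already been incremented past `nsteps - 1`, so the loop body runs `nsteps + 1` times.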
I changed `==` to `>=`, but it doesn't work. Maybe I should change `self.steps == exalearner.nsteps:` to `self.steps == exalearner.nsteps - 1:`?
Yes, the issue is basically C-style, 0-based counting, and your fix logic is correct. My only concern is that this is probably present in multiple learners, and may have been partially remedied in different ways in different places. I would say you can implement your fix and test locally, and we will try to get a consistent remedy into the code soon.
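For reference, the 0-based fix can be sketched in the same simplified loop form (again a hypothetical illustration, not the actual sync_learner code): comparing the pre-increment counter against `nsteps - 1` makes the episode end after exactly `nsteps` steps.

```python
# Hypothetical simplification showing the proposed fix (not the real sync_learner code).
nsteps = 10
steps = 0
done = False
executed = 0

while not done:
    executed += 1
    if steps >= nsteps - 1:   # 0-based: step indices 0..nsteps-1 cover nsteps steps
        done = True
    steps += 1

print(executed)  # prints 10: exactly nsteps steps
```

Using `>=` rather than `==` also makes the check robust if the counter ever skips past the boundary.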
I tried to run ExaCH with the async workflow and the parameter "n_steps" set to 10, and I hit an error.
I found that the sync_learner didn't end correctly; in sync_learner.py line 568:

`if self.steps == exalearner.nsteps:`

self.steps starts at 0, and when it reaches 9 the next step will be the 11th, but nsteps is 10, so self.done will still be false.