Closed DH-O closed 1 year ago
Hi @DH-O,
`num_rollouts` parallel environments with the same policy network to collect data faster. I hope this answers your questions :)
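To illustrate the idea, here is a minimal sketch of several environments sharing one policy, where each environment is reset automatically as soon as its episode ends. This is a common pattern in vectorized-environment wrappers, not code taken from this repository; `ToyEnv`, `AutoResetVecEnv`, `shared_policy`, and `collect` are hypothetical names, and the environments are stepped in a plain loop here rather than in real subprocesses. Whether the repository's own wrapper behaves this way would need to be checked in its environment wrapper code.

```python
import random


class ToyEnv:
    """Hypothetical episodic environment: an episode ends after `episode_length` steps."""

    def __init__(self, episode_length=5):
        self.episode_length = episode_length
        self.t = 0
        self.num_resets = 0

    def reset(self):
        self.t = 0
        self.num_resets += 1
        return 0.0  # dummy initial observation

    def step(self, action):
        self.t += 1
        done = self.t >= self.episode_length
        return float(self.t), 1.0, done  # (observation, reward, done)


class AutoResetVecEnv:
    """Hypothetical wrapper: steps N environments together and resets
    each one inside `step` as soon as its own episode is done."""

    def __init__(self, envs):
        self.envs = envs

    def reset(self):
        return [env.reset() for env in self.envs]

    def step(self, actions):
        obs, rewards, dones = [], [], []
        for env, action in zip(self.envs, actions):
            o, r, d = env.step(action)
            if d:
                # Auto-reset: the next observation starts a fresh episode.
                o = env.reset()
            obs.append(o)
            rewards.append(r)
            dones.append(d)
        return obs, rewards, dones


def shared_policy(obs_batch, rng):
    """Stand-in for a single policy network shared by all environments."""
    return [rng.choice([0, 1]) for _ in obs_batch]


def collect(num_rollouts=4, num_steps=12, seed=0):
    rng = random.Random(seed)
    vec_env = AutoResetVecEnv([ToyEnv() for _ in range(num_rollouts)])
    obs = vec_env.reset()  # the training loop calls reset only once
    episodes_finished = 0
    for _ in range(num_steps):
        actions = shared_policy(obs, rng)
        obs, rewards, dones = vec_env.step(actions)
        episodes_finished += sum(dones)
    return episodes_finished, [env.num_resets for env in vec_env.envs]
```

In this sketch the outer loop calls `reset` a single time, yet every environment still starts a new episode whenever its previous one finishes, because the reset happens inside `step`.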
@DH-O Can I close this comment now?
@nsidn98 Oops, I forgot to close it. Thank you for all your help :)
Hi @nsidn98, thank you for all your kind responses. They have helped me a lot.
I successfully got training results, but I wonder whether the environment resets after the end of each episode. In my opinion, it should be reset once an episode ends.
According to the code below, which is from `graph_mpe_runner.py`, `env.reset` is called only once, when `run` is called. I wonder whether it is okay for `env.reset` to be called once for the whole training, even though there are 128 (= `--num_rollout_threads`) parallel processes running. If so, then only 128 rollouts are done without an environment reset. The answers that I want to hear are as follows:
Thank you!