glistering96 closed this 1 year ago
[async_debug] Exp using t
[async_debug] t time: 0.9313950538635254
[async_debug] Exp using s
[async_debug] s time: 0.32047533988952637
[async_debug] Exp using SyncVectorEnv
[async_debug] SyncVectorEnv time: 0.34770822525024414
These are the experiment results for the various vector env methods.
't' is the case using the threading library, 's' is the case using a naive sequential for loop, and 'SyncVectorEnv' is the case using Gymnasium's built-in SyncVectorEnv.
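A minimal sketch of how the 't' and 's' cases could be timed against each other. `ToyEnv`, `step_sequential`, `step_threaded`, and `benchmark` are hypothetical names made up for illustration, not the code that produced the numbers above. It assumes a pure-Python, CPU-bound `step`, which on standard CPython builds is the kind of workload where threads tend not to help because of the GIL.

```python
import time
from concurrent.futures import ThreadPoolExecutor


class ToyEnv:
    """Hypothetical stand-in for the real environment."""

    def __init__(self):
        self.t = 0

    def step(self, action):
        # Pure-Python, CPU-bound work; on standard CPython builds the GIL
        # keeps threads from running this part in parallel.
        self.t += 1
        return sum(i * action for i in range(1000)), self.t


def step_sequential(envs, actions):
    # The 's' case: a plain for loop over the environments.
    return [env.step(a) for env, a in zip(envs, actions)]


def step_threaded(envs, actions, pool):
    # The 't' case: one task per environment on a thread pool.
    return list(pool.map(lambda pair: pair[0].step(pair[1]), zip(envs, actions)))


def benchmark(n_envs=8, n_steps=50):
    actions = list(range(n_envs))
    timings = {}

    envs = [ToyEnv() for _ in range(n_envs)]
    start = time.perf_counter()
    for _ in range(n_steps):
        seq_out = step_sequential(envs, actions)
    timings["s"] = time.perf_counter() - start

    envs = [ToyEnv() for _ in range(n_envs)]
    with ThreadPoolExecutor(max_workers=n_envs) as pool:
        start = time.perf_counter()
        for _ in range(n_steps):
            thr_out = step_threaded(envs, actions, pool)
        timings["t"] = time.perf_counter() - start

    return timings, seq_out, thr_out
```

Calling `benchmark()` returns the two timings plus the last outputs of each method, so the outputs can be checked for equality before the timings are compared.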
Managing a pool from the process-based library modules is awkward, because creating the process context takes a lot of resources. There also seems to be a problem where the results computed in the worker processes are not applied to the environment objects. When a step method is called through multiprocessing, it returns the proper outputs, but the field variables of each environment are not updated, presumably because each worker operates on its own copy of the environment.
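The missing state update can be reproduced without spawning any processes. The sketch below is an illustration under assumptions: `CounterEnv`, `get_state`, `set_state`, and `step_in_worker` are hypothetical, and `copy.deepcopy` stands in for the copy a worker process would receive. One possible workaround is to have the worker return the environment state along with the step output and write it back in the parent.

```python
import copy


class CounterEnv:
    """Hypothetical environment with one field that step() mutates."""

    def __init__(self):
        self.position = 0

    def step(self, action):
        self.position += action
        return self.position  # observation

    def get_state(self):
        return {"position": self.position}

    def set_state(self, state):
        self.position = state["position"]


def step_in_worker(env, action):
    # Stand-in for a worker process: it steps a copy of the environment,
    # so the caller's object is never touched.
    worker_env = copy.deepcopy(env)
    obs = worker_env.step(action)
    # Returning the state lets the parent apply it explicitly.
    return obs, worker_env.get_state()


parent_env = CounterEnv()
obs, worker_state = step_in_worker(parent_env, 3)

# The output is correct, but the parent's field has not changed yet.
stale_position = parent_env.position

# Writing the returned state back brings the parent in sync.
parent_env.set_state(worker_state)
synced_position = parent_env.position
```

Sending the full state back on every step adds serialization cost, so this only pays off when the step itself is expensive relative to the state size.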
Based on the results above, VecEnv seems to be a good fit for the environment.
[async_debug] Exp using t
[async_debug] t time: 3.6767070293426514
[async_debug] Exp using s
[async_debug] s time: 1.6217741966247559
[async_debug] Exp using SyncVectorEnv
[async_debug] SyncVectorEnv time: 1.5066163539886475
[async_debug] Exp using VecEnv
[async_debug] VecEnv time: 1.4272098541259766
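This is the result for the N=100 case. VecEnv, which is the previous implementation, appears to be the fastest (about 1.43 s, versus 1.51 s for SyncVectorEnv, 1.62 s for the sequential loop, and 3.68 s for threading).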
Another option might be to use a NumPy-based environment during training.
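A rough sketch of what a NumPy-based environment could look like: all sub-environments live in arrays, and a single `step` call advances every one of them without a Python loop. `BatchedPointEnv` and its dynamics (points walking toward a goal) are invented for illustration and are not the environment from this issue.

```python
import numpy as np


class BatchedPointEnv:
    """Hypothetical toy env: N points on a line, each walking toward a goal."""

    def __init__(self, num_envs, goal=5, max_steps=20):
        self.num_envs = num_envs
        self.goal = goal
        self.max_steps = max_steps
        self.pos = np.zeros(num_envs, dtype=np.int64)
        self.steps = np.zeros(num_envs, dtype=np.int64)

    def reset(self):
        self.pos[:] = 0
        self.steps[:] = 0
        return self.pos.copy()

    def step(self, actions):
        # actions: shape (num_envs,), values in {-1, 0, +1}
        actions = np.asarray(actions, dtype=np.int64)
        self.pos += actions
        self.steps += 1

        reached = self.pos >= self.goal
        timed_out = self.steps >= self.max_steps
        done = reached | timed_out
        reward = np.where(reached, 1.0, 0.0)
        obs = self.pos.copy()

        # Auto-reset only the sub-environments that just finished.
        self.pos[done] = 0
        self.steps[done] = 0
        return obs, reward, done
```

With this layout, `BatchedPointEnv(100).step(actions)` handles the N=100 case in a handful of array operations, which is where the speedup over per-environment stepping would be expected to come from.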
For faster environment transitions, an async version of VecEnv needs to be implemented.