Data mixed from different parallel environments

lcswillems / rl-starter-files

RL starter files in order to immediately train, visualize and evaluate an agent without writing any line of code

MIT License

649 stars, 183 forks

https://github.com/lcswillems/torch-rl/blob/c33bf422aad70be89498fc712a7bed56aa2512aa/torch_rl/torch_rl/algos/base.py#L126

I think the data from different environments is getting mixed here. "preprocessed_obs" seems to receive the observations from all the parallel environments, and it is then forwarded to the model. My understanding was that only the observations from a specific environment should reach the model, and that the action for that specific environment would then be selected based on the model's prediction. Instead, it seems that the observations from all the parallel environments are forwarded to the model together. Please correct me if I am wrong.
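To make the question concrete, here is a minimal sketch of what a batched forward pass over several environments looks like. It does not use the torch_rl code from the link above: the one-layer "policy", the `policy_logits` function, and the sizes are all hypothetical, and NumPy stands in for the real model. Under those assumptions, each row of the batch is one environment's observation, and the output row for an environment depends only on that environment's input row.

```python
import numpy as np

# Hypothetical stand-in for the policy network: a single linear layer.
# Integer-valued weights keep the arithmetic exact for the comparison below.
rng = np.random.default_rng(0)
NUM_ENVS, OBS_DIM, NUM_ACTIONS = 4, 3, 2
weights = rng.integers(-3, 4, size=(OBS_DIM, NUM_ACTIONS)).astype(np.float64)


def policy_logits(obs_batch):
    """Map a (batch, OBS_DIM) array to (batch, NUM_ACTIONS) action scores."""
    return obs_batch @ weights


# One observation per parallel environment, stacked along axis 0.
obs = rng.integers(-5, 6, size=(NUM_ENVS, OBS_DIM)).astype(np.float64)

# Batched call: all environments go through the model at once.
batched_logits = policy_logits(obs)
batched_actions = batched_logits.argmax(axis=1)  # one action per environment

# Same computation, one environment at a time.
single_logits = np.stack(
    [policy_logits(obs[i : i + 1])[0] for i in range(NUM_ENVS)]
)
single_actions = single_logits.argmax(axis=1)

# Changing environment 0's observation leaves the other rows untouched.
perturbed = obs.copy()
perturbed[0] += 1.0
perturbed_logits = policy_logits(perturbed)

print(np.array_equal(batched_logits, single_logits))
print(np.array_equal(perturbed_logits[1:], batched_logits[1:]))
```

In this sketch the batched and per-environment results match row for row, so stacking the observations does not by itself mix data between environments. That only holds for layers that act on each sample independently; something like batch normalization in training mode would couple the rows. Whether the linked code behaves this way is exactly what this issue is asking.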

Data mixed from different parallel environments #20