lcswillems / rl-starter-files

RL starter files in order to immediately train, visualize and evaluate an agent without writing any line of code
MIT License

Data mixed from different parallel environments #20

Closed asiffiiqbal closed 5 years ago

asiffiiqbal commented 5 years ago

https://github.com/lcswillems/torch-rl/blob/c33bf422aad70be89498fc712a7bed56aa2512aa/torch_rl/torch_rl/algos/base.py#L126

I think the data from different environments is getting mixed here. "preprocessed_obs" seems to receive the observations from all the parallel environments, and it is then forwarded to the model. My understanding was that only the observations from a specific environment should go to the model, and that the action for that specific environment would then be selected based on the model's prediction. But it seems that all the observations from the parallel environments are forwarded to the model. Please correct me if I am wrong.

lcswillems commented 5 years ago

Yes, all the observations from the parallel environments are forwarded to the model. What is the problem?

The model chooses an action from an observation and its previous state. So I can feed the model all the observations of the parallel envs at once, along with the corresponding previous states.
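To illustrate the point, here is a minimal sketch in plain Python. It is not the torch-rl code: the weights, the `forward` function and the toy observations are all made up for illustration. It assumes a model whose layers treat each batch row independently, which is the usual case for linear and recurrent layers (layers such as batch normalization in training mode would be an exception). Under that assumption, the output for environment i depends only on the observation and previous state of environment i, so batching does not mix data across environments.

```python
import math

# Hypothetical toy weights: 2 observation features + 1 recurrent state -> 2 action scores.
W_OBS = [[0.5, -0.3], [0.1, 0.8]]
W_STATE = [0.9, -0.2]


def forward(obs_batch, state_batch):
    """Batched forward pass: row i of the output uses only row i of the inputs."""
    scores = [
        [
            math.tanh(
                sum(obs[k] * W_OBS[k][j] for k in range(len(obs)))
                + state * W_STATE[j]
            )
            for j in range(len(W_STATE))
        ]
        for obs, state in zip(obs_batch, state_batch)
    ]
    # Greedy action per environment (index of the highest score in that row).
    actions = [row.index(max(row)) for row in scores]
    return actions, scores


# One observation and one previous state per parallel environment.
obs_batch = [[0.3, -1.2], [2.0, 0.4], [-0.7, 0.9]]
state_batch = [0.0, 0.5, -0.4]

# All environments in a single call.
batched_actions, batched_scores = forward(obs_batch, state_batch)

# Each environment in its own call, for comparison.
single_scores = [forward([o], [s])[1][0] for o, s in zip(obs_batch, state_batch)]

# Change only environment 1's observation and run the batch again.
perturbed_batch = [obs_batch[0], [9.0, -9.0], obs_batch[2]]
perturbed_actions, perturbed_scores = forward(perturbed_batch, state_batch)

print(batched_scores == single_scores)
print(perturbed_scores[0] == batched_scores[0], perturbed_scores[2] == batched_scores[2])
```

The batched call and the one-environment-at-a-time calls give the same scores, and changing the observation of one environment leaves the outputs of the others untouched. The batch dimension is only a way to evaluate several independent inputs in one call.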

I am closing this issue because I don't understand the problem and don't have enough information to identify it. However, we can continue the discussion, and I may reopen the issue.