ShangtongZhang / DeepRL

Modularized Implementation of Deep RL Algorithms in PyTorch
MIT License

[Question] VecEnv implementation #80

Closed: bycn closed this issue 4 years ago

bycn commented 4 years ago

Hi, it seems that for vectorized environments, this library (and others) stores samples as follows: if n is the number of environments, one sample is stored in the replay buffer as a single (n x obs_size) tuple, and the model consumes the whole n-tuple. Why is it done this way, rather than storing the n-tuple as n separate tuples and having the model consume one observation at a time? Thanks!
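To make the two layouts in the question concrete, here is a minimal pure-Python sketch. It is not DeepRL's actual replay buffer: `make_step`, `store_batched`, and `store_flat` are hypothetical names, and the observations are fake placeholder values.

```python
def make_step(n_envs, obs_size, t):
    """Fake one vectorized step: n_envs observations of obs_size floats each.

    Hypothetical stand-in for whatever a real vectorized env would return.
    """
    return [[float(t * n_envs + i)] * obs_size for i in range(n_envs)]


def store_batched(buffer, step_obs):
    # Layout A: one buffer entry per time step, holding all n observations
    # together as a single (n x obs_size) item.
    buffer.append(step_obs)


def store_flat(buffer, step_obs):
    # Layout B: one buffer entry per environment per time step,
    # each entry being a single observation of length obs_size.
    buffer.extend(step_obs)


n_envs, obs_size, n_steps = 4, 3, 5
batched, flat = [], []
for t in range(n_steps):
    obs = make_step(n_envs, obs_size, t)
    store_batched(batched, obs)
    store_flat(flat, obs)

# Same data, different granularity: n_steps entries vs. n_steps * n_envs entries.
print(len(batched), len(flat))
```

Both buffers hold the same observations; the layouts differ only in what counts as one entry, and therefore in what a single random draw from the buffer returns.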

Bryan

ShangtongZhang commented 4 years ago

I don't think there is any particular reason for that implementation. Either way, we sample a minibatch from the buffer.
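As a rough illustration of that point, the sketch below samples a minibatch from each layout and shows that the model ends up with the same kind of input either way. It assumes a simple list-backed buffer and uniform sampling; `sample_from_batched` and `sample_from_flat` are hypothetical helpers, not functions from this library.

```python
import random


def sample_from_batched(buffer, k, rng):
    """Draw k time steps, then merge the env axis into the batch axis."""
    steps = rng.sample(buffer, k)
    return [obs for step in steps for obs in step]


def sample_from_flat(buffer, k, rng):
    """Draw k individual observations."""
    return rng.sample(buffer, k)


n_envs, obs_size, n_steps = 4, 3, 10
rng = random.Random(0)

# Build both layouts from the same fake data.
batched_buffer = [
    [[float(t)] * obs_size for _ in range(n_envs)] for t in range(n_steps)
]
flat_buffer = [obs for step in batched_buffer for obs in step]

# 2 time steps x 4 envs gives 8 rows; so does drawing 8 single observations.
mb_batched = sample_from_batched(batched_buffer, 2, rng)
mb_flat = sample_from_flat(flat_buffer, 8, rng)

print(len(mb_batched), len(mb_flat))
```

In both cases the minibatch is a list of rows of length `obs_size`. One difference in this sketch is that rows drawn from the batched layout come in groups of `n_envs` that share a time step, while rows drawn from the flat layout are chosen independently.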