Closed matteobettini closed 1 year ago
For the last one (_reset should know the batch size) we could just pass an empty TensorDict instance. Wdyt?
There might be use cases where only some of the dimensions of the vector have to be reset. For example, the done flag can state that only some simulations in the vector have to be reset.
This is why methods such as reset_at() exist in rllib VectorEnv (https://github.com/ray-project/ray/blob/master/rllib/env/vector_env.py#L104).
We have that in ParallelEnv through a "reset_workers" key IIRC. We could make a reset_at helper that writes the Boolean mask in the tensordict.
Exactly, a key like that can be used in the reset() function of BaseEnv. Instead of being limited to the worker dimension, it would span all the batch dimensions of the env.
If this key is not present, the default could be to reset all dims.
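A rough sketch of the behaviour discussed above: a reset that takes an optional Boolean mask over all batch dimensions and falls back to resetting everything when no mask is given. The ToyBatchedEnv class, the reset_mask argument, and the dict-based mask format are all hypothetical and only illustrate the idea; this is not the torchrl BaseEnv or ParallelEnv API.

```python
from itertools import product


class ToyBatchedEnv:
    """Hypothetical toy environment, not the torchrl BaseEnv API."""

    def __init__(self, batch_size):
        self.batch_size = tuple(batch_size)
        # One step counter per index of the batch dimensions.
        self.steps = {idx: 0 for idx in self._indices()}

    def _indices(self):
        # Enumerate every index tuple over all batch dimensions.
        return product(*(range(n) for n in self.batch_size))

    def step(self):
        for idx in self.steps:
            self.steps[idx] += 1

    def reset(self, reset_mask=None):
        # reset_mask: dict mapping a batch index to a bool (assumed format).
        # If it is absent, every batch index is reset.
        for idx in self.steps:
            if reset_mask is None or reset_mask.get(idx, False):
                self.steps[idx] = 0
        return dict(self.steps)


env = ToyBatchedEnv(batch_size=(2, 3))
env.step()
env.step()
# Reset only the entries whose (hypothetical) done flag is set.
partial = env.reset({(0, 1): True, (1, 2): True})
# No mask given: all batch entries are reset.
full = env.reset()
```

With a tensor-based mask the same logic would presumably reduce to a single masked assignment; the dict is used here only to keep the sketch dependency-free.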
Motivation
Vectorized environments are environments that perform simulations using batches. This can be useful to benefit from parallel computation on GPUs. These environments have their own batch_sizes, which can be used for different reasons.
For example:
(n_vectorized_envs, obs_size)
(n_vectorized_envs, n_agents, obs_size)
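To make the two shapes above concrete, here is a small sketch of how such a shape could be split into batch dimensions and feature dimensions. The helper name split_shape and the convention that only the trailing dimension holds features are assumptions made for illustration, not torchrl API.

```python
def split_shape(shape, n_feature_dims=1):
    """Split a full tensor shape into (batch_size, feature_shape)."""
    shape = tuple(shape)
    if not 0 <= n_feature_dims <= len(shape):
        raise ValueError("n_feature_dims must be between 0 and len(shape)")
    cut = len(shape) - n_feature_dims
    return shape[:cut], shape[cut:]


# Illustrative sizes, chosen arbitrarily.
n_vectorized_envs, n_agents, obs_size = 32, 4, 10

# Single-agent vectorized env: one batch dimension.
single_agent = split_shape((n_vectorized_envs, obs_size))
# Multi-agent vectorized env: the agent dimension is also a batch dimension.
multi_agent = split_shape((n_vectorized_envs, n_agents, obs_size))
```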
Currently, the torchrl environment infrastructure has some issues with environments that have non-empty batch sizes or a batch dimension for agents.
Ideally, we would like to use vectorized environments freely in torchrl and leverage its features such as ParallelEnv and Collectors on top of such environments. This would create tensordicts with many dimensions in the batch_size, for example:
I created this issue to list and organize all the issues that need to be addressed in order to generalize to BaseEnvs with general batch sizes in torchrl:
Issues
Stacking tensordicts of heterogeneous shapes and NestedTensor compatibility (#766)(PR)
When some of the dimensions of the vectorized environment are heterogeneous (agents with different observation and action spaces that still share the other batch dimensions), we need to carry this heterogeneous data in a suitable data structure.
NestedTensors provide a natural candidate for this task. Here is a list of the operations that need to be supported by NestedTensors in order to enable this feature:
Stacking nested tensors (for example, two of shape [[a, b], [a, c]] into a single one of shape [[[a, b], [a, c]], [[a, b], [a, c]]])
Heterogeneous CompositeSpec (#766)(PR #829)
Bug on how ParallelEnv sets the batch_size (#773)(PR #774)
Bug on using sorted() on CompositeSpec keys (#775)(PR #787)
Handling of the done flag when it has arbitrary dimensions (#776)(PR #788)
The _reset() method needs to be able to know which dimensions and indexes to reset (#790)(PR #800)
Collectors crash with environments with non-empty batch_size (#807)(PR #828)
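As an illustration of the nested shape notation used in the list above, the following sketch works on shapes only, with plain Python lists standing in for nested tensors. The function stack_nested_shapes is hypothetical, and the requirement that all stacked items share the same nested shape is an assumption of this sketch rather than a statement about how NestedTensors behave.

```python
def stack_nested_shapes(shapes):
    """Stack nested shapes along a new leading dimension.

    Each shape is a list of per-component shapes, e.g. [[a, b], [a, c]].
    """
    if not shapes:
        raise ValueError("need at least one shape to stack")
    first = shapes[0]
    for shape in shapes[1:]:
        # Assumption of this sketch: stacked items must match exactly.
        if shape != first:
            raise ValueError("all stacked items must share the same nested shape")
    # Copy the components so the result does not alias the inputs.
    return [[list(component) for component in shape] for shape in shapes]


a, b, c = 3, 5, 7  # arbitrary illustrative sizes
item = [[a, b], [a, c]]
stacked = stack_nested_shapes([item, item])
```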