-
This is a very loose roadmap of which major breaking changes to expect in Gym, and roughly when and in what order (last updated September 6, 2022):
October:
- [ ] Wrapper overhaul
- [ ] Official Cond…
-
## Motivation
Melting Pot (https://github.com/deepmind/meltingpot) is a great evaluation tool for MARL research.
## Solution
Integrate Melting Pot into EnvPool.
## Additional context
…
-
### Describe the bug
Hello, I used the Python SDK API to create a report at https://wandb.ai/costa-huang/cleanRL/reports/Atari-CleanRL-PPO-gym-vs-envpool--VmlldzoyODg0NzU2, but it cannot display the line…
-
Hi @jenkspt, I have been learning nanoGPT and reproducing it in JAX from scratch. Your repo has been a very helpful reference.
I encountered an issue with `optax.apply_every` and thought you might …
-
Hi,
The current PPO implementation does not seem to account for time limits. While the `EpisodeWrapper` from brax is used, which tracks a truncation flag ([source](https://github.com/google/brax/bl…
-
Due to the high computing power required for training, we will gradually upload data to the data hub and report the progress in this issue. We will also change the priority of training according to ne…
-
## Motivation
If I understand correctly, the speedup of envpool comes from its C++ implementation as opposed to Python. So, I wonder if the XLA interface will provide any more speedup when jitted by `j…
-
I'm trying to use a reverb replay buffer with a batched environment like 'envpool', where the API returns a batch of experience whenever either `.reset` or `.step` is called.
I'm guessing ther…
-
Currently, in `SubprocVectorEnv`, there is a single [send](https://github.com/thu-ml/tianshou/blob/2336a7db1b7ed2d27ef09462f4084f5a45daa008/tianshou/env/worker/subproc.py#L189) method to do both `rese…
-
### 🚀 Feature
Check the environment when creating a `VecEnv`
### Motivation
I noticed that [`check_env`](https://github.com/DLR-RM/stable-baselines3/blob/2bb8ef5e632a0e0dda291c2cd6735da75a4fc…