-
Formula for off_policy_method:
total_timesteps = n_epochs * n_epoch_cycles * batch_size
Then, if
n_epochs=1400
n_epoch_cycles=20
batch_size=64
min_buffer_size=10^6
then total_timesteps=140…
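As a quick sanity check of the formula above, here is a minimal sketch that plugs in the quoted values (the variable names mirror the question; the formula itself is the one stated above, not an official API):

```python
# Sketch: total environment steps for an off-policy run,
# assuming total_timesteps = n_epochs * n_epoch_cycles * batch_size
n_epochs = 1400
n_epoch_cycles = 20
batch_size = 64

total_timesteps = n_epochs * n_epoch_cycles * batch_size
print(total_timesteps)  # 1792000
```

Note that `min_buffer_size` does not enter this product; it only controls how many transitions must be in the replay buffer before learning starts.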
-
Hi,
I am trying to use Arena in my research project. I have several general questions:
1) The [baseline tutorial videos ](https://sites.google.com/view/arena-unity/home/tutorials-baselines?authu…
-
### What happened + What you expected to happen
# What happened
I ran `PPO` with `RLModule` and `_enable_learner_api=True` using `framework="tf2"`.
The following error occurred:
```
Failu…
-
Hello,
First, let me thank you for open-sourcing this great framework. However, I am unable to run the training without getting the following error:
```
Traceback (most recent call last):
Fi…
-
### Question
**TL;DR: do you have baselines for performance on the environments using some popular MARL algorithm, say MADDPG or other?**
Hi there, first of all, thanks for maintaining MAMuJoCo. I…
-
Hello,
I am trying to benchmark your code on more tasks from deepmind/* but they are not working. There seems to be a bug in the `prepare_obs` function in `sbx/common/policies.py`. I attach stack tra…
-
First, thanks for the amazing repository! I wanted to load a pretrained model from Huggingface, which typically creates a folder with the `config.json` and the .bin file containing the weights inside.…
-
First, thanks for making this. It's very easy to get started with and has really helped me move things forward on a personal project of mine I've been struggling with for months. This is really awesom…
-
Hello, I am pretty new to MPI.
I am using stable-baselines DDPG for a custom environment. Everything is working fine and I am getting good results as well.
Question:
When I use MPI and run the co…