-
### What is your question?
My goal is to learn a single policy that is deployed to multiple agents (i.e. all agents learn the same policy, but are able to communicate with each other through a shar…
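The parameter-sharing idea the question describes can be sketched in a few lines of plain Python (all class and method names here are illustrative, not from any RL library): every agent holds a reference to the *same* policy object, so an update through any agent's experience changes the behaviour of all of them.

```python
# Hedged sketch of parameter sharing across agents; names are illustrative.
class SharedPolicy:
    def __init__(self, n_inputs):
        self.weights = [0.0] * n_inputs  # one shared parameter vector

    def act(self, obs):
        # trivial linear "policy", for illustration only
        return sum(w * o for w, o in zip(self.weights, obs))

class Agent:
    def __init__(self, policy):
        self.policy = policy  # a reference, not a copy

policy = SharedPolicy(n_inputs=2)
agents = [Agent(policy) for _ in range(3)]

# Updating the shared parameters is visible to every agent at once.
policy.weights[0] = 1.0
print(all(a.policy.act([2.0, 0.0]) == 2.0 for a in agents))  # True
```

Inter-agent communication would then be an extra input channel to `act`, which the shared weights also process identically for every agent.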
-
### ❓ Question
EDIT: After doing some more digging, I updated the post title and added more details with a newer version of SB3 (1.6.2).
I am using the OpenAI gym-retro env to train on games and migra…
-
### 🐛 Bug
When using the TensorBoard integration with SAC, no data are written to the events file.
Model training completes without problems, and the metrics are correctly stored in `self.logger.name_to…
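A minimal sketch (an assumption about the pattern, not SB3's actual code) of the record/dump flow SB3's `Logger` follows helps explain this symptom class: `record()` only buffers values in `name_to_value`, and nothing reaches the output backend (e.g. the TensorBoard events file) until `dump()` runs.

```python
# Illustrative stand-in for SB3's Logger record/dump pattern.
class Logger:
    def __init__(self):
        self.name_to_value = {}
        self.flushed = []          # stand-in for the events file

    def record(self, key, value):
        # Buffers the metric; does NOT write it anywhere yet.
        self.name_to_value[key] = value

    def dump(self, step):
        # Only here does buffered data reach the backend.
        self.flushed.append((step, dict(self.name_to_value)))
        self.name_to_value.clear()

logger = Logger()
logger.record("train/actor_loss", 0.5)
# Metric is buffered but unwritten: values visible in name_to_value,
# "events file" still empty -- consistent with the symptom above.
print(logger.flushed)   # []
logger.dump(step=100)
print(logger.flushed)   # [(100, {'train/actor_loss': 0.5})]
```

If `dump()` is never reached for a given writer, the buffered metrics simply never appear in the events file.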
-
### 🚀 The feature, motivation and pitch
An issue that has been debated ad nauseam and apparently still doesn't have an agreed-upon answer as of PyTorch 1.12 is how, or whether, to set a default device for…
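The design space under debate can be sketched library-agnostically: a process-wide default device plus a scoped override, with explicit arguments taking precedence. All names below are illustrative; this is not the PyTorch API.

```python
import threading

# Hedged sketch of a "default device" mechanism; not PyTorch's API.
_default = {"device": "cpu"}
_tls = threading.local()

def set_default_device(device):
    _default["device"] = device

def current_device(explicit=None):
    # Precedence: explicit argument > scoped override > global default.
    return explicit or getattr(_tls, "device", None) or _default["device"]

class use_device:
    """Context manager that overrides the default within a scope."""
    def __init__(self, device):
        self.device = device
    def __enter__(self):
        self._prev = getattr(_tls, "device", None)
        _tls.device = self.device
    def __exit__(self, *exc):
        _tls.device = self._prev

set_default_device("cuda:0")
print(current_device())          # cuda:0
with use_device("cpu"):
    print(current_device())      # cpu
print(current_device("cuda:1"))  # cuda:1
```

The thread-local override is the contentious part: a global default is simple, but scoped overrides interact with threading and with libraries that allocate tensors internally.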
-
### 🐛 Bug
The `log_std` tensor gets filled completely with NaNs and causes a `ValueError` exception during training with PPO.
I have tried both `use_expln=True` and `use_expln=False`, as mentioned in https://gi…
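One mechanism by which a diverging `log_std` turns into NaN can be shown in pure Python (a hedged illustration, not the SB3 code): once `exp(log_std)` overflows to infinity, downstream inf-arithmetic produces NaN.

```python
import math

# If log_std grows unboundedly, exp(log_std) overflows...
log_std = 1000.0
try:
    std = math.exp(log_std)
except OverflowError:
    std = math.inf           # float tensors saturate to inf instead

# ...and inf-arithmetic then yields NaN in later computations
# (e.g. normalisation terms of the form inf - inf).
nan_from_inf = std - std
print(math.isinf(std), math.isnan(nan_from_inf))  # True True
```

This is why options like `use_expln` exist: replacing the plain exponential with a function that grows more slowly keeps the standard deviation finite.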
-
Thanks for making this great repo.
I'm trying to run Dreamer-v2 and I can't get it to work.
Steps to reproduce:
- Install from main (commit 0fae2a9fc990b0b53332eccd4ea7ecba435fa71f)
- Instal…
-
This issue highlights some differences I found when comparing the [SAC implementation of Minghoa](https://github.com/rickstaa/Actor-critic-with-stability-guarantee/blob/master/LAC/SAC_cost.py) with t…
-
In this issue, the results of two new architectures, DLAC and LSAC, are compared with the original LAC algorithm. To do this, I will use the oscillator environment. I will also set the Environment and Al…
-
**Describe the bug**
First of all, I am not sure whether it is a bug or not.
I was training my own model with DDP enabled and ran into this error. Then I took examples/distributed_offline_tra…
-
### What happened + What you expected to happen
When running the SAC algorithm on the CartPole env, grid-searching over the tf2 and torch frameworks (following the tuned example [here](https://github.com/ray-…
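The grid-search setup described above can be sketched with the standard library (the config keys below are illustrative, not Ray Tune's API): take the Cartesian product of the framework axis with any other swept parameters and launch one trial per combination.

```python
import itertools

# Hedged sketch of a framework grid search; keys are illustrative.
grid = {"framework": ["tf2", "torch"], "seed": [0, 1, 2]}

trials = [dict(zip(grid, values))
          for values in itertools.product(*grid.values())]

print(len(trials))  # 6
print(trials[0])    # {'framework': 'tf2', 'seed': 0}
```

Comparing per-framework results then reduces to grouping the trial outcomes by the `framework` key.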