-
Dear author,
I have read your paper on MuJoCo experiments and I am particularly interested in the hyperparameters used for PPO_GRU and A2C_GRU. I would greatly appreciate it if you could provide me…
-
## Problem
How can I log the training rewards, etc., to TensorBoard for the example GAIL training script?
```python
import numpy as np
import gymnasium as gym
from stable_baselines3 import PPO
…
```
-
Hi, I have a question about the dagger-value algorithm:
when updating the value network, why do you use `torch.max()` to take the larger loss?
What is the meaning of comparing these two losses? In my understa…
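For context, this pattern matches PPO-style value clipping: taking the element-wise maximum of the clipped and unclipped squared errors gives a pessimistic (upper-bound) loss, so the value update cannot profit from moving the estimate far outside the clip range around the old value. A minimal sketch of that idea (the function name and signature are illustrative, not from the repo):

```python
import torch

def clipped_value_loss(values, old_values, returns, clip_eps=0.2):
    # Unclipped squared error between current value estimates and returns.
    loss_unclipped = (values - returns) ** 2
    # Value estimates clipped to stay within clip_eps of the old estimates.
    values_clipped = old_values + torch.clamp(
        values - old_values, -clip_eps, clip_eps
    )
    loss_clipped = (values_clipped - returns) ** 2
    # torch.max keeps the larger (worse) of the two losses element-wise,
    # a pessimistic bound that discourages overly large value updates.
    return torch.max(loss_unclipped, loss_clipped).mean()
```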
-
I am currently training my agent to do two things: 1) collect and deposit food, and 2) take shelter when certain events occur in the environment. I am using GAIL and Behavioral Cloning with PPO to …
-
## Bug description
Example notebook [1_train_bc.ipynb](https://github.com/HumanCompatibleAI/imitation/blob/master/docs/tutorials/1_train_bc.ipynb) gives a `Namespace not found` error for `seals`…
-
## Bug description
Your adversarial model implementations, including GAIL and AIRL, do not work well in MuJoCo environments. I tested on Hopper, HalfCheetah, and Humanoid, and both AIRL and GAIL faile…
-
### 🐛 Bug
I am having a problem running the Atari example code on both the stable and main branches.
It seems the cfg is not correctly passed to gymnasium's `make`.
### To Reproduce
```
python train_pp…
```
-
Some of the tutorials contain hyperparameters that are not well optimized. Also, in some cases we say "increase this value to `x` to get actually good results". We should verify that those claims …
-
*Before proposing this issue, I searched for it in the documentation, the existing issues, and search engines.*
My goal is to reproduce the excellent performance of `GAIL` over `BC` in the setting of `Cartpole`, where the…
-
## Problem
Hi, I'm excited to use this amazing project.
I have an idea about GAIL-PPO. GAIL has the generator network and the discriminator network, while PPO has the actor network and the critic …