-
Trying to follow the simple README instructions on an Ubuntu server with 2x 4090 GPUs and CUDA 12.4:
```bash
Installing the current project: cleanrl (2.0.0b1)
(cleanrl) ➜ cleanrl git:(master) po…
-
# Implementing Proximal Policy Optimisation
I've used some of the [PyTorch RFC](https://github.com/pytorch/rfcs/blob/master/README.md) template here for clarity.
**Authors:**
* @salmanmohammadi…
-
Hello, and apologies if I do this wrong; I don't contribute to open source often. I was attempting to run the PyTorch PPO implementation and kept getting several errors regarding the dimension of the obser…
-
While testing the model I get this:
Traceback (most recent call last):
  File "test.py", line 65, in <module>
    test(opt)
  File "test.py", line 55, in test
    state, reward, done, info = env.step(act…
-
`pip install stable-baselines3[extra]`
## Repro
```python
from stable_baselines3 import PPO
import torchdynamo
@torchdynamo.optimize("inductor")
def train():
model = PPO("MlpPoli…
-
First, thank you for your efforts in helping to bring accurate and performant RLHF techniques to the open-source community.
I'm raising this issue hoping to get some clarification on a couple implem…
-
Hello,
I followed the steps mentioned [here](https://github.com/ikostrikov/pytorch-a2c-ppo-acktr-gail#requirements) to install the requirements for this repository. There is one minor change: I am using vi…
-
**Existing code:**
The environment is reset only once, at the beginning of the training loop; that is, env.reset() is called only in the first epoch.
**Possibly correct training paradigm:**
I checked OpenAI spinning-…
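
To make the difference between the two reset strategies concrete, here is a minimal sketch. It assumes that the alternative under discussion is calling `env.reset()` at the start of every epoch, which is only one possible reading of the truncated report above. `ToyEnv` and `collect` are hypothetical names introduced for illustration and are not part of any of the repositories mentioned.

```python
class ToyEnv:
    """Hypothetical stand-in environment: every episode lasts `horizon` steps."""

    def __init__(self, horizon=5):
        self.horizon = horizon
        self.t = 0
        self.resets = 0  # how many times reset() has been called

    def reset(self):
        self.t = 0
        self.resets += 1
        return self.t

    def step(self, action):
        # Old-style 4-tuple return: (observation, reward, done, info).
        self.t += 1
        done = self.t >= self.horizon
        return self.t, 1.0, done, {}


def collect(env, epochs, steps_per_epoch, reset_each_epoch):
    """Run rollouts and return the number of episodes that ran to completion."""
    completed = 0
    obs = env.reset()  # both strategies reset once before training starts
    for epoch in range(epochs):
        if reset_each_epoch and epoch > 0:
            # Assumed alternative: start every epoch from a fresh episode,
            # discarding whatever episode was still in progress.
            obs = env.reset()
        for _ in range(steps_per_epoch):
            obs, reward, done, info = env.step(action=0)
            if done:
                completed += 1
                obs = env.reset()  # always reset when an episode ends
    return completed


continuing = ToyEnv(horizon=5)
fresh = ToyEnv(horizon=5)
episodes_continuing = collect(continuing, epochs=3, steps_per_epoch=7, reset_each_epoch=False)
episodes_fresh = collect(fresh, epochs=3, steps_per_epoch=7, reset_each_epoch=True)
```

With episodes that do not divide evenly into the epoch length, the reset-once variant carries a partial episode across the epoch boundary, while the per-epoch variant throws it away. Which behaviour is appropriate depends on how the advantage estimates treat trajectories cut off at the epoch boundary.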
-
env: reacher
algo: ppo
Error:
Traceback (most recent call last):
  File "/home/al/Desktop/pytorch-a2c-ppo-acktr-gail-master/main.py", line 196, in <module>
    main()
  File "/home/al/Desktop/pytorch-…
-
## Bug description
Hello,
I want to pass the policy learned from behavioural cloning in the imitation library to PPO. I thought it would be successful since they are both from the ActorCriticPolicy class,…