-
### Is there an existing issue for this?
- [X] I have searched the existing issues
### Current Behavior
Did the author run the full pipeline on a single A100 with 40 GB of VRAM? Including the later PPO stage, which needs to hold two models in memory at the same time?
### Expected Behavior
_No response_
##…
-
I'm trying to run vanilla PPO against either a single reward model or an ensemble of 5 reward models.
Command: `accelerate launch --main_process_port=29503 --config_file configs/accelerate_config.…
-
Hi! Thanks for your great sharing!
I met the `72956 segmentation fault` when I tried to train the task with `Pixels` suffix like `FrankaPickPixels`.
Besides, I have finished the training success…
-
The policy is given the last recurrent state from the replay buffer and isn't reset between episode boundaries. In my case I have the number of updates set to the episode length, so I've added `rollou…
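A common fix for carrying recurrent state across episode boundaries is to mask the hidden state with the done flags at each step. The sketch below is illustrative only (the helper name `mask_hidden` and the shapes are assumptions, not the repository's code):

```python
import numpy as np

def mask_hidden(hidden, done):
    """Zero the recurrent state for environments whose episode just ended.

    hidden: (num_envs, hidden_dim) array of recurrent states
    done:   (num_envs,) array of 0/1 episode-boundary flags
    """
    return hidden * (1.0 - done)[:, None]

h = np.ones((3, 4))            # hidden states for 3 parallel envs
d = np.array([0.0, 1.0, 0.0])  # env 1 hit an episode boundary
h_next = mask_hidden(h, d)     # env 1's state is reset to zeros
```

Applying this mask before each policy step ensures the recurrent state from the replay buffer is not reused across episodes.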
bamos updated
5 years ago
-
When I run `python3 pytorch_rl/main.py --no-vis --env-name Duckietown-small_loop-v0 --algo a2c --lr 0.0002 --max-grad-norm 0.5 --num-steps 20`, there is an error telling me there is no directory pyto…
-
-
This is found via https://github.com/pytorch-labs/torchfix/
`torch.load` without `weights_only` parameter is unsafe. Explicitly set `weights_only` to False only if you trust the data you load and f…
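For checkpoints that contain only tensors and plain containers, the safe pattern is to pass `weights_only=True` (available since PyTorch 1.13). A minimal sketch, using an in-memory buffer rather than a real checkpoint file:

```python
import io
import torch

# Save a plain tensor state dict through an in-memory buffer.
buf = io.BytesIO()
torch.save({"w": torch.ones(2)}, buf)
buf.seek(0)

# weights_only=True restricts unpickling to tensors and simple containers,
# avoiding arbitrary-code execution from untrusted checkpoint files.
state = torch.load(buf, weights_only=True)
```

Only fall back to `weights_only=False` when the checkpoint comes from a source you fully trust, since full unpickling can execute arbitrary code.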
-
### What happened?
sympy2torch produces a module that fails when called if a function of a constant is present in the expression.
For example:
```
from sympy import symbols, exp
from pysr impor…
-
### Feature description
Categorical distribution sampling across multiple dimensions, like PyTorch's `multinomial` function.
### Feature motivation
I am trying to write a RL (PPO) algorithm in…
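In PyTorch this batched sampling is handled by `torch.distributions.Categorical`, which broadcasts over arbitrary leading batch dimensions, whereas `torch.multinomial` only accepts 1-D or 2-D inputs. A minimal sketch (the shapes are illustrative assumptions for an RL setting):

```python
import torch
from torch.distributions import Categorical

# Categorical treats the last dimension as the action distribution and
# broadcasts over all leading batch dimensions.
logits = torch.randn(4, 3, 5)       # e.g. (envs, agents, num_actions)
dist = Categorical(logits=logits)
actions = dist.sample()             # shape (4, 3): one action per env/agent
log_probs = dist.log_prob(actions)  # shape (4, 3), usable in a PPO loss
```

This is the usual way to sample per-agent or per-environment discrete actions in a single vectorized call.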
-
Currently, there is a working multi-agent PPO implementation here:
https://github.com/matteobettini/rl/blob/mappo_ippo/examples/multiagent/mappo_ippo.py
and a working single-agent DDPG impl…