-
### Is there an existing issue for this?
- [X] I have searched the existing issues
### Current Behavior
Did the author run the full pipeline on a single A100 with 40 GB of VRAM? Including the later PPO stage, which needs to hold two models in memory at the same time?
### Expected Behavior
_No response_
##…
-
I'm trying to run vanilla PPO against either a single reward model or an ensemble of 5 reward models.
Command: `accelerate launch --main_process_port=29503 --config_file configs/accelerate_config.…
-
Hi! Thanks for your great sharing!
I met the `72956 segmentation fault` when I tried to train the task with `Pixels` suffix like `FrankaPickPixels`.
Besides, I have finished the training success…
-
The policy is given the last recurrent state from the replay buffer and isn't reset between episode boundaries. In my case I have the number of updates set to the episode length, so I've added `rollou…
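A common fix for carrying recurrent state across episode boundaries is to mask the hidden state with the done flags at each step. The sketch below is illustrative only (the helper name `mask_hidden` and the shapes are assumptions, not the repository's code):

```python
import numpy as np

def mask_hidden(hidden, done):
    """Zero the recurrent state for environments whose episode just ended.

    hidden: (num_envs, hidden_dim) array of recurrent states
    done:   (num_envs,) array of 0/1 episode-boundary flags
    """
    return hidden * (1.0 - done)[:, None]

h = np.ones((3, 4))            # hidden states for 3 parallel envs
d = np.array([0.0, 1.0, 0.0])  # env 1 hit an episode boundary
h_next = mask_hidden(h, d)     # env 1's state is reset to zeros
```

Applying this mask before each policy step ensures the recurrent state from the replay buffer is not reused across episodes.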
bamos updated
5 years ago
-
When I run `python3 pytorch_rl/main.py --no-vis --env-name Duckietown-small_loop-v0 --algo a2c --lr 0.0002 --max-grad-norm 0.5 --num-steps 20`, there is an error telling me there is no directory pyto…
-
-
This is found via https://github.com/pytorch-labs/torchfix/
`torch.load` without `weights_only` parameter is unsafe. Explicitly set `weights_only` to False only if you trust the data you load and f…
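For checkpoints that contain only tensors and plain containers, the safe pattern is to pass `weights_only=True` (available since PyTorch 1.13). A minimal sketch, using an in-memory buffer rather than a real checkpoint file:

```python
import io
import torch

# Save a plain tensor state dict through an in-memory buffer.
buf = io.BytesIO()
torch.save({"w": torch.ones(2)}, buf)
buf.seek(0)

# weights_only=True restricts unpickling to tensors and simple containers,
# avoiding arbitrary-code execution from untrusted checkpoint files.
state = torch.load(buf, weights_only=True)
```

Only fall back to `weights_only=False` when the checkpoint comes from a source you fully trust, since full unpickling can execute arbitrary code.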
-
### What happened?
sympy2torch produces a module that fails when called if a function of a constant is present in the expression.
For example:
```
from sympy import symbols, exp
from pysr impor…
-
### Feature description
Categorical distribution sampling across multiple dimensions, like PyTorch's `multinomial` function.
### Feature motivation
I am trying to write a RL (PPO) algorithm in…
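In PyTorch this batched sampling is handled by `torch.distributions.Categorical`, which broadcasts over arbitrary leading batch dimensions, whereas `torch.multinomial` only accepts 1-D or 2-D inputs. A minimal sketch (the shapes are illustrative assumptions for an RL setting):

```python
import torch
from torch.distributions import Categorical

# Categorical treats the last dimension as the action distribution and
# broadcasts over all leading batch dimensions.
logits = torch.randn(4, 3, 5)       # e.g. (envs, agents, num_actions)
dist = Categorical(logits=logits)
actions = dist.sample()             # shape (4, 3): one action per env/agent
log_probs = dist.log_prob(actions)  # shape (4, 3), usable in a PPO loss
```

This is the usual way to sample per-agent or per-environment discrete actions in a single vectorized call.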
-
Currently, there is a working multi-agent PPO implementation here:
https://github.com/matteobettini/rl/blob/mappo_ippo/examples/multiagent/mappo_ippo.py
and a working single-agent DDPG impl…