-
It seems that your code produces an error if the length of the trajectory is less than 2 (`len(tmp_observations) < 2`). I tested this on PPO; I don't know whether this happens with all algorithms.
The error:
ValueError…
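As a possible workaround (a sketch only, not the library's fix — `tmp_observations` and the two-step minimum come from the report above, the helper name is mine), short trajectories could be filtered out before the update step:

```python
def filter_short_trajectories(trajectories, min_len=2):
    """Drop trajectories shorter than min_len so the update step
    never sees a trajectory it cannot compute returns/advantages for."""
    return [t for t in trajectories if len(t) >= min_len]

# Hypothetical usage: each inner list stands in for one trajectory of steps.
batch = filter_short_trajectories([[1, 2, 3], [1]])
# Only the first trajectory survives; the length-1 one is dropped.
```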
-
(StreetFighterAI) D:\街霸ai\street-fighter-ai\main>python train.py
Using cpu device
Wrapping the env in a VecTransposeImage.
Training currently runs on the CPU. How can I make it use the GPU instead?
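This looks like stable-baselines3 output; its `PPO` constructor accepts a `device` argument (`"cuda"`, `"cpu"`, or `"auto"`). A minimal sketch of selecting the device explicitly (the commented `PPO(...)` line is illustrative, not taken from your script):

```python
import torch

# Use the GPU when PyTorch can see one, otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

# model = PPO("CnnPolicy", env, device=device)  # illustrative: pass device through to SB3
```

Note that if `torch.cuda.is_available()` is `False`, the problem is usually a CPU-only PyTorch install rather than the training script.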
-
### 🚀 Feature
Stochastic Weight Averaging (SWA) is a recently proposed technique that can potentially help improve training stability in DRL. There is now an implementation in `torchcontrib`. Quoting/p…
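The averaging at the heart of SWA is just a running mean of weight snapshots collected periodically during training; a minimal pure-Python sketch of that running mean (independent of `torchcontrib`; class and method names are mine):

```python
class RunningWeightAverage:
    """Maintain the SWA running mean: w_swa <- (w_swa * n + w) / (n + 1)."""

    def __init__(self):
        self.n = 0       # number of snapshots averaged so far
        self.avg = None  # element-wise mean of the snapshots

    def update(self, weights):
        if self.avg is None:
            self.avg = list(weights)
        else:
            self.avg = [(a * self.n + w) / (self.n + 1)
                        for a, w in zip(self.avg, weights)]
        self.n += 1

swa = RunningWeightAverage()
for w in ([0.0, 2.0], [2.0, 4.0], [4.0, 6.0]):  # snapshots of two parameters
    swa.update(w)
# swa.avg is now the element-wise mean of the three snapshots: [2.0, 4.0]
```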
-
Updates from:
- https://github.com/jacobhilton/deep_learning_curriculum (focus on transformers)
- Raschka book
1. Math prerequisites
Taking a derivative to find a point of minimum or maxim…
-
I am implementing a version of PPO in MLX and wanted to benchmark it against my PyTorch implementation. Sadly, the performance (samples per second) was really quite bad, so I benchmarked all the diffe…
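For reference, the kind of timing loop I'd use to measure samples per second for an individual step (a generic sketch, not tied to MLX or PyTorch; names are mine):

```python
import time

def samples_per_second(step_fn, batch_size, n_iters=100):
    """Time n_iters calls of step_fn and report throughput in samples/sec."""
    start = time.perf_counter()
    for _ in range(n_iters):
        step_fn()
    elapsed = time.perf_counter() - start
    return n_iters * batch_size / elapsed

# Illustrative usage with a dummy workload standing in for one training step:
rate = samples_per_second(lambda: sum(range(1000)), batch_size=64, n_iters=50)
```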
-
While training, the number of frames used so far is computed as
`total_num_steps = (j + 1) * args.num_processes * args.num_steps`
Shouldn't this be multiplied by the number of stacked frames (def…
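To make the question concrete, here is the quoted formula with illustrative numbers (the values are made up; the real ones come from `args`):

```python
# Illustrative numbers only: after j + 1 = 10 updates with 8 parallel
# processes and 128 steps per rollout.
j, num_processes, num_steps = 9, 8, 128

# The formula from the training loop: environment steps collected so far.
total_num_steps = (j + 1) * num_processes * num_steps
# With frame stacking, each observation reuses the previous k frames,
# so whether this count should also be scaled by k is exactly the question.
```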
-
I can run the code on PongNoFrameskip-v4 without problems:
`python main.py --env-name "PongNoFrameskip-v4" --algo ppo`
However, when I run the code on CartPole-v0:
`python main.py --env-name "Cart…
-
**Are you requesting a feature or an implementation?**
To handle partially observable MDP tasks, recurrent policies are currently quite popular. We need to add an LSTM layer after the original conv (or mlp) …
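A minimal PyTorch sketch of the proposed shape — an LSTM inserted after the feature extractor (an MLP here for brevity), then an action head. All sizes and names are illustrative:

```python
import torch
import torch.nn as nn

class RecurrentPolicy(nn.Module):
    """MLP feature extractor followed by an LSTM, then an action head."""

    def __init__(self, obs_dim, hidden_dim, n_actions):
        super().__init__()
        self.features = nn.Sequential(nn.Linear(obs_dim, hidden_dim), nn.ReLU())
        self.lstm = nn.LSTM(hidden_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, n_actions)

    def forward(self, obs_seq, state=None):
        # obs_seq: (batch, seq_len, obs_dim); state carries (h, c) across rollouts.
        x = self.features(obs_seq)
        x, state = self.lstm(x, state)
        return self.head(x), state

policy = RecurrentPolicy(obs_dim=4, hidden_dim=32, n_actions=2)
logits, state = policy(torch.zeros(1, 5, 4))  # logits: (1, 5, 2)
```

The returned `state` would need to be stored in the rollout buffer and re-fed at the start of each sequence, which is the main implementation cost of this feature.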
-
Nice work! I'd like to ask whether there is a multi-node example, or scripts to reproduce the experiments in the paper. From the code it looks like the machines must be launched via slurm — is that right? For example, if I wanted to reproduce the 70B+70B end-to-end experiment from the paper, could you provide steps and suggestions? Thanks!
-
Hello,
I am trying to use this algorithm (rewritten in PyTorch with Gym vectorized envs) for motion imitation, starting with the PyBullet implementation of the DeepMimic environment. In the paper, …