-
While training, the number of frames used so far is computed as
`total_num_steps = (j + 1) * args.num_processes * args.num_steps`
Shouldn't this be multiplied by the number of stacked frames (def…
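For reference, a minimal sketch of that counter (variable values here are placeholders, not taken from the repo's defaults); one common reading is that frame stacking only re-packages already-collected frames into the observation, so it would not change this count:

```python
# Hypothetical values standing in for the training-loop variables
# (j, args.num_processes, args.num_steps in the original code).
j = 9                 # current update index (0-based)
num_processes = 8     # parallel environments
num_steps = 128       # env steps collected per process per update

# Each update collects num_processes * num_steps new environment frames;
# stacked frames are reused observations, not newly simulated frames.
total_num_steps = (j + 1) * num_processes * num_steps
print(total_num_steps)  # 10240
```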
-
Hi p-christ,
Thanks for this amazing contribution. Recently, I tried the implementation of DDPG for MountainCar (with default parameters in results/Mountain_Car.py). However, the results are quite …
-
Hello, how do I evaluate the model? I used the test command from GDPL but got a low success rate.
The command and outputs are below.
python main.py --test True --load model_rl/best > result.txt
DEBUG:r…
-
System Info
Describe the characteristics of your environment:
Describe how the library was installed: pip
sb3-contrib==1.5.1a9
Python: 3.8.13
Stable-Baselines3: 1.5.1a9
PyTorch: 1.11.0+cu102…
-
See https://github.com/pytorch/pytorch/issues/975 for more info
PyTorch TRPO appears to be 50% slower than the TF version. Not sure about PPO, but I expect the wall-clock time gap to be the same.
To fix this is…
-
# URL
- https://arxiv.org/abs/2312.16682
# Affiliations
- Jing Xu, N/A
- Andrew Lee, N/A
- Sainbayar Sukhbaatar, N/A
- Jason Weston, N/A
# Abstract
- Practitioners commonly align large langu…
-
Hello. I'm attempting to run learn.py on the hover test environment, and wondering if anyone has had any luck with this so far.
I admittedly haven't tried 1E12 training steps quite yet, but after …
-
Hello,
I just have a quick suggestion for the documentation:
Running the command
`tensorboard --logdir=~\ray_results\PPO\editor\ --port=8008`
in a terminal while/after the program is tra…
-
**Is your feature request related to a problem? Please describe.**
We have posted a paper with code, [RRHF](https://github.com/GanjinZero/RRHF), that can achieve human alignment without RLHF. RRHF ne…
-
First, I want to thank you for your great work. It's very rare to find a trading reinforcement learning system with PPO.
I have an error when I run this code.
Since I don't have talib installed, I replace…