proximal-policy-optimization Search Results

173 results for proximal-policy-optimization


alifanov/algotrading #1

Training EC Slowly ?

Hi alifanov, thanks for sharing your code; it's a very good example. I tried to train simple-EC, but training feels very slow. Doesn't training EC synchronously make use of more CPUs? T…

cn3c3p updated 7 years ago
6
HumanCompatibleAI/imitation #835

Integration of Rule-Based Bot Actions for Imitation Learning

Background: I am currently working on a custom environment for the Battle City game using Gymnasium and Stable Baselines v3. My objective is to train an agent using the Proximal Policy Optimization…

vladyskai updated 8 months ago
2
junxiaosong/AlphaZero_Gomoku #49

Question about controlling the learning rate with KL divergence

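Hello, I noticed that the code controls the learning rate by comparing the KL divergence between the outputs of the old and new neural networks. During my experiments the learning rate first increased quickly and then gradually decreased, which suggests the method really does work. Is there any literature that describes this method, or did you devise it from your own experience?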

rommeldhy updated 4 years ago
4
RLE-Foundation/rllte #30

[Progress Report] Construction of RLLTE Data Hub

Due to the high computing power required for training, we will gradually upload data to the data hub and report the progress in this issue. We will also change the priority of training according to ne…

yuanmingqi updated 1 year ago
3
AnSrwn/Parkr #10

Train model with SAC or GAIL

> The next step would be to train an agent with two optimization algorithms. For this, you could use the PPO and DQN algorithms from the reinforcement learning field. You could, however, …

AnSrwn updated 4 years ago
1
dennybritz/reinforcement-learning #238

Reinforcement learning policy

I want to make a project using reinforcement learning in which a bot sends scams to other bots on social media, and the other bots detect the scam and reject it. I think it needs a deep reinforcement learning…

Comp-Engr18 updated 7 months ago
1
usc-bbdl/usc-bbdl.github.io #86

Publication info - Ali 3

Delete all nonrelevant information from this template when submitting your issue request: #Publication ``` Title: The utility of tactile force to autonomous learning of in-hand manipulation is …

marjanin updated 4 years ago
2
AkihikoWatanabe/paper_notes #807

Secrets of RLHF in Large Language Models Part I: PPO, Rui Zh…

# URL - https://arxiv.org/abs/2307.04964 # Affiliations - Rui Zheng, N/A - Shihan Dou, N/A - Songyang Gao, N/A - Wei Shen, N/A - Binghai Wang, N/A - Yan Liu, N/A - Senjie Jin, N/A - Qi…

AkihikoWatanabe updated 1 year ago
2
202219807/700099_MSC_22_039 #6

Design neural network

202219807 updated 1 year ago
2
Farama-Foundation/stable-retro #112

Stable-retro Raspberry Pi 5 ForkServerProcess-1Error

I am trying to train a model using PPO, and the stable-baselines3[extra] library is also installed. The issue occurs because the StochasticFrameSkip object does not have an action_space attribute, l…

StartaBafras updated 3 months ago
2
