-
[paper](https://arxiv.org/pdf/1707.06347)
## TL;DR
- **I read this because:** to build background knowledge
- **task:** RL
- **problem:** Q-learning is too unstable, and TRPO is relatively complex. A data-efficient and scalable arch…
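For reference, the core update the linked paper proposes is the clipped surrogate objective, which bounds the policy ratio without TRPO's second-order constraint:

$$
L^{\mathrm{CLIP}}(\theta) = \hat{\mathbb{E}}_t\!\left[\min\!\big(r_t(\theta)\,\hat{A}_t,\ \operatorname{clip}(r_t(\theta),\,1-\epsilon,\,1+\epsilon)\,\hat{A}_t\big)\right],
\qquad r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{\text{old}}}(a_t \mid s_t)}
$$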
-
It would be good to have MuJoCo baseline results for ACKTR, A2C, TRPO, PPO, and DDPG after the MuJoCo v2 environment update, so there are benchmark results to compare against.
-
The link to the slides is invalid: [slides #1 (trpo)](https://docs.google.com/presentation/d/15Z_AVBsO9VuOSZ5uY-Q4by3tHKiRSENchhAKHhCxIOc/present?token=AC4w5VgM6o7lCOmwtNFI3lfzyPv2PHOpRQ%3A1511795215…
-
This is more of a question than an issue. I noticed that in the implementation of the above-mentioned algorithms, action limits are not taken into account. Environments handle this clipping internally…
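A minimal sketch of what caller-side handling could look like, assuming a gym-style `Box` action space with `low`/`high` bounds; the function names are illustrative, not from the repo:

```python
import numpy as np

def clip_action(raw_action, low, high):
    # Hard clipping to the action bounds -- what many environments
    # already do internally when given an out-of-range action.
    return np.clip(raw_action, low, high)

def squash_action(raw_action, low, high):
    # Alternative: squash an unbounded policy output into [low, high]
    # with tanh, so the policy itself respects the limits.
    return low + 0.5 * (np.tanh(raw_action) + 1.0) * (high - low)
```

Where the clipping happens matters for the stored transitions: if the environment clips silently, the replay buffer records an action the policy never actually executed.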
-
Package architecture (a shared-interface sketch follows this list):
- controllers:
> classic control: PID, pure pursuit, bang-bang, open-loop (velocity profile), ...
> optimal control: LQR, DDP, MPC, ...
> collision avoidance: RVO, O…
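A minimal sketch of a common interface these controllers could implement; all class and method names here are hypothetical, not from the package:

```python
from abc import ABC, abstractmethod

class Controller(ABC):
    """Hypothetical common interface for every controller in the package."""

    @abstractmethod
    def compute_control(self, state, reference):
        """Return a control command for the current state and reference."""

class PID(Controller):
    """Textbook PID on a scalar error, as one concrete example."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self._integral = 0.0
        self._prev_error = 0.0

    def compute_control(self, state, reference):
        error = reference - state
        self._integral += error * self.dt
        derivative = (error - self._prev_error) / self.dt
        self._prev_error = error
        return self.kp * error + self.ki * self._integral + self.kd * derivative
```

A shared base class like this would let the planning and simulation code swap PID, LQR, or MPC controllers without changing the call site.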
-
I'm trying to convert the coordinates of markers from the visual coordinates to canvas coordinates. If I do this with a camera with fov=0, everything works fine:
```python
import numpy as np
from vi…
```
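The truncated import suggests vispy; if so, one way to map marker positions into canvas pixels is through the visual's transform chain. This is a sketch under that assumption (the camera choice and marker data are placeholders), and the canvas generally needs to have been drawn once so the transform system is configured:

```python
import numpy as np
from vispy import scene

canvas = scene.SceneCanvas(size=(800, 600), show=True)
view = canvas.central_widget.add_view()
view.camera = 'turntable'  # placeholder; use the camera from the real script

pos = np.random.rand(10, 3).astype(np.float32)  # placeholder marker positions
markers = scene.visuals.Markers(parent=view.scene)
markers.set_data(pos)

# Map from the visual's local frame to canvas pixel coordinates.
tr = markers.get_transform(map_from='visual', map_to='canvas')
mapped = tr.map(pos)                      # homogeneous (N, 4) output
canvas_coords = mapped[:, :2] / mapped[:, 3:4]  # perspective divide
```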
-
- [x] check and fix C51 [deaab73]
- [x] check qrdqn [deaab73]
- [ ] check iqn
- [ ] check and fix Rainbow
- [ ] check on-policy buffer sampling
- [ ] check function `discounted_sum` (reference sketch below)
- [ ] check …
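For the `discounted_sum` item above, a straightforward reference implementation to check against; this assumes it should return the scalar discounted return of one reward sequence (the repo's semantics may differ):

```python
import numpy as np

def discounted_sum_reference(rewards, gamma):
    """Scalar discounted return: sum_t gamma**t * rewards[t]."""
    rewards = np.asarray(rewards, dtype=np.float64)
    return float(np.sum(gamma ** np.arange(len(rewards)) * rewards))

# Quick sanity check against a naive loop:
rewards, gamma = [1.0, 2.0, 3.0], 0.9
naive = sum(gamma ** t * r for t, r in enumerate(rewards))
assert np.isclose(discounted_sum_reference(rewards, gamma), naive)
```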
-
Hey there, I want to reproduce the work from your final thesis. I saw that you have some scripts to run PPO, DDPG, and TRPO, but the directory structure is hard for me to understand; can you exp…
-
Hi!
MeanKLBefore is defined in `optimize_policy` in npo.py:
```python
# npo.py
def optimize_policy(self, itr, samples_data):
    all_input_values = tuple(ext.extract(
        samples_data,
        …
```