trpo Search Results - Githubissues

783 results
for trpo


rlworkgroup/garage #1020

PyTorch on CPU is slower than TF

See https://github.com/pytorch/pytorch/issues/975 for more info. PyTorch TRPO appears 50% slower than TF. Not sure about PPO, but I expect the wall-clock time gap will be the same. To fix this is…

ryanjulian updated 3 years ago
4
UIKit0/trpo #6

Working with the customer...

``` The file contains the assignment for the second release. Please ask questions if anything is unclear or, from your point of view, could be interpreted in different ways. I.G. ``` Original issue reported on code.google.co…

GoogleCodeExporter updated 9 years ago
19
google-deepmind/dm_control #220

dm_control.rl.control.PhysicsError: Physics state is invalid…

Hi, I am trying to reproduce the experiment from the paper "Learning human behaviors from motion capture by adversarial imitation". I used the code from https://github.com/huiwenzhang/merel-mocap-gail. Bu…

13253591602 updated 2 years ago
2
openai/baselines #557

deepq and trpo_mpi call two resets consecutively, and resent…

Reinforcers (that was cheesy...), both deepq and trpo_mpi call two resets consecutively for every "done" sent into the algorithm. Not sure what the facility of that is. Also, they both exit o…

hellandhansen updated 6 years ago
3
rail-berkeley/softlearning #76

SAC Hyperparameters MountainCarContinuous-v0 - Env with dece…

Hello, I've tried in vain to find suitable hyperparameters for SAC in order to solve MountainCarContinuous-v0. Even with hyperparameter tuning (see "add-trpo" branch of [rl baselines zoo](https:…

araffin updated 4 years ago
10
wojzaremba/trpo #7

About kl_firstfixed

Thanks for the implementation of TRPO. Some details do not make sense to me so far. I can't see why kl_firstfixed is defined as follows: `kl_firstfixed = tf.reduce_sum(tf.stop_gradient…

PeiYingjun updated 6 years ago
2
rock-learning/bolero #9

Add more policy search algorithms and policy representations

Policy Search - [ ] [PI2](http://proceedings.mlr.press/v9/theodorou10a/theodorou10a.pdf), is already implemented #28 - [ ] [PoWER](http://www.ias.informatik.tu-darmstadt.de/publications/peters_ADPR…

AlexanderFabisch updated 5 years ago
1
rlworkgroup/garage #2200

update documentation on how to use rnns with tf/torch[pendin…

The error a contributor got when using the `categoricalgrupolicy` with `TRPO` on the `tf` branch while computing backward passes was ``` tensorflow.python.framework.errors_impl.InvalidArgumentError: No…

avnishn updated 3 years ago
3
rlworkgroup/garage #2190

tf/NPO is not compatible with max_episode_length=None

I am creating a child class that inherits from TRPO, but upon initializing the optimizer, I get `TypeError: Failed to convert object of type to Tensor` Here is the overall stacktrace: ```py…

ManavR123 updated 3 years ago
5
LaurentMazare/tch-rs #797

PPO example is actually A2C

I noticed while browsing the RL examples that the PPO implementation is actually A2C (which there's already an example for). On line 141, this line: ```Rust let action_loss = (-advantages.detach()…

Boxxfish updated 1 year ago
1
