-
## In a nutshell
Policy gradient methods are used for a wide range of tasks, but choosing the policy update step size is difficult: if it is too small, convergence is slow, and if it is too large, training collapses. Building on TRPO, which constrains the distance between the policy distributions before and after an update, the authors developed PPO, a method that greatly simplifies the computation.
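For reference, the clipped surrogate objective that PPO optimizes in place of TRPO's explicit trust-region constraint (notation as in the paper, with the advantage estimate and the clipping parameter):

$$
L^{\mathrm{CLIP}}(\theta) = \hat{\mathbb{E}}_t\!\left[\min\!\big(r_t(\theta)\,\hat{A}_t,\ \operatorname{clip}(r_t(\theta),\,1-\epsilon,\,1+\epsilon)\,\hat{A}_t\big)\right],
\qquad
r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{\mathrm{old}}}(a_t \mid s_t)}
$$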
### Paper link
https://openai-public.s3-us-west-…
-
Hi,
Thanks a lot for this extremely useful implementation.
I just wanted to ask what the ZFilter class is. Is it used to standardize the observed state according to the running mean and std of t…
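For context: filters of this kind typically maintain a running mean and variance (e.g. with Welford's online update) and standardize each incoming observation. A minimal sketch of the idea, not the repo's actual ZFilter class:

```python
import numpy as np

class RunningStandardizer:
    """Running mean/std observation filter (Welford update), for illustration only."""

    def __init__(self, shape, clip=10.0, eps=1e-8):
        self.mean = np.zeros(shape)
        self.m2 = np.zeros(shape)   # sum of squared deviations from the mean
        self.count = 0
        self.clip = clip
        self.eps = eps

    def __call__(self, x):
        x = np.asarray(x, dtype=np.float64)
        # Welford's online update of the mean and the squared-deviation sum.
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)
        std = np.sqrt(self.m2 / max(self.count, 1)) + self.eps
        # Standardize with the running statistics and clip extreme values.
        return np.clip((x - self.mean) / std, -self.clip, self.clip)

# Hypothetical usage: obs_filter = RunningStandardizer(env.observation_space.shape)
#                     state = obs_filter(raw_state)
```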
-
I just noticed a conflict between the OGIP conventions and the FITS standard, which is explicitly commented on in the FITS time paper, whose definitions were taken over for the FITS standard version 4…
-
The current data flow is:
1. policy_old = policy
2. Use policy_old to interact with the environment and generate data
3. Use that data to update the policy model
4. policy_old = policy
In this flow, policy_old serves no purpose at all; put differently, if policy_old is removed from the code and policy is used in its place, the final result is exactly the same.
So is this really PPO?
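For context, a minimal sketch (PyTorch, hypothetical names) of where the old-policy log-probabilities enter the PPO update. What matters for the clipped objective is that `old_log_probs` are the log-probabilities of the policy that collected the data: if the code caches them at rollout time and reuses them across update epochs, a separate `policy_old` network copy can indeed be redundant, but if they are recomputed from the current policy the ratio stays at 1 and the clipping has no effect.

```python
import torch

def ppo_clip_loss(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Clipped surrogate loss; old_log_probs must come from the data-collecting policy."""
    ratio = torch.exp(new_log_probs - old_log_probs.detach())
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    # Maximize the surrogate objective, i.e. minimize its negative.
    return -torch.min(unclipped, clipped).mean()
```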
-
Hi, I have just installed the rllab environment and ran the example code trpo_cartpole_pickled.py successfully, getting the log files "debug.log params.pkl progress.csv variant.json". And when I am …
-
Attempting the Spinning Up tutorial using Windows and WSL2 by following the link given in the installation tutorial.
After setting up conda and WSL2, I created my conda environment, then followed the …
-
Edit `examples/tf/trpo_swimmer/ray_sampler.py` to use MultiprocessingSampler and you will get:
```sh
ValueError: Variable GaussianMLPPolicy/GaussianMLPModel/dist_params/mean_network/hidden_0/kerne…
```
-
```
The file contains the assignment for the second release.
Please ask questions if anything is unclear or, from your
point of view, could be interpreted in more than one way.
I.G.
```
Original issue reported on code.google.co…
-
Because I want to use ppo2 or trpo to sample a random policy and then use GAIL for imitation learning.
Can you share some ideas with me?
Your help would be greatly appreciated.
-
Some algorithms use a RunningMeanStd object and call update within the algorithm (e.g. ddpg, trpo_mpi),
while others rely on the VecNormalize env wrapper for observation normalization. Also, MPI support for VecNorm…
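For comparison, the wrapper-based approach keeps the running statistics in the environment layer rather than inside the algorithm. A minimal usage sketch, shown here with Gymnasium and Stable-Baselines3's VecNormalize purely for illustration (an assumption on my part; this issue itself concerns the baselines code):

```python
import gymnasium as gym
from stable_baselines3.common.vec_env import DummyVecEnv, VecNormalize

# The wrapper, not the algorithm, maintains running mean/std of observations
# (and optionally of returns) and normalizes them on every step.
venv = DummyVecEnv([lambda: gym.make("Pendulum-v1")])
venv = VecNormalize(venv, norm_obs=True, norm_reward=True, clip_obs=10.0)

obs = venv.reset()
for _ in range(5):
    action = [venv.action_space.sample()]
    obs, rewards, dones, infos = venv.step(action)
```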