-
Hi!
Let's bring the reinforcement learning course to the whole Russian-speaking community 🌏
Would you like to translate? Please follow the 🤗 [TRANSLATING guide](https://github.com/huggingface/tran…
-
## 🚀 Feature
There seem to be a fair few inefficiencies in the RL model code.
In both the VPG and DQN code, the network is computed twice, once to generate the trajectory and then once again in the…
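One way the VPG path could avoid this is to keep the log-probabilities (and their graph) from the rollout instead of re-running the network when computing the loss. A minimal sketch, assuming a PyTorch-style setup with a Gymnasium-style env; `policy`, `env`, and `horizon` are illustrative names, not the repo's actual API:

```python
import torch

def collect_trajectory(policy, env, horizon):
    """Roll out one episode, keeping the log-probs (with their graph) for the loss."""
    log_probs, rewards = [], []
    state, _ = env.reset()
    for _ in range(horizon):
        dist = policy(torch.as_tensor(state, dtype=torch.float32))
        action = dist.sample()
        log_probs.append(dist.log_prob(action))  # reused directly in the loss
        state, reward, terminated, truncated, _ = env.step(action.numpy())
        rewards.append(reward)
        if terminated or truncated:
            break
    return log_probs, rewards

def vpg_loss(log_probs, returns):
    # No second forward pass: the stored log-probs already carry gradients.
    return -(torch.stack(log_probs) * torch.as_tensor(returns, dtype=torch.float32)).sum()
```

The trade-off is that the whole trajectory's computation graph stays in memory until the update, which is usually acceptable for short on-policy rollouts.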
-
Current setup:
- DQN and Q-learning are already implemented with discrete control actions.

To be introduced:
- A policy gradient algorithm with continuous control actions.

Major changes:
1. Add a continuous…
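As a rough sketch of what the continuous-action policy could look like (names and architecture are illustrative, assuming a PyTorch-style codebase): a diagonal Gaussian whose mean comes from the network, so the policy-gradient update can keep using `log_prob` exactly as in the discrete case.

```python
import torch
import torch.nn as nn

class GaussianPolicy(nn.Module):
    """Maps a state to a diagonal Gaussian over continuous actions."""
    def __init__(self, state_dim, action_dim, hidden=64):
        super().__init__()
        self.mean = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, action_dim),
        )
        # State-independent log-std is a common simplification.
        self.log_std = nn.Parameter(torch.zeros(action_dim))

    def forward(self, state):
        return torch.distributions.Normal(self.mean(state), self.log_std.exp())
```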
-
You mentioned:

> When training with policy gradient (PG), you may need a reversed model. The reversed model is also trained on the Cornell movie-dialogs dataset, but with source and target reversed.

…
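For what it's worth, building the training set for that reversed model is just the same Cornell movie-dialogs pairs with source and target swapped; a tiny sketch with a hypothetical `dialog_pairs` list:

```python
# dialog_pairs: (utterance, reply) strings from Cornell movie-dialogs (hypothetical name)
dialog_pairs = [("how are you ?", "fine , thanks ."), ("where are we ?", "no idea .")]

# forward model learns reply given utterance; reversed model learns utterance given reply
forward_pairs = [(src, tgt) for src, tgt in dialog_pairs]
reversed_pairs = [(tgt, src) for src, tgt in dialog_pairs]
```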
-
I am trying to implement second-order gradients with tf_agents.
The reason I need a second-order gradient comes from the meta-learning algorithm [MAML](https://arxiv.org/abs/1703.03400).
First I c…
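In plain TensorFlow, the usual way to get a second-order gradient is to nest `tf.GradientTape`s and differentiate through the inner update; whether this composes cleanly with tf_agents' training loop is the open question here. A toy sketch of the MAML-style pattern:

```python
import tensorflow as tf

w = tf.Variable(2.0)
inner_lr = 0.1

with tf.GradientTape() as outer_tape:
    with tf.GradientTape() as inner_tape:
        inner_loss = w ** 3                    # toy inner-task loss
    grad = inner_tape.gradient(inner_loss, w)  # 3 * w**2
    w_adapted = w - inner_lr * grad            # inner (task) update, still a function of w
    outer_loss = w_adapted ** 2                # toy meta-loss evaluated after adaptation
# Differentiating through the inner update is what requires the second-order gradient.
meta_grad = outer_tape.gradient(outer_loss, w)
```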
-
Hello,
In the [asynchronous DQN paper](http://arxiv.org/pdf/1602.01783v1.pdf), they also described an on-policy method, the advantage actor-critic (A3C), which achieved better results than the other methods, do …
-
Hi!
Let's bring the reinforcement learning course to the whole Korean-speaking community 🌏 (currently 9 out of 77 complete)
Would you like to translate? Please follow the 🤗 [TRANSLATING guide](ht…
-
In Drake's `Distribution` class, we currently support the `Sample()` function: https://github.com/RobotLocomotion/drake/blob/40e116d44929301d261f15f4d79c0d29b1e8293f/common/schema/stochastic.h#L203-L213
…
-
I think you should use `tf.stop_gradient()` in https://github.com/coreylynch/async-rl/blob/master/a3c.py#L164. Otherwise, after some training the policy tends to use one action exclusively. Took me a …
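In case it helps anyone reading this later, the suggestion is roughly to treat the advantage as a constant in the policy loss, assuming that line is the usual actor term; `log_pi_a`, `V`, and `R` below are illustrative names, not the exact variables in a3c.py:

```python
import tensorflow as tf

def a3c_losses(log_pi_a, V, R):
    """log_pi_a: log-probs of the taken actions, V: predicted values, R: n-step returns."""
    # Without stop_gradient, the policy loss also backpropagates into the value
    # estimate through (R - V); in practice this can collapse the policy onto
    # a single action.
    advantage = tf.stop_gradient(R - V)
    policy_loss = -tf.reduce_sum(log_pi_a * advantage)
    value_loss = 0.5 * tf.reduce_sum(tf.square(R - V))  # value head still learns from R - V
    return policy_loss, value_loss
```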
-
I have been using these two routines to figure out the best learning rate to apply with awesome results on SAC. However, the changes in the `temperature` alter those values along the way. Probably wou…
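For context on why those values drift: with automatic temperature tuning, SAC adjusts alpha by gradient descent on its own loss, so the scale of the entropy term in the actor objective keeps changing during training. A sketch of the standard alpha update (TensorFlow only for illustration; names are not taken from any particular library):

```python
import tensorflow as tf

action_dim = 6                        # e.g. a 6-dim continuous action space (illustrative)
target_entropy = -float(action_dim)   # common heuristic: -|A|
log_alpha = tf.Variable(0.0)          # optimize log(alpha) so alpha stays positive

def temperature_loss(log_pi):
    """log_pi: log-probs of actions sampled from the current policy."""
    alpha = tf.exp(log_alpha)
    # Pushes alpha up when policy entropy is below the target and down when it
    # is above, so alpha (and with it the actor-loss scale) keeps moving.
    return -tf.reduce_mean(alpha * (tf.stop_gradient(log_pi) + target_entropy))
```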