-
Hi, the DPG critic update (see Algorithm 1 of Lillicrap et al. 2016, https://arxiv.org/abs/1509.02971) is substantively the same as your td_learning function; however, this is currently obscured. I wo…
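For concreteness, here is a toy numeric sketch of why the two coincide (my own illustration, not code from the repo; all scalars are placeholders): the DPG critic target r + γ·Q′(s′, μ′(s′)) is just a TD(0) target in which the bootstrap value happens to come from the target critic and target actor.
```python
import numpy as np

# Placeholder scalars standing in for one transition.
r_t, gamma = 1.0, 0.99
q_tm1 = 2.0       # Q(s_{t-1}, a_{t-1}) from the online critic
bootstrap = 3.0   # Q'(s_t, mu'(s_t)) from the target critic/actor

# DPG critic update (Algorithm 1): y = r + gamma * Q'(s', mu'(s'))
y_dpg = r_t + gamma * bootstrap

# Generic TD(0) update: target = r + gamma * v_t, loss on (target - v_tm1)
y_td = r_t + gamma * bootstrap
assert np.isclose(y_dpg, y_td)  # identical targets

td_error = y_td - q_tm1
critic_loss = 0.5 * td_error ** 2  # squared TD error, as in a td_learning-style loss
```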
-
Hi there, thanks for sharing your code -- it's been very helpful!
One question: is your A2C implementation a 'genuine' actor-critic method? My (limited) understanding was that to qualify as …
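For what it's worth, the criterion I had in mind is that the critic's estimate is bootstrapped into the target (and hence the advantage), rather than only subtracted from Monte-Carlo returns as a baseline. A minimal one-step sketch of that pattern (my own code, not the repo's; the networks and dummy data are placeholders):
```python
import torch
import torch.nn as nn

obs_dim, n_actions, gamma = 4, 2, 0.99
actor  = nn.Sequential(nn.Linear(obs_dim, 32), nn.Tanh(), nn.Linear(32, n_actions))
critic = nn.Sequential(nn.Linear(obs_dim, 32), nn.Tanh(), nn.Linear(32, 1))

# Dummy batch of transitions.
s      = torch.randn(8, obs_dim)
s_next = torch.randn(8, obs_dim)
a      = torch.randint(n_actions, (8,))
r      = torch.randn(8)
done   = torch.zeros(8)

log_prob = torch.log_softmax(actor(s), dim=-1).gather(1, a.unsqueeze(1)).squeeze(1)
v_s = critic(s).squeeze(1)
with torch.no_grad():
    v_next = critic(s_next).squeeze(1)       # critic bootstraps the target
target    = r + gamma * (1 - done) * v_next  # one-step TD target
advantage = (target - v_s).detach()          # critic-based advantage

actor_loss  = -(log_prob * advantage).mean()
critic_loss = 0.5 * (target - v_s).pow(2).mean()
(actor_loss + critic_loss).backward()
```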
-
## 🚀 Feature
Implement target derivative for `F.smooth_l1_loss`
## Motivation
I'm implementing an actor-critic algorithm. On the TD update step, I need gradients through both the input and the ta…
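One workaround I'm using for now is to write the smooth-L1/Huber expression out by hand, so autograd differentiates with respect to both arguments (only a sketch; the helper name is mine, and `beta=1.0` is assumed to match the standard definition):
```python
import torch

def smooth_l1_both_grads(pred, target, beta=1.0):
    # Elementwise smooth-L1 / Huber written out explicitly, so gradients
    # flow into both `pred` and `target` (hypothetical helper).
    diff = pred - target
    abs_diff = diff.abs()
    loss = torch.where(abs_diff < beta,
                       0.5 * diff.pow(2) / beta,
                       abs_diff - 0.5 * beta)
    return loss.mean()

q_pred = torch.randn(32, requires_grad=True)     # critic output
td_target = torch.randn(32, requires_grad=True)  # target that also needs grads
smooth_l1_both_grads(q_pred, td_target).backward()
print(q_pred.grad is not None, td_target.grad is not None)  # True True
```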
-
-
# Next paper candidates
Let's propose papers to study next! All papers mentioned in the comments of this issue will be listed in the next vote.
-
When I try to import trfl, just as in [this](https://colab.research.google.com/drive/1yP8E9_CCO4NZ5XMYYrPOqSLfR4LlVeB0#scrollTo=Axy2D-N7InE9) public trfl colab notebook, I get
(Note I tri…
-
Hello,
I am wondering how you provide the training flag for the batch normalization layers in the architecture that specifies your actor-critic function.
During inference when generating the actions …
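Concretely, is the intent something like the following? This is only a minimal Keras-style sketch, not your actual architecture (the `make_actor` helper and the dimensions are placeholders): pass `training=True` on the training path and `training=False` when generating actions, so BatchNorm switches between batch statistics and its moving averages.
```python
import tensorflow as tf

def make_actor(obs_dim, act_dim):
    # Hypothetical actor network with batch normalization.
    obs = tf.keras.Input(shape=(obs_dim,))
    h = tf.keras.layers.Dense(256)(obs)
    h = tf.keras.layers.BatchNormalization()(h)  # honours the `training` flag at call time
    h = tf.keras.layers.Activation("relu")(h)
    act = tf.keras.layers.Dense(act_dim, activation="tanh")(h)
    return tf.keras.Model(obs, act)

actor = make_actor(obs_dim=8, act_dim=2)
obs_batch = tf.random.normal((32, 8))

# Training graph: use batch statistics and update the moving averages.
train_actions = actor(obs_batch, training=True)

# Inference / action generation: use the stored moving averages instead.
infer_actions = actor(obs_batch, training=False)
```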
-
The policy loss in the HER+DDPG implementation is defined as follows:
```
self.pi_loss_tf = -tf.reduce_mean(self.main.Q_pi_tf)
self.pi_loss_tf += self.action_l2 * tf.reduce_mean(tf.square(self.ma…
```
-
I understand how to use the keras-rl framework in a limited train/test workflow, as demonstrated in some of the samples.
But how would one implement keras-rl in a scenario where one wants to depl…
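In case it helps, here is the kind of deployment flow I'm experimenting with. It is only a sketch: `dqn` is assumed to be the compiled keras-rl agent from the training script, the weights file name and the `act` helper are mine, and the observation has to be shaped to match the agent's model input (including keras-rl's window dimension).
```python
import numpy as np

# After training in the usual workflow (e.g. dqn.fit(env, ...)):
dqn.save_weights("dqn_weights.h5f", overwrite=True)

# In the deployed process, rebuild the same model/agent, then reload:
dqn.load_weights("dqn_weights.h5f")

def act(observation):
    # Query the agent's underlying Keras model directly instead of running
    # agent.test(); the input must match the model's expected shape.
    q_values = dqn.model.predict(np.asarray(observation)[None, ...])[0]
    return int(np.argmax(q_values))
```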
-
Is this extensible to policy-gradient or actor-critic architectures, or would one have to do major rework? I'm trying to decide whether to use this framework for a project or implement from scra…