-
I tried installing trfl version 1.0.1 in Colab and am getting an error:
`import trfl`
```
---------------------------------------------------------------------------
NotFoundError …
```
-
When I try to import trfl, as in [this](https://colab.research.google.com/drive/1yP8E9_CCO4NZ5XMYYrPOqSLfR4LlVeB0#scrollTo=Axy2D-N7InE9) public trfl Colab notebook, I get
(Note I tri…
-
Hi there, thanks for sharing your code -- it's been very helpful!
One question: is your implementation of A2C a 'genuine' actor-critic method? My (limited) understanding was that to qualify as …
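For reference, the distinction this question hinges on is where the critic's estimate appears: REINFORCE-with-a-baseline subtracts the value estimate from a full Monte Carlo return (the critic never enters the target), while a bootstrapping actor-critic puts the critic's own estimate of the next state inside the target. A minimal numpy sketch of the two advantage computations (hypothetical helper names, scalar rewards, episode assumed to end after the last step — not code from any repo discussed here):

```python
import numpy as np

def mc_advantages(rewards, values, gamma=0.99):
    """REINFORCE-with-baseline: Monte Carlo returns minus the value baseline.
    The critic only shifts the gradient; it never appears in the target."""
    returns = np.zeros_like(rewards, dtype=float)
    g = 0.0
    for t in reversed(range(len(rewards))):
        g = rewards[t] + gamma * g   # discounted return-to-go
        returns[t] = g
    return returns - values

def td_advantages(rewards, values, gamma=0.99):
    """Bootstrapping actor-critic: one-step TD error.
    The critic's own estimate V(s') appears inside the target."""
    next_values = np.append(values[1:], 0.0)  # terminal state has value 0
    return rewards + gamma * next_values - values

rewards = np.array([1.0, 1.0, 1.0])
values  = np.array([2.5, 1.8, 0.9])
print(mc_advantages(rewards, values))
print(td_advantages(rewards, values))
```

Both quantities weight the policy gradient, but only the TD version makes the learning target itself depend on the critic.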
-
I've been going through the update code for A2C in master/algo/a2c_acktr.py here:
```python
def update(self, rollouts):
    obs_shape = rollouts.obs.size()[2:]
    action_shape = rollout…
```
siddk updated 5 years ago
-
# Next paper candidates
Let's propose papers to study next! All papers mentioned in the comments of this issue will be listed in the next vote.
-
```python
class ContinousActorCritic(nn.Module):
    def __init__(self, observation_space, action_space, hidden_size, sigma=0.3):
        super(ContinousActorCritic, self).__init__()
        self.sigma = torch.te…
```
-
The A2C loss function used in this repository is the following:
```python
neglogpac = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=train_model.pi, labels=A)
pg_loss = tf.reduce_mea…
```
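For anyone unfamiliar with that TF1 call: `sparse_softmax_cross_entropy_with_logits` returns exactly -log π(a|s) for each sampled action, so the policy-gradient loss is the advantage-weighted mean of that quantity. A small numpy re-derivation (illustrative only, not the repository's code; variable names are my own):

```python
import numpy as np

def neglogp(logits, actions):
    """Sparse softmax cross-entropy: -log pi(a|s), row-wise.
    Mirrors tf.nn.sparse_softmax_cross_entropy_with_logits."""
    z = logits - logits.max(axis=1, keepdims=True)   # shift for numerical stability
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(actions)), actions]

def pg_loss(logits, actions, advantages):
    """A2C policy-gradient loss: mean of ADV * -log pi(a|s)."""
    return np.mean(advantages * neglogp(logits, actions))

logits = np.array([[2.0, 0.5, -1.0],
                   [0.1, 0.1, 0.1]])
actions = np.array([0, 2])
advs = np.array([1.0, -0.5])
print(pg_loss(logits, actions, advs))
```

Minimizing this loss increases the log-probability of actions with positive advantage and decreases it for negative advantage.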
-
Hi, I'm trying to run the example on a CUDA cluster. I am running:
adeptRL/0.1.1
glibc/2.14
python/3.7.0
…
-
Hi, I would like to experiment with my QGen training model on your .
When I read your code "train_qgen_reinforce.py" and the report, I realized you used a baseline (Q function) to reduce the variance…
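As a side note on why such a baseline helps: subtracting any action-independent baseline leaves the policy-gradient estimator unbiased (the score function has zero mean) but can cut its variance substantially. A toy numpy demonstration with a hypothetical Bernoulli policy (not the QGen code):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: a one-parameter Bernoulli policy with P(a=1) = p.
p = 0.7
n = 100_000
actions = rng.random(n) < p                                     # a ~ Bernoulli(p)
rewards = np.where(actions, 1.0, 0.2) + rng.normal(0, 0.5, n)   # noisy rewards

# Score function d/dp log pi(a):  a/p - (1-a)/(1-p)
score = np.where(actions, 1.0 / p, -1.0 / (1 - p))

grad_no_baseline = rewards * score
baseline = rewards.mean()                        # a constant baseline
grad_with_baseline = (rewards - baseline) * score

# Both estimators have (nearly) the same mean, but the baseline cuts variance.
print(grad_no_baseline.mean(), grad_with_baseline.mean())
print(grad_no_baseline.var(), grad_with_baseline.var())
```

The means agree up to sampling noise because E[score] = 0, so the baseline term contributes nothing in expectation; only the variance changes.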
-
https://github.com/MorvanZhou/Reinforcement-learning-with-tensorflow/blob/639902bdd8109e115b1e22575b8f9b467e09f863/contents/8_Actor_Critic_Advantage/AC_CartPole.py#L165
do you think we should speci…