-
Hi, I tried to run your code.
I ran train-qgen-reinforce.py (with the pre-trained model provided).
The initial score is similar to the accuracy reported in your README.
I have a question. Is the pre-trai…
-
Hi, I want to implement the DDPG algorithm, and before that, I've read your code. It's very useful. I still have a few small questions about the code.
1. As in DDPG.py, lines 61 to 66:
…
-
1) The training rule for the Actor network (Eq. 2) uses the gradient of the network times the reward difference A to update the neural network's parameters.
Is the term "delta_theta log pi_theta(s_t,…
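For reference, the term being asked about is usually the policy-gradient ascent direction: the parameters move along A * grad_theta log pi_theta. A minimal numpy sketch for a linear-softmax policy (all names here are illustrative assumptions, not the repo's code):

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def log_pi_grad(theta, s, a):
    """Gradient of log pi_theta(a|s) for a linear-softmax policy.
    theta: (n_actions, n_features); s: (n_features,)."""
    probs = softmax(theta @ s)
    grad = -np.outer(probs, s)   # -pi(b|s) * s for every action b
    grad[a] += s                 # plus s for the action actually taken
    return grad

def reinforce_update(theta, s, a, advantage, lr=0.1):
    # Eq. 2 in words: step theta along A * grad log pi (gradient ascent)
    return theta + lr * advantage * log_pi_grad(theta, s, a)

rng = np.random.default_rng(0)
theta = np.zeros((3, 4))
s = rng.normal(size=4)
theta2 = reinforce_update(theta, s, a=1, advantage=2.0)

# With a positive advantage, the taken action becomes more probable:
p_before = softmax(theta @ s)[1]
p_after = softmax(theta2 @ s)[1]
print(p_after > p_before)  # True
```

The sign of A decides the direction: a positive advantage raises the probability of the sampled action, a negative one lowers it.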
-
Hi VinF,
your library is very helpful. Thank you!
[Weight normalization](https://arxiv.org/abs/1602.07868) might be a way to make SGD-based algorithms suitable for a wider range of environments …
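For concreteness, the reparameterisation proposed in that paper writes each weight vector as w = g * v / ||v||, decoupling its direction from its magnitude. A tiny numpy sketch (names are illustrative):

```python
import numpy as np

def weight_norm(v, g):
    """Weight-normalization reparameterisation: w = g * v / ||v||."""
    return g * v / np.linalg.norm(v)

v = np.array([3.0, 4.0])  # direction parameters (||v|| = 5)
g = 2.0                   # scalar scale parameter
w = weight_norm(v, g)
print(np.linalg.norm(w))  # 2.0 -- the norm of w equals g by construction
```

Because ||w|| = g exactly, SGD can adjust the scale and direction of each unit independently, which is the claimed benefit for conditioning.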
-
First off, terrific work on the repo and blog post: very detailed and clear.
I was able to solve BipedalWalkerHardcore-v2 (average 300+ over 100 episodes) with an A3C implementation I made, but it t…
-
I cannot understand where the data about each start is saved.
Could you please tell me?
-
To foster community involvement, some richer sample code beyond MNIST should be tackled.
Generative Adversarial Networks are a hot topic in ML, and some sample code using Swift should help enco…
-
Dear TA,
Here is the pseudocode in reference [1]:
![2017-12-21_151522](https://user-images.githubusercontent.com/32902010/34244772-f1609d7a-e661-11e7-8128-0c20f2e13c06.jpg)
y_j is the estimate…
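In the DQN pseudocode, y_j is the bootstrapped target: the immediate reward, plus the discounted maximum Q-value of the next state when the episode has not terminated. A hedged sketch of just that line (the table-lookup Q_target stand-in below is my assumption, not the paper's network):

```python
import numpy as np

def dqn_target(reward, next_q_values, done, gamma=0.99):
    """y_j = r_j                                   if s_{j+1} is terminal
       y_j = r_j + gamma * max_a' Q(s_{j+1}, a')   otherwise"""
    if done:
        return reward
    return reward + gamma * float(np.max(next_q_values))

# next_q_values stands in for the target network's outputs over all actions
print(dqn_target(1.0, np.array([0.5, 2.0, -1.0]), done=False, gamma=0.9))  # 2.8
print(dqn_target(1.0, np.array([0.5, 2.0, -1.0]), done=True))              # 1.0
```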
-
Hi, Hongzi
If you don't mind, may I ask you a few more questions?
Firstly, I have two general questions:
1) Is the data flow of the critic network completely separate from that of the actor ne…
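In a typical actor-critic setup the answer tends to be: the two networks keep separate parameters and separate updates, and the only coupling is that the critic's value estimate feeds the actor's advantage. An illustrative numpy sketch (all shapes, names, and the linear models below are my assumptions, not Hongzi's code):

```python
import numpy as np

rng = np.random.default_rng(0)
theta_actor = rng.normal(size=(3, 4))   # policy parameters (actor only)
w_critic = rng.normal(size=4)           # value parameters (critic only)

def value(s):                 # critic: linear state value V_w(s)
    return float(w_critic @ s)

def policy_probs(s):          # actor: linear-softmax policy pi_theta(.|s)
    z = theta_actor @ s
    e = np.exp(z - z.max())
    return e / e.sum()

s, r, s_next, gamma = rng.normal(size=4), 1.0, rng.normal(size=4), 0.99

# The TD error is the only quantity the critic hands to the actor:
td_error = r + gamma * value(s_next) - value(s)

# Critic update touches only w_critic; actor update touches only theta_actor.
w_critic = w_critic + 0.01 * td_error * s
a = 1
grad_log_pi = -np.outer(policy_probs(s), s)
grad_log_pi[a] += s
theta_actor = theta_actor + 0.01 * td_error * grad_log_pi
```

So gradients never flow between the two parameter sets; only the scalar TD error crosses over.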
-
Hello,
I'm now facing another problem. I wanted to register several models to gather more information; here is the problem:
```golang
package main

import (
	"fmt"
	"time"
	"github.com/shixz…