-
I wonder what TensorBoard's "value loss" means
in this tutorial:
https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Getting-Started-with-Balance-Ball.md#observing-training-progre…
-
In the code given on this page, it seems the agents are unrolled for 100 steps, after which the (possibly partial) trajectory is sent to the learner. If the trajectory is not finished at that point, t…
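For context, a common way to handle a rollout that is cut off before the episode ends (not necessarily what this repository does) is to bootstrap the return with the critic's value estimate of the last observed state. A minimal NumPy sketch of that idea, where `bootstrap_value` would come from V(s_T):

```python
import numpy as np

def discounted_returns(rewards, gamma, bootstrap_value=0.0, done=False):
    """Return targets for a fixed-length (possibly truncated) rollout.

    If the episode was cut off (e.g. after 100 steps) rather than terminating,
    the value estimate of the last observed state is used as a bootstrap so the
    partial trajectory still yields consistent targets.
    """
    returns = np.zeros(len(rewards))
    running = 0.0 if done else bootstrap_value
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns
```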
-
Hi
How can I get a deterministic action out of the PPO policy? I need to turn off the exploration noise for that, but there doesn't seem to be such a switch in the code.
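For reference, with a Gaussian policy the usual way to act deterministically at evaluation time is to take the distribution's mean instead of sampling. A minimal PyTorch sketch; the `deterministic` flag is hypothetical and not a switch that exists in this repository:

```python
import torch
from torch.distributions import Normal

def select_action(mean, log_std, deterministic=False):
    # Stochastic PPO action: sample from the Gaussian the policy outputs.
    # Deterministic evaluation: return the mean, i.e. drop the exploration noise.
    if deterministic:
        return mean
    return Normal(mean, log_std.exp()).sample()
```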
-
Dear author,
In your implementation of Soft Actor-Critic, there is no value function V(s)?
In the original SAC paper, the authors said such a value function can stabilize training and is c…
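For context, later versions of SAC drop the separate V(s) network and estimate the soft value directly from the twin Q networks and the policy's log-probability. A minimal sketch of that estimate, assuming PyTorch tensors `q1`, `q2`, `log_pi` for an action sampled from the current policy:

```python
import torch

def soft_value_estimate(q1, q2, log_pi, alpha):
    # Soft state value V(s) ~ min(Q1(s,a), Q2(s,a)) - alpha * log pi(a|s),
    # estimated with a single action a sampled from the current policy.
    return torch.min(q1, q2) - alpha * log_pi
```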
-
If tf.net could be connected to this, it would be a lot easier. Python often runs into compatibility problems that are not easy to debug.
Unity Machine Learning Agents Toolkit
https://github.com/…
-
Hi,
To my knowledge, Hopper-v1 is deprecated and Hopper-v2 is the standard Hopper environment as of today. Can someone confirm whether this is true?
In most of the RL papers, I see results where the au…
-
Deep Deterministic Policy Gradients ([DDPG][1]) and the Stable Baselines code are presented [here][2].
The actor-critic networks are created as follows:
normalized_obs = tf.clip_by_value(normali…
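For readers unfamiliar with that snippet, the pattern being set up is running-statistics normalization of the observation followed by clipping. A rough NumPy sketch of the idea; the names are illustrative and not the Stable Baselines API:

```python
import numpy as np

def normalize_and_clip_obs(obs, obs_mean, obs_std, clip=5.0):
    # Standardize with running mean/std, then clip to a fixed range so that
    # outlier observations cannot blow up the actor/critic inputs.
    normalized = (obs - obs_mean) / (obs_std + 1e-8)
    return np.clip(normalized, -clip, clip)
```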
-
I'd like to implement Hindsight Experience Replay (HER). This can be based on any goal-parameterized off-policy RL algorithm.
**Goal-parameterized architectures**: this requires a variable for…
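As a concrete illustration, here is a minimal sketch of HER's "final" relabeling strategy, assuming transitions stored as dicts with `achieved_goal`/`desired_goal`/`reward` keys (the names and the `compute_reward` helper are hypothetical):

```python
def her_relabel_final(episode, compute_reward):
    # Replay every transition of the episode with the desired goal replaced by
    # the goal actually achieved at the end of the episode, and the reward
    # recomputed for that substituted goal.
    final_goal = episode[-1]["achieved_goal"]
    relabeled = []
    for transition in episode:
        t = dict(transition)
        t["desired_goal"] = final_goal
        t["reward"] = compute_reward(t["achieved_goal"], final_goal)
        relabeled.append(t)
    return relabeled
```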
-
Hello, thanks for making this repo. I tried to connect my env and run it, but I get the following error:
**SyntaxError: Non-ASCII character '\xce' in file /home/at-lab/catkin_ws3/rl_pro_telu/mpo/mpo…
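For what it's worth, this is the standard Python 2 complaint about a source file containing non-ASCII bytes without an encoding declaration; adding a PEP 263 coding line at the top of the offending file (or running under Python 3) usually resolves it:

```python
# -*- coding: utf-8 -*-
# Placed on the first or second line of the file, this tells the Python 2
# parser how to decode non-ASCII characters such as '\xce'.
```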