-
Just a feature request! :)
Could you implement a crossover function in order to perform reinforcement learning? (A genetic algorithm, in my case :) )
And obviously, along with crossover we also need a simple mutat…
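The request above can be sketched minimally. This is a hedged illustration of the two requested operators, single-point crossover plus per-gene Gaussian mutation; the function names, the list-of-floats genome representation, and all parameters are assumptions for illustration, not the library's API.

```python
import random

def crossover(parent_a, parent_b):
    """Single-point crossover: swap tails of two equal-length genomes."""
    point = random.randint(1, len(parent_a) - 1)
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b

def mutate(genome, rate=0.01, scale=0.1):
    """Perturb each gene with probability `rate` by Gaussian noise."""
    return [g + random.gauss(0.0, scale) if random.random() < rate else g
            for g in genome]
```

In a typical loop one would select parents by fitness, apply `crossover`, then `mutate` the children before the next evaluation round.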
-
Not sure if this is already a feature or not; please forgive me and provide insight :)
While I haven't tried it yet, I understand that tune has support for search algorithms (like BO, spearmint, etc.), w…
-
## Problem with Signal
Signal has ***copious*** privacy issues, making it unfit for privacytools.io endorsement.
1. Users are forced to supply a phone number to Signal (https://github.com/privacy…
ghost updated
3 years ago
-
I find the code super interesting and left some random comments on API naming. Take them with a grain of salt! I might start contributing at some point.
rxwei updated
4 years ago
-
#### Description
Hi, I tried training a model, with
```
from gensim.models import Doc2Vec
model = Doc2Vec(min_count=1, window=10, size=100, sample=1e-4, negative=5, workers=7)
model.…
```
-
Hi
I do not know if you might consider this as a question that I can ask you. I have been working with a PPO agent code that seemed to train for the environment (custom) that I have. However, in or…
-
Hi I am getting the error below while running the code:
```
Traceback (most recent call last):
  File "tf14_runner.py", line 144, in
    runner.run(args)
  File "tf14_runner.py", line 114, i…
```
-
Deep Deterministic Policy Gradients ([DDPG][1]) and the Stable Baselines code are presented [here][2].
The actor-critic networks are created as follows:
normalized_obs = tf.clip_by_value(normali…
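The truncated snippet above applies running normalization to observations and then clips the result, a common preprocessing step in DDPG implementations. A minimal NumPy sketch of that pattern (the function name, the clip range, and the epsilon term are assumptions, not the quoted library's exact code):

```python
import numpy as np

def normalize_and_clip(obs, mean, std, clip_range=5.0, eps=1e-8):
    """Standardize observations with running mean/std, then clip to a
    fixed range so outliers cannot destabilize the actor-critic nets."""
    normalized = (obs - mean) / (std + eps)
    return np.clip(normalized, -clip_range, clip_range)
```

Clipping after normalization bounds the network inputs even when the running statistics are still poorly estimated early in training.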
-
# Reinforcement Learning
Study List
- [ ] Brief of Reinforcement Learning
- [ ] Methods
- [ ] Reasons to use it
- [ ] Preparation
- [ ] Q-learning
- [ ] Q-learning algorithm
- [ ] Q-learning strategy
- [ …
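For the Q-learning items in the study list above, the core of the algorithm is a single tabular update rule, Q(s,a) ← Q(s,a) + α·(r + γ·max_a' Q(s',a') − Q(s,a)). A minimal sketch (the dict-based table and function signature are illustrative assumptions):

```python
def q_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.99):
    """One tabular Q-learning step on table Q (dict keyed by (state, action)).

    Moves Q(s, a) toward the bootstrapped target r + gamma * max_a' Q(s', a').
    """
    best_next = max(Q.get((s_next, a2), 0.0) for a2 in actions)
    old = Q.get((s, a), 0.0)
    Q[(s, a)] = old + alpha * (r + gamma * best_next - old)
    return Q[(s, a)]
```

A full agent would wrap this in an episode loop with an exploration strategy (e.g. epsilon-greedy), which corresponds to the "Q-learning strategy" item in the list.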
-
In the algorithm pseudocode given in the paper, my understanding is that both the task-encoder update and the policy update perform SGD directly on Equation (10); this is a meta-update, which I assume can be understood simply as a single gradient step. In the source code, however, I see PPO and A3C along with their losses. Does TESP need to rely on the PPO or A3C policy-update scheme?
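The "single gradient step" reading of the meta-update described above can be sketched generically. This is not the TESP source; the parameter list, `grad_fn`, and learning rate are placeholder assumptions illustrating one plain SGD step on a loss such as Equation (10), as opposed to a PPO/A3C-style surrogate objective.

```python
def sgd_meta_update(theta, grad_fn, lr=1e-3):
    """One plain SGD step: theta <- theta - lr * grad(loss)(theta).

    `theta` is a list of scalar parameters and `grad_fn(theta)` returns
    the gradient of the meta-objective at theta, element-wise.
    """
    grads = grad_fn(theta)
    return [t - lr * g for t, g in zip(theta, grads)]
```

If the released code instead optimizes a PPO or A3C loss, the update would wrap this step with a clipped or advantage-weighted surrogate rather than differentiating Equation (10) directly.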