-
### What is the problem?
In rollout_worker.py, when `cls` is `TFPolicy` or a subclass of `TFPolicy`, the following line fails:
```
policy_map[name] = cls(obs_space, act_space, merged…
-
I want to implement COMA with parl, and I use two `fluid.Program()` instances to train the critic and the actor respectively. However, I met two errors related to the optimizer.
### error 1:
code:
```python
def lear…
-
Paddle version: 1.5.1
Background: reproducing COMA, a multi-agent algorithm
Detailed error message:
### error 1:
code:
```python
def learn(self, obs, actions, last_actions, q_vals, lr):
"""
Args:
obs: [4*env*ba…
-
I'd love to see an asynchronous version of DDPG on here. Would anyone be able to help me with it?
Here are quick thoughts:
A3C seems to be king of the hill at the moment, but DDPG has some clear…
-
Have you tried a spectral normalization GAN and adding an L1 distance term to the WGAN loss? I wonder how these two changes would impact performance:
## 1. Replacing WGAN-GP with spectral normalization
Spectr…
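For reference, the core of spectral normalization is estimating the largest singular value of a weight matrix with power iteration and dividing the weights by it. A minimal NumPy sketch (the function name `spectral_norm` and the fixed iteration count are illustrative, not from any particular GAN codebase):

```python
import numpy as np

def spectral_norm(w, n_iters=5):
    """Normalize matrix w to have spectral norm ~1.

    Uses power iteration to estimate the largest singular value
    sigma(w), then returns w / sigma(w) -- the core operation of
    spectral normalization in SN-GAN.
    """
    u = np.random.default_rng(0).normal(size=w.shape[0])
    v = w.T @ u
    for _ in range(n_iters):
        # Alternate w.T and w applications; u and v converge to the
        # leading left/right singular vectors.
        v = w.T @ u
        v /= np.linalg.norm(v) + 1e-12
        u = w @ v
        u /= np.linalg.norm(u) + 1e-12
    sigma = u @ w @ v  # estimate of the largest singular value
    return w / sigma
```

In a training loop this would be applied to each discriminator weight matrix before the forward pass (frameworks typically keep `u` persistent across steps and run a single iteration per step).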
-
Really cool that you've been working on implementing that algorithm in Python. I've been thinking of doing this as well. As far as I can tell, you're the only one that's tried doing this yet, so I'm …
-
Dear authors,
Great and excellent work. Below is the list of supported models; we think some other methods are also crucial for certain applications.
Discrete-Action DQN
Parametric…
-
On this page: https://spinningup.openai.com/en/latest/spinningup/rl_intro2.html
More specifically in this diagram: https://spinningup.openai.com/en/latest/_images/rl_algorithms_9_15.svg
I am sur…
-
Whether or not a baseline is used, `PolicyGradientModel.reward_estimation` computes cumulative rewards within one batch using `util.cumulative_discount` with `cumulative_start=0.0`.
In my op…
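To make the point concrete, here is a minimal sketch of a backwards discounted cumulative sum with a configurable start value; the exact signature of `util.cumulative_discount` may differ, and the `start` parameter here is a hypothetical stand-in for `cumulative_start` (e.g. a bootstrap value for the state following the batch):

```python
import numpy as np

def cumulative_discount(rewards, discount, start=0.0):
    """Backwards discounted cumulative sum over a batch of rewards.

    out[t] = rewards[t] + discount * out[t+1], with out[T] seeded by
    `start`. With start=0.0 the return of the last step is just its
    immediate reward, which is the behavior discussed above.
    """
    out = np.empty(len(rewards))
    running = start
    for t in reversed(range(len(rewards))):
        running = rewards[t] + discount * running
        out[t] = running
    return out
```

For example, `cumulative_discount([1.0, 1.0, 1.0], 0.5)` yields `[1.75, 1.5, 1.0]`, whereas a nonzero `start` would propagate a value estimate from beyond the batch boundary into every step's return.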
-
TF Version: 2.0.0-dev20190214
Windows 10
Anaconda Python 3.6.5
GPU: GeForce GTX 1070 Max-Q Design
[Tensorflow 2.0 (gpu) nightly](https://pypi.org/project/tf-nightly-gpu-2.0-preview/) installed v…