-
I am getting the following error when doing RLHF training:
```
Traceback (most recent call last):
  File "/code/main.py", in
    rlhf_trainer.train()
  File "/code/trainer.py", in train
    self.lea…
```
-
Hello,
In the [asynchronous dqn paper](http://arxiv.org/pdf/1602.01783v1.pdf), they also describe an on-policy method, the advantage actor-critic (A3C), which achieved better results than the other methods. Do …
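For concreteness, here is a minimal sketch of the advantage actor-critic update that paper describes (A3C runs this same update from several asynchronous workers); the network sizes, coefficients, and names here are illustrative assumptions, not the paper's exact configuration.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical small actor-critic network, for illustration only.
class ActorCritic(nn.Module):
    def __init__(self, obs_dim, n_actions):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU())
        self.policy = nn.Linear(128, n_actions)  # actor head: action logits
        self.value = nn.Linear(128, 1)           # critic head: state value

    def forward(self, obs):
        h = self.body(obs)
        return self.policy(h), self.value(h).squeeze(-1)

def a2c_loss(model, obs, actions, returns, value_coef=0.5, entropy_coef=0.01):
    """One advantage actor-critic update on a batch of transitions.

    `returns` are n-step discounted returns; the advantage is
    returns - V(s), and the critic regresses V(s) toward the returns.
    """
    logits, values = model(obs)
    log_probs = F.log_softmax(logits, dim=-1)
    advantage = returns - values.detach()  # do not backprop the actor term through the critic

    policy_loss = -(log_probs.gather(1, actions.unsqueeze(1)).squeeze(1) * advantage).mean()
    value_loss = F.mse_loss(values, returns)
    entropy = -(log_probs.exp() * log_probs).sum(dim=-1).mean()  # exploration bonus
    return policy_loss + value_coef * value_loss - entropy_coef * entropy
```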
-
Hi @ChintanTrivedi, I am using a modified version of your code to train an environment created with the Unity engine.
[I have modified the code to handle this.]
Action space = Continuous
Obse…
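In case it is useful context for others: one common way to adapt a discrete actor-critic to a continuous action space, which I assume is the kind of modification meant above, is a Gaussian policy head. A hedged sketch (module names and sizes are mine, not the repository's):
```python
import torch
import torch.nn as nn

# Hypothetical Gaussian policy head for a continuous action space.
class GaussianPolicy(nn.Module):
    def __init__(self, obs_dim, act_dim):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU())
        self.mu = nn.Linear(128, act_dim)                  # mean of the action distribution
        self.log_std = nn.Parameter(torch.zeros(act_dim))  # state-independent log std

    def forward(self, obs):
        h = self.body(obs)
        dist = torch.distributions.Normal(self.mu(h), self.log_std.exp())
        action = dist.sample()
        # Sum log-probs over action dimensions for the policy-gradient loss.
        return action, dist.log_prob(action).sum(dim=-1)
```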
-
## Motivation
### 1. Consistent style for `torch.nn.modules.loss.*Loss`
In `torch.nn.modules.loss`, there are many `*Loss` classes subclassing `nn.Module`. The `Loss.__init__()` does not take other `nn…
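For reference, the stock modules already follow a consistent pattern: `__init__` takes configuration only, and the tensors go to the call. A short example with the existing `nn.MSELoss` (the tensors are placeholders):
```python
import torch
import torch.nn as nn

loss_fn = nn.MSELoss(reduction='mean')        # __init__ takes configuration only
pred = torch.randn(4, 3, requires_grad=True)  # placeholder prediction
target = torch.randn(4, 3)                    # placeholder target
loss = loss_fn(pred, target)                  # tensors are passed at call time
loss.backward()
```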
-
# Asynchronous Methods for Deep Reinforcement Learning #
- Authors: Volodymyr Mnih, Adrià Puigdomènech Badia, Mehdi Mirza, Alex Graves, Timothy P. Lillicrap, Tim Harley, David Silver, Koray Kavukcuo…
-
## Describe the bug
Not quite sure whether this is supported behavior, but if I set `functional=True` for the A2C loss and `shifted=True` for `TD0Estimator`, I get an internal error.
## To Reproduce
…
-
In the code you say you are using the td_error actor-critic algorithm, but when actually computing the actor's gradient you use q rather than td_error. Suggested modification:
```python
def learn(self, s, a, r, s_):
    s, s_ = s[np.newaxis, :], s_[np.newaxis, :]
    next_a = [[i] for i in r…
```
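To make the suggested change concrete, here is a minimal sketch of an actor update driven by the TD error rather than by q; `actor` and `critic` are hypothetical PyTorch modules standing in for the repository's networks:
```python
import torch

def actor_critic_step(actor, critic, actor_opt, critic_opt, s, a, r, s_, gamma=0.9):
    # TD error: r + gamma * V(s') - V(s). This, not q, should scale the actor's gradient.
    with torch.no_grad():
        td_target = r + gamma * critic(s_)
    td_error = td_target - critic(s)

    critic_loss = td_error.pow(2).mean()  # critic regresses V(s) toward the TD target
    critic_opt.zero_grad()
    critic_loss.backward()
    critic_opt.step()

    log_prob = torch.distributions.Categorical(logits=actor(s)).log_prob(a)
    actor_loss = -(log_prob * td_error.detach()).mean()  # policy gradient weighted by td_error
    actor_opt.zero_grad()
    actor_loss.backward()
    actor_opt.step()
```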
-
I have been digging into your paper and code and noticed some potential discrepancies between the two. I would appreciate it very much if you could clarify:
1) in **training.py** line…
-
https://datawhalechina.github.io/easy-rl/#/chapter9/chapter9_questions&keywords
-
First of all - thank you very much for this repository! You have made diving into Reinforcement Learning easier!
About the issue: I think you should use `huber_loss` instead of `square_difference`. Loo…
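To illustrate the suggestion: the Huber loss is quadratic near zero but linear for large errors, so a single outlier TD error cannot blow up the gradient the way it does with a squared difference. A minimal comparison (the TD errors are placeholders):
```python
import torch
import torch.nn.functional as F

td_error = torch.tensor([0.5, 3.0, -8.0])  # placeholder TD errors

squared = td_error.pow(2).mean()  # gradient grows linearly with the error itself
huber = F.smooth_l1_loss(td_error, torch.zeros_like(td_error))  # per-element gradient capped at 1

# For |error| > 1 the Huber loss grows only linearly, so outlier transitions
# cannot dominate the update.
```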