-
Here is the code from reinforce.py:

```python
for action, r in zip(self.saved_actions, rewards):
    action.reinforce(r)
```

And here is the code from actor-critic.py:

```python
for (action, value), r in zi…
```
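For context, the `.reinforce()` method was removed from PyTorch around version 0.4 in favor of `torch.distributions`. A minimal sketch of the equivalent REINFORCE update under the modern API (`policy_net` and the buffer names here are illustrative placeholders, not the example's actual code):

```python
import torch
from torch.distributions import Categorical

saved_log_probs, rewards = [], []  # filled in during the episode

def select_action(policy_net, state):
    # Sample an action and remember its log-probability; this bookkeeping
    # replaces the later call to action.reinforce(r).
    dist = Categorical(policy_net(state))
    action = dist.sample()
    saved_log_probs.append(dist.log_prob(action))
    return action.item()

def finish_episode(optimizer):
    # Gradient descent on -log_prob * r reproduces what reinforce(r) did.
    loss = torch.stack([-lp * r for lp, r in zip(saved_log_probs, rewards)]).sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    del saved_log_probs[:], rewards[:]
```

The same idea carries over to actor-critic.py, with `r` replaced by the advantage `r - value` in the policy term and a separate regression loss for the critic.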
-
Hello!
I noticed that the maximum number of episodes during training can be controlled with MAX_EPISODES, and that EVAL_INTERVAL determines the evaluation interval; however, the evaluation process seems to determi…
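For reference, a minimal sketch of how such a loop is commonly wired up (assuming MAX_EPISODES and EVAL_INTERVAL are plain constants; `train_episode` and `evaluate` are hypothetical helpers, not this repo's functions):

```python
MAX_EPISODES = 1000   # assumed value, for illustration only
EVAL_INTERVAL = 50    # assumed value, for illustration only

for episode in range(1, MAX_EPISODES + 1):
    train_episode()                  # one training episode
    if episode % EVAL_INTERVAL == 0:
        evaluate()                   # periodic evaluation pass
```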
-
Hello,
I had a quick question about the form of the value function. Right now, by default, it is an action-value function with a linear layer that receives the output of the decoder. I was wondering …
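For concreteness, a sketch of the head described above, alongside the state-value variant (layer names and sizes are illustrative, not the project's actual code):

```python
import torch
import torch.nn as nn

class ValueHead(nn.Module):
    """Linear layer over the decoder output: Q(s, .) by default, V(s) if state_value."""
    def __init__(self, decoder_dim: int, n_actions: int, state_value: bool = False):
        super().__init__()
        # n_actions outputs give an action-value head; a single output gives V(s).
        self.linear = nn.Linear(decoder_dim, 1 if state_value else n_actions)

    def forward(self, decoder_output: torch.Tensor) -> torch.Tensor:
        return self.linear(decoder_output)
```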
-
A3C: a.k.a. Asynchronous Advantage Actor-Critic
It uses MPI, so I wonder whether DeepMimic can be trained using A3C?
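To make the question concrete, the asynchronous part of A3C usually looks like the sketch below: each worker holds a local copy of the policy, computes gradients on its own rollouts, and pushes them onto a shared parameter set (`compute_actor_critic_loss` is a hypothetical placeholder; this is not DeepMimic's code):

```python
import torch

def worker_loop(shared_model, local_model, env, lr=1e-4):
    optimizer = torch.optim.Adam(shared_model.parameters(), lr=lr)
    while True:
        local_model.load_state_dict(shared_model.state_dict())  # sync down
        loss = compute_actor_critic_loss(local_model, env)      # hypothetical helper
        optimizer.zero_grad()
        loss.backward()
        # Push local gradients onto the shared parameters, then step.
        for lp, sp in zip(local_model.parameters(), shared_model.parameters()):
            sp.grad = lp.grad
        optimizer.step()
```

In A3C each worker runs this loop in its own process or thread; MPI could in principle play the same role as the shared-memory model, with gradients exchanged by message passing instead.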
-
Implement and explore the effectiveness of an actor-critic agent.
-
Hey @MikeInnes, if you are back, could you please review the code? The new models I have added are Dueling DQN, Advantage Actor-Critic, and DDPG. Also, all the previous work done on DQN has been added to d…
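Of the three, Dueling DQN is the one with a distinctive head. A sketch of the standard decomposition $Q(s,a) = V(s) + A(s,a) - \frac{1}{|A|}\sum_{a'} A(s,a')$ (layer sizes are illustrative, not the PR's actual architecture):

```python
import torch
import torch.nn as nn

class DuelingHead(nn.Module):
    def __init__(self, feature_dim: int, n_actions: int):
        super().__init__()
        self.value = nn.Linear(feature_dim, 1)              # V(s)
        self.advantage = nn.Linear(feature_dim, n_actions)  # A(s, a)

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        v, a = self.value(features), self.advantage(features)
        # Subtracting the mean advantage keeps V and A identifiable.
        return v + a - a.mean(dim=-1, keepdim=True)
```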
-
Hello,
I am trying to use this algorithm (rewritten in PyTorch with Gym vectorized envs) for motion imitation, starting with the PyBullet implementation of the DeepMimic environment. In the paper, …
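For reference, a minimal sketch of the vectorized-env setup being described (assuming the pre-0.26 `gym` API; the env id below is, to my understanding, PyBullet's registered DeepMimic walk task, which requires importing `pybullet_envs` first):

```python
import gym
import pybullet_envs  # registers the PyBullet env ids with gym

def make_env():
    return gym.make("HumanoidDeepMimicWalkBulletEnv-v1")

# Eight copies stepping in parallel subprocesses.
envs = gym.vector.AsyncVectorEnv([make_env for _ in range(8)])
obs = envs.reset()
obs, rewards, dones, infos = envs.step(envs.action_space.sample())
```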
-
In this example, https://github.com/keras-team/keras-io/blob/master/examples/rl/actor_critic_cartpole.py, the gradient for the actor is defined as the gradient of the loss $L = \sum \ln\pi\,(\mathrm{reward} - \mathrm{value})$.…
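For context, the loss in that example is built roughly as follows (a condensed sketch with illustrative names: the actor term is negated so that gradient descent maximizes $\sum \ln\pi\,(\mathrm{reward} - \mathrm{value})$, and the critic gets a separate Huber regression term):

```python
import tensorflow as tf

huber_loss = tf.keras.losses.Huber()

def compute_loss(action_log_probs, values, returns):
    # Actor: policy-gradient term weighted by the advantage (return - value).
    actor_loss = sum(-log_p * (ret - val)
                     for log_p, val, ret in zip(action_log_probs, values, returns))
    # Critic: push value estimates toward the observed returns.
    critic_loss = sum(huber_loss(tf.expand_dims(val, 0), tf.expand_dims(ret, 0))
                      for val, ret in zip(values, returns))
    return actor_loss + critic_loss
```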
-
I am getting the following error when doing RLHF training:

```
Traceback (most recent call last):
  File "/code/main.py", in
    rlhf_trainer.train()
  File "/code/trainer.py", in train
    self.lea…
```