-
The Reactor: A Sample-Efficient Actor-Critic Architecture
https://arxiv.org/abs/1704.04651
-
I am training a robot with reinforcement learning using rsl_rl and Isaac Lab. While it works fine with simple settings, when I switch to more complex settings (such as domain randomization), the foll…
-
While running the book's DDPG reference code, I observed that the actor loss keeps rising as training progresses, and the critic loss also fluctuates up and down.
Isn't the actor loss defined as the negative Q-value? Doesn't an ever-increasing actor loss mean the Q-value of the chosen action keeps shrinking, which is the opposite of our goal of maximizing the action's Q-value?
So in the end, is the way to judge whether this algorithm is working to check whether the reward rises?
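For reference, a minimal sketch (not the book's exact code) of how the two DDPG losses are typically computed; it shows why a rising actor loss means the critic currently assigns lower Q-values to the actor's actions, and why the reward curve is the more reliable progress signal while the critic itself is still moving:
```python
# Minimal DDPG loss sketch. The actor loss is -Q(s, mu(s)), so a rising actor
# loss means the critic's estimate of the actor's actions is falling -- but the
# critic is a moving target, so judge training by the reward curve instead.
import torch

def ddpg_losses(actor, critic, target_actor, target_critic,
                state, action, reward, next_state, done, gamma=0.99):
    # Critic loss: TD error against the frozen target networks.
    with torch.no_grad():
        next_q = target_critic(next_state, target_actor(next_state))
        td_target = reward + gamma * (1.0 - done) * next_q
    critic_loss = torch.nn.functional.mse_loss(critic(state, action), td_target)

    # Actor loss: maximize Q(s, mu(s)) by minimizing its negative.
    actor_loss = -critic(state, actor(state)).mean()
    return actor_loss, critic_loss
```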
-
My training environment is a Docker image pulled from `deepspeed/deepspeed:v072_torch112_cu117`,
and I run it with `docker run -it --gpus all --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 --…
-
I have successfully run step 1 and step 2 and generated the models, but encountered an error when running step 3:
"RuntimeError: The size of tensor a (5120) must match the size of tensor b (20480) a…
-
# Learning to play Yahtzee with Advantage Actor-Critic (A2C) | dionhaefner.github.io
My in-laws are really into the dice game Yatzy (the Scandinavian version of Yahtzee). If you’re unfamiliar with th…
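Since the post is truncated here, a generic sketch of the A2C objective named in the title (not the author's code) may help as context; the coefficient values are illustrative:
```python
# Generic A2C loss: advantage-weighted log-probability for the policy, squared
# error for the value head, plus an entropy bonus to encourage exploration.
import torch

def a2c_loss(log_probs, values, returns, entropy,
             value_coef=0.5, entropy_coef=0.01):
    advantages = returns - values.detach()          # A(s, a) = R - V(s)
    policy_loss = -(log_probs * advantages).mean()  # maximize advantage-weighted log pi
    value_loss = (returns - values).pow(2).mean()   # regress V(s) toward returns
    return policy_loss + value_coef * value_loss - entropy_coef * entropy.mean()
```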
-
When using the hybrid engine, the output sequence is always 'a a a a ', while if I disable the hybrid engine, the output sequence is correct.
Here is my log with the hybrid engine:
```
***** Runn…
-
Hello,
I'd like to ask: what role does the VAE play in the middle of the actor-critic network, and what happens if it is removed?
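Without knowing the exact codebase, here is one common pattern the question may refer to, as a hedged sketch with illustrative names: the VAE encoder compresses a high-dimensional observation into a compact latent z, and both the actor and the critic consume z; removing the VAE forces both networks to learn from the raw observation directly, which is typically slower and less sample-efficient.
```python
# Hypothetical sketch: a VAE encoder feeds a low-dimensional latent z to both
# the actor and the critic. All names here are illustrative.
import torch
import torch.nn as nn

class VAEEncoder(nn.Module):
    def __init__(self, obs_dim, latent_dim):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(obs_dim, 256), nn.ReLU())
        self.mu = nn.Linear(256, latent_dim)
        self.log_var = nn.Linear(256, latent_dim)

    def forward(self, obs):
        h = self.backbone(obs)
        mu, log_var = self.mu(h), self.log_var(h)
        # Reparameterization trick: z = mu + sigma * eps.
        z = mu + torch.exp(0.5 * log_var) * torch.randn_like(mu)
        return z, mu, log_var

# Actor and critic read the latent z instead of the raw observation.
latent_dim, act_dim = 32, 4
actor = nn.Sequential(nn.Linear(latent_dim, 64), nn.ReLU(), nn.Linear(64, act_dim))
critic = nn.Sequential(nn.Linear(latent_dim + act_dim, 64), nn.ReLU(), nn.Linear(64, 1))
```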
-
Hi,
In step 3, running the following command gets an "OOM" when initializing the Ref Model (the Actor Model initializes perfectly):
```
Actor_Lr=9.65e-6
Critic_Lr=5e-6
deepspeed --master_port 12346 main…
```
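A hedged sketch, not the repository's exact code: one common way to fit a frozen reference model that OOMs at initialization is to build it under ZeRO stage 3 with its parameters offloaded to CPU. The config keys below are standard DeepSpeed ZeRO options; `ref_model` is a hypothetical handle to the already-constructed model, and whether your DeepSpeed-Chat version exposes a flag for this depends on the release you are running.
```python
# Assumed setup: initialize the frozen reference model under ZeRO-3 with
# CPU parameter offload so its weights do not compete with the actor for GPU memory.
import deepspeed

ref_ds_config = {
    "train_micro_batch_size_per_gpu": 1,
    "zero_optimization": {
        "stage": 3,
        "offload_param": {"device": "cpu", "pin_memory": True},
    },
    "fp16": {"enabled": True},
}
# ref_engine, *_ = deepspeed.initialize(model=ref_model, config=ref_ds_config)
```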
-
The actor reward graph should display both the value predicted by the critic network (the negative of the actor's optimization loss) and the actual reward once the training episode is complete.
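An illustrative sketch of that logging, with hypothetical names (`critic`, `actor`, `history`): record the critic's predicted value for the actor's current actions next to the realized return once the episode ends, so both curves can be drawn on the same graph.
```python
# Track the critic's prediction and the actual episode return side by side.
history = {"predicted": [], "actual": []}

def log_episode(actor, critic, states, episode_rewards):
    # Critic's estimate of the actor's actions, averaged over the episode's states.
    predicted = critic(states, actor(states)).mean().item()
    history["predicted"].append(predicted)
    # Actual (undiscounted) return observed after the episode completes.
    history["actual"].append(sum(episode_rewards))
```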