-
The reward is being fed directly into the layer that outputs actions, so the update is not performed correctly.
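For context, a minimal PyTorch-style sketch (illustrative only, not this project's code) of the usual policy-gradient pattern, where the reward scales the loss rather than being fed into the action-output layer:

```python
import torch

# REINFORCE-style update sketch: the observation is the only network input;
# the return multiplies the log-probability inside the loss.
policy = torch.nn.Sequential(
    torch.nn.Linear(4, 32), torch.nn.Tanh(), torch.nn.Linear(32, 2)
)
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)

obs = torch.randn(1, 4)                        # observation only, no reward appended
dist = torch.distributions.Categorical(logits=policy(obs))
action = dist.sample()

ret = torch.tensor(1.0)                        # placeholder discounted return
loss = -(dist.log_prob(action) * ret).mean()   # the reward enters here, in the loss

optimizer.zero_grad()
loss.backward()
optimizer.step()
```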
-
```
Traceback (most recent call last):
  File "./Runners/EvalGFPPO.py", line 82, in
    runner.run(num_learning_iterations=iterations, log_interval=cfg_train["learn"]["save_interval"])
  File "./Algor…
```
-
Hello, I found the following code in `sheeprl/algos/dreamer_v3.py`:
```python
# Train the agent
if update >= learning_starts and updates_before_training
```
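For readers unfamiliar with that gate, here is a minimal sketch of the pattern as I understand it (hypothetical variable values, not sheeprl's actual defaults): training is skipped until `learning_starts` steps have been collected, and then only runs when the `updates_before_training` countdown reaches zero.

```python
# Illustrative gating loop (hypothetical values, not sheeprl's real config):
learning_starts = 1024      # environment steps collected before any training
train_every = 16            # train once every `train_every` policy steps
updates_before_training = train_every

for update in range(1, 5000):
    # ... collect one environment step into the replay buffer here ...
    updates_before_training -= 1
    if update >= learning_starts and updates_before_training <= 0:
        # ... run world-model / actor / critic optimization here ...
        updates_before_training = train_every
```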
-
Just letting you know, they keep walking in circles because you removed the circle thingy.
-
**Describe the bug**
Hi, everybody, I'm training a llama model in step3 using deepspeed-chat. In version 0.10.1, it raised the following error ([see logs below](https://github.com/microsoft/DeepSp…
-
### Issue
When attempting to run `ppo.py` to train the RL model using `cube_env.py` or the **Bimanual_Allegro_Cube** env, I get an _empty array error_ during Epoch 1 of the iteration loop in `ppo.…
-
Thank you for your great work! I ran the code, but the GPU does not seem to be used. Are there any parameters that need to be set? How can I train on the GPU?
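A minimal sketch of what is usually needed in a PyTorch training script (illustrative; the project's own device flag, if any, may differ):

```python
import torch

# Select the GPU when available and move both the model and the data onto it.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = torch.nn.Linear(10, 2).to(device)

batch = torch.randn(8, 10, device=device)   # inputs must be on the same device
out = model(batch)
print(next(model.parameters()).device)       # sanity check: expect "cuda:0"
```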
-
The scheduling flaw in OpenRLHF is that up to half of the GPUs can sit idle.
This is because we keep all of the models on the GPUs at the same time, which is a consequence of not yet having had time to implement fully asynchronous training.
So even if scheduling were optimized to the limit and the GPUs fully saturated, ignoring lower-level technical optimizations and looking at scheduling alone, the best we could do is roughly double the performance.
We provide a tuning guide: https://github.com/OpenLLMAI/OpenRLHF?tab=readme-ov-file…
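A back-of-envelope check of the "at most roughly double" claim (illustrative numbers, not measured OpenRLHF figures):

```python
# If up to half of the GPUs are idle, current utilization is at least 50%,
# so perfect scheduling can raise throughput by at most a factor of two.
idle_fraction = 0.5
current_utilization = 1.0 - idle_fraction
max_speedup_from_scheduling = 1.0 / current_utilization
print(max_speedup_from_scheduling)   # -> 2.0
```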
-
### What is the desired addition or change?
Simplify the creation of reinforcement learning agents in mlpack by having default values for common parameters, including network architectures and learni…
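As an illustration of the request (a Python sketch of the pattern only; mlpack itself is C++ and the names below are hypothetical, not its API): agents take a config object whose fields all have sensible defaults, so callers override only what they need.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class AgentConfig:
    hidden_sizes: Tuple[int, ...] = (64, 64)  # default network architecture
    learning_rate: float = 3e-4               # default optimizer step size
    discount: float = 0.99                    # default reward discount

def make_agent(env_name: str, config: Optional[AgentConfig] = None) -> dict:
    """Build an agent from defaults; callers override only what they need."""
    config = config or AgentConfig()
    return {"env": env_name, "config": config}

agent = make_agent("CartPole-v1")                                   # all defaults
tuned = make_agent("CartPole-v1", AgentConfig(learning_rate=1e-3))  # one override
```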