actor-critic Search Results

1000+ results
for actor-critic

GiacomoPracucci/RL-edge-computing #1

the paper corresponding to this code

Hi author, where can I view the paper corresponding to this code?

SuperLuckyStar666 updated 5 months ago
1
Zhehui-Huang/quad-swarm-rl #51

Model size mismatch

"I downloaded a model (Multi drone without obstacles) from the following URL for testing: https://huggingface.co/andrewzhang505/quad-swarm-rl-multi-drone-no-obstacles/tree/main. When I executed th…

CowFromSpace updated 7 months ago
1
opendilab/DI-engine #791

get "TypeError: __init__() got an unexpected keyword argumen…

root@I196082a51d0070168c:/hy-tmp/DI-engine/dizoo/smac/config# python3 -u smac_5m6m_masac_config.py [04-15 20:32:26] WARNING If you want to use numba to speed up segment tree, please install numba f…

SiriusZbz updated 7 months ago
2
OpenSPG/KAG #23

ZeroDivisionError: float division by zero

buildKB successfully for /teamspace/studios/this_studio/KAG/kag/examples/hotpotqa/builder/./data/hotpotqa_sub_corpus.json parallelQaAndEvaluate completing: 0%| | 0/2 [00:00

PimelY567 updated 3 weeks ago
3
OpenRLHF/OpenRLHF #275

CUDA out of memory when i run train_ppo_llama_ray.sh on 4 RT…

My configuration: `ray job submit --address="http://127.0.0.1:8265" \ --runtime-env-json='{"working_dir": "/openrlhf", "pip": "/openrlhf/requirements.txt"}' \ -- python3 examples/train_ppo_…

libowen424 updated 7 months ago
2
araffin/sbx #44

[Bug] TQC Hyperparameter optimization: Results do not match …

### 🐛 Bug Hi, When I try to run TQC hyperparameter optimization with multiple jobs (n-jobs>1) with a GPU (this also happens with multiple CPU cores and n-jobs=1), it gives me this error: ``` …

edmund735 updated 4 months ago
2
Lightning-AI/pytorch-lightning #17799

Lightning CLI fails to start in virtual environment followin…

### Bug description This issue is apparent when attempting to run the following tutorial: https://lightning.ai/pages/community/tutorial/how-to-train-reinforcement-learning-model-to-play-game-using-…

Kaszanas updated 3 months ago
10
OpenRLHF/OpenRLHF #277

Out-of-memory problem

When training a 13B model with PPO, memory usage is extremely high. How should I resolve this?

burger-pb updated 4 months ago
3
DLR-RM/stable-baselines3 #1919

[Question] How to access to rollout (logger) data in callbac…

### ❓ Question I'm using a `custom gym env` with multi envs, and I want to write a customized callback function related to `StopTrainingOnRewardThreshold`, one difference is **I shall use `"rollout/e…

JaimeParker updated 6 months ago
2
DigiRL-agent/digirl #8

Why use Auto-UI instead of larger VLM?

Why do you only train Auto-UI? Auto-UI seems to me a traditional RL policy model, not an LLM/VLM agent.

SiyuanMaCS updated 4 months ago
7
