-
Hi, I am new to tianshou and RL. I created an env and ran PPO from tianshou on it, but I found that the sampled actions are out of range. After searching, I found `map_action`, but it seems not used in tr…
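For context on what `map_action` does: when action scaling is enabled on a tianshou policy, the raw network output (roughly in [-1, 1]) is rescaled into the env's `action_space` bounds before being passed to `step`. A minimal pure-Python sketch of that scale-and-clip mapping (the function name and defaults here are illustrative, not tianshou's exact code):

```python
def map_action(raw_action, low, high, bound_method="clip"):
    """Map a raw policy output (roughly in [-1, 1]) into [low, high].

    Sketch of the usual scale-and-clip scheme; tianshou applies a similar
    transform when action_scaling is enabled on the policy.
    """
    if bound_method == "clip":
        # clip the raw output to [-1, 1] first
        raw_action = max(-1.0, min(1.0, raw_action))
    # linear rescale from [-1, 1] to [low, high]
    return low + (raw_action + 1.0) * 0.5 * (high - low)

print(map_action(0.0, -2.0, 2.0))   # midpoint of the env's range
print(map_action(3.0, -2.0, 2.0))   # out-of-range raw action gets clipped
```

Note that the mapping is applied when actions are sent to the env, so the raw (unmapped) actions are what the policy stores and trains on.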
-
Error info:
File "/opt/conda/lib/python3.8/site-packages/deepspeed/runtime/hybrid_engine.py", line 99, in new_inference_container
File "/opt/conda/lib/python3.8/site-packages/deepspeed/module_…
-
The training configuration is as follows:
```
--ref_num_nodes 1 --ref_num_gpus_per_node 2 --reward_num_nodes 1 --reward_num_gpus_per_node 2 --critic_num_nodes 1 --critic_num_gpus_per_node 4 --actor_num_nodes 2 --actor_num_gpus_per_n…
```
-
Hi team, I am getting the following error while enabling 4-bit quantization and LoRA:
```
File "/root/miniconda3/envs/open/lib/python3.11/site-packages/deepspeed/runtime/engine.py", line 262, in __init__
self._c…
```
-
Currently, `week5_policy_based/practice_a3c.ipynb` has numerous problems.
* It does not implement A3C. It is a plain actor-critic.
* We only have it in TensorFlow, since it does not have a corresp…
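To make the first point concrete: the defining feature of A3C, as opposed to a plain actor-critic, is several asynchronous workers, each collecting its own rollouts and applying gradient updates to a shared parameter set. A toy sketch of that asynchronous update pattern (the "gradient" here is a dummy random scalar, just to show the structure, not a real policy gradient):

```python
import threading
import random

shared_params = {"w": 0.0}   # globally shared parameters
lock = threading.Lock()      # real A3C is typically lock-free (Hogwild-style);
                             # a lock keeps this toy example safe and simple

def worker(worker_id, n_steps):
    rng = random.Random(worker_id)
    for _ in range(n_steps):
        # each worker computes its own "gradient" from its own rollout...
        grad = rng.uniform(-1.0, 1.0)
        # ...and applies it directly to the shared parameters, asynchronously
        with lock:
            shared_params["w"] += 0.01 * grad

threads = [threading.Thread(target=worker, args=(i, 100)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(shared_params["w"])  # result of 400 asynchronous updates
```

A notebook that runs a single worker on a single environment copy is an ordinary (synchronous) actor-critic, whatever it is titled.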
-
I am conducting reinforcement learning for a robot using rsl_rl and Isaac Lab. While it works fine with simple settings, when I switch to more complex settings (such as domain randomization), the foll…
-
In the file PPO_model.py, `forward` is empty, so how can everything work through the `evaluate` function? I really don't get it. In that case, how are the actor and critic inside the HGNNScheduler network trained?
The `evaluate` function uses the actor and critic, so what is the meaning of the actor network? Its output is initialized with only 1 dimension, so how does it output a distribution over actions? Is it through the…
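One common pattern in scheduling models like this (a guess at the design, since I have not verified this repo's exact code) is that the actor head outputs a single scalar score per candidate action, e.g. per operation-machine pair, and the distribution comes from a softmax over those per-candidate scores rather than from a multi-dimensional head. A minimal sketch:

```python
import math

def action_distribution(scores):
    """Turn one scalar score per candidate action into a categorical
    distribution via a numerically stabilized softmax."""
    m = max(scores)                              # subtract max for stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

# e.g. the actor scored three candidate operation-machine pairs:
probs = action_distribution([2.0, 1.0, 0.1])
print(probs)       # higher score -> higher probability; probabilities sum to 1
```

Under this design the 1-dimensional output is a score, not an action, and the number of candidates (and hence the distribution's support) can vary per step.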
-
The actor loss is defined as ([Line 164](https://github.com/openai/baselines/blob/699919f1cf2527b184f4445a3758a773f333a1ba/baselines/ddpg/ddpg.py#L164)):
```python
self.actor_loss = -tf.reduce_mea…
```
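In words, the DDPG actor loss is the negative of the critic's mean Q-value at the actor's own actions, so minimizing it pushes the actor toward actions the critic rates highly. A framework-free numeric sketch of the same quantity (the linear actor and quadratic critic here are toy stand-ins, not DDPG's networks):

```python
def actor(state, w):
    # toy deterministic policy: action = w * state
    return w * state

def critic(state, action):
    # toy Q-function that prefers actions equal to the state value
    return -(action - state) ** 2

def actor_loss(states, w):
    """-mean_s Q(s, pi(s)) -- the shape of the DDPG actor objective."""
    qs = [critic(s, actor(s, w)) for s in states]
    return -sum(qs) / len(qs)

states = [1.0, 2.0, 3.0]
print(actor_loss(states, w=1.0))  # actions match states exactly: minimal loss
print(actor_loss(states, w=0.5))  # worse policy: strictly higher loss
```

The minus sign is the whole trick: gradient *descent* on `-Q` is gradient *ascent* on `Q`.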
-
There are recurrent (LSTM) policy options for sb3 (e.g. [RecurrentPPO](https://github.com/Stable-Baselines-Team/stable-baselines3-contrib/blob/master/sb3_contrib/ppo_recurrent/ppo_recurrent.py)). It w…
-
Hello, I am trying to use the SAC agent and resume training. To do that, I do:
```
def load_model(actor_path, critic_path, optimizer_actor_path, optimizer_critic_path, optimizer_alpha_path):
    po…
```
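For reference, the general resume-training pattern (a framework-free sketch; with PyTorch you would use `state_dict()`/`load_state_dict()` and `torch.save`/`torch.load` rather than the pickle used here) is to checkpoint the actor, the critic, and *every* optimizer's state together, and restore all of them before continuing, so networks and optimizers stay in sync:

```python
import os
import pickle
import tempfile

def save_checkpoint(path, actor, critic, optim_states):
    # bundle networks and optimizer states in one file so they cannot drift apart
    with open(path, "wb") as f:
        pickle.dump({"actor": actor, "critic": critic, "optim": optim_states}, f)

def load_checkpoint(path):
    with open(path, "rb") as f:
        ckpt = pickle.load(f)
    return ckpt["actor"], ckpt["critic"], ckpt["optim"]

# toy "parameters": plain dicts stand in for network/optimizer state dicts
actor = {"w": 0.3}
critic = {"v": -1.2}
optim_states = {"actor_opt": {"lr": 3e-4}, "critic_opt": {"lr": 3e-4},
                "alpha_opt": {"lr": 3e-4}}  # SAC also has the alpha optimizer

path = os.path.join(tempfile.mkdtemp(), "sac_ckpt.pkl")
save_checkpoint(path, actor, critic, optim_states)
actor2, critic2, optim2 = load_checkpoint(path)
print(actor2, optim2["alpha_opt"])
```

Forgetting the entropy-coefficient (alpha) optimizer state is a common cause of a performance dip right after resuming SAC.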