-
### ❓ Question
Can anyone explain how I can replace the default actor and critic networks with my own network?
Here is what I have done, step by step:
1. created a custom network
2. def …
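The steps above are truncated, but the general pattern can be sketched framework-agnostically: make the agent accept the networks as constructor arguments instead of building them internally. All names below (`Agent`, `default_mlp`, `my_network`) are illustrative assumptions, not the API of any specific library.

```python
# Hedged sketch: the agent takes actor/critic callables as arguments,
# so a custom network replaces the default one without editing the agent.

def default_mlp(x):
    # stand-in for the library's built-in default network (identity here)
    return x

class Agent:
    def __init__(self, actor=None, critic=None):
        # fall back to the default network when none is supplied
        self.actor = actor or default_mlp
        self.critic = critic or default_mlp

    def evaluate(self, obs):
        return self.actor(obs), self.critic(obs)

def my_network(x):
    # the user's custom network: here just a toy transform
    return [2 * v for v in x]

agent = Agent(actor=my_network)     # pass the custom network in
print(agent.evaluate([1.0, 2.0]))   # → ([2.0, 4.0], [1.0, 2.0])
```

The same dependency-injection idea is what most RL libraries expose through a `policy_kwargs`-style hook or a network-builder argument.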
-
Hi,
thanks for the amazing work on RL environments in JAX. I was wondering whether you have any plans to write Actor-Critic agents for this project?
-
If I understand the current PPO code correctly, it instantiates completely separate actor and critic models, with no layers shared between them. (But correct me if that is wrong.)
Instea…
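For contrast, here is a minimal plain-Python sketch (illustrative names only, not the project's code) of the alternative being asked about: a shared trunk whose features feed both an actor head and a critic head, so the early parameters receive gradients from both losses.

```python
# Sketch of a shared-trunk actor-critic, as opposed to two fully
# separate models. The toy "layers" are plain arithmetic for clarity.

def make_shared_actor_critic():
    trunk_w = 0.5    # shared parameters: updated by both heads' losses
    actor_w = 2.0    # actor-specific head
    critic_w = -1.0  # critic-specific head

    def trunk(obs):
        return [trunk_w * o for o in obs]

    def actor(obs):
        feats = trunk(obs)  # same features the critic sees
        return [actor_w * f for f in feats]

    def critic(obs):
        feats = trunk(obs)
        return sum(critic_w * f for f in feats)

    return actor, critic

actor, critic = make_shared_actor_critic()
obs = [1.0, 2.0]
print(actor(obs))   # → [1.0, 2.0]
print(critic(obs))  # → -1.5
```

Whether sharing helps is empirical: shared trunks save parameters and can speed learning, but the critic's value loss can interfere with the policy's features, which is presumably why the code keeps them separate.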
-
When running ppo_ray to train Qwen2 72B, errors are frequently raised.
![image](https://github.com/user-attachments/assets/b55ab8cc-c8fa-40ba-8aa2-4bed3938e756)
The key parameters of the launch script are below; the officially recommended Docker image is already in use:
ray job submit --address="http://127.0.0.1:8265" \
…
-
Also, is your code based on the paper with new modifications? The code involves A2C-like strategies that do not seem to be presented in the paper, which is a bit unclear to me. I hope you can help.
-
Actor and critic losses are very high and become NaN after a few training steps.
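A common mitigation for losses that explode and then turn NaN is clipping the gradients by their global norm before each update. The sketch below is a generic illustration of that technique, not the reporter's codebase or any particular library's API.

```python
import math

def clip_by_global_norm(grads, max_norm):
    # rescale the whole gradient vector if its L2 norm exceeds max_norm,
    # preserving the gradient direction while bounding the step size
    norm = math.sqrt(sum(g * g for g in grads))
    if norm > max_norm:
        scale = max_norm / norm
        grads = [g * scale for g in grads]
    return grads

grads = [30.0, 40.0]                    # global norm is 50
print(clip_by_global_norm(grads, 5.0))  # → [3.0, 4.0]
```

Most frameworks ship this as a built-in (e.g. a "clip grad norm" utility); lowering the learning rate or checking rewards/advantages for NaN inputs are the other usual first steps.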
@martiny76
-
Hello, thank you for your wonderful work.
I'm looking for a way to export the choreography for in-the-wild music in .fbx format.
For this, I have tried to convert the 3D positions into SMPL parameters, us…
-
Reduce duplication between similar Actors/Critics that differ only in their hidden layers, and generally improve the readability of the network-creation code.
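The deduplication suggested above can be sketched as a single factory parameterized by the parts that actually differ. The names below (`make_network`, `hidden_sizes`, `out_size`) are hypothetical, not the repository's actual code.

```python
# One builder for both actor and critic: only the hidden-layer sizes
# and the output size vary, so they become parameters instead of
# near-duplicate class definitions.

def make_network(hidden_sizes, out_size):
    layers = list(hidden_sizes) + [out_size]
    return {"layers": layers}

actor = make_network(hidden_sizes=[64, 64], out_size=4)     # e.g. 4 actions
critic = make_network(hidden_sizes=[256, 256], out_size=1)  # scalar value
print(actor["layers"])   # → [64, 64, 4]
print(critic["layers"])  # → [256, 256, 1]
```

The same parameterization works for real modules (an MLP class taking a list of layer widths); the readability win is that the actor/critic difference is visible at the call site rather than buried in two parallel class bodies.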
-
Fairseq contains many NMT models, but models trained with Reinforcement Learning are absent.
It would be great if that were added.
-