-
If I understand the current PPO code correctly, this instantiates completely separate actor and critic models, without any layers shared between them. (Please correct me if that is wrong.)
Instea…
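To make the distinction in the question concrete, here is a minimal pure-Python sketch contrasting the two layouts: fully separate actor and critic networks versus a shared trunk with two heads. It is a toy illustration, not the PPO code being discussed. All class names, helper functions, and layer sizes (`SeparateActorCritic`, `SharedActorCritic`, `obs_dim=4`, `hidden=8`, `n_actions=2`) are assumptions made up for the example.

```python
import random


def make_linear(n_in, n_out, rng):
    """Return (weights, bias) for a dense layer with small random weights."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    bias = [0.0] * n_out
    return weights, bias


def linear(layer, x):
    """Apply a dense layer to the input vector x."""
    weights, bias = layer
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(v):
    return [max(0.0, a) for a in v]


def count_params(*layers):
    """Count weights plus biases over the given layers."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in layers)


class SeparateActorCritic:
    """Hypothetical layout: actor and critic each own a full network."""

    def __init__(self, obs_dim, hidden, n_actions, rng):
        self.actor_body = make_linear(obs_dim, hidden, rng)
        self.actor_head = make_linear(hidden, n_actions, rng)
        self.critic_body = make_linear(obs_dim, hidden, rng)
        self.critic_head = make_linear(hidden, 1, rng)

    def forward(self, obs):
        logits = linear(self.actor_head, relu(linear(self.actor_body, obs)))
        value = linear(self.critic_head, relu(linear(self.critic_body, obs)))[0]
        return logits, value

    def num_params(self):
        return count_params(self.actor_body, self.actor_head,
                            self.critic_body, self.critic_head)


class SharedActorCritic:
    """Hypothetical layout: one shared trunk feeding a policy head and a value head."""

    def __init__(self, obs_dim, hidden, n_actions, rng):
        self.trunk = make_linear(obs_dim, hidden, rng)
        self.actor_head = make_linear(hidden, n_actions, rng)
        self.critic_head = make_linear(hidden, 1, rng)

    def forward(self, obs):
        features = relu(linear(self.trunk, obs))  # computed once, used by both heads
        logits = linear(self.actor_head, features)
        value = linear(self.critic_head, features)[0]
        return logits, value

    def num_params(self):
        return count_params(self.trunk, self.actor_head, self.critic_head)


rng = random.Random(0)
separate = SeparateActorCritic(obs_dim=4, hidden=8, n_actions=2, rng=rng)
shared = SharedActorCritic(obs_dim=4, hidden=8, n_actions=2, rng=rng)
print(separate.num_params(), shared.num_params())
```

In the shared layout the trunk is evaluated once per observation and receives gradients from both the policy and the value loss, which saves parameters and compute but couples the two objectives. The separate layout avoids that coupling at the cost of a second network.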
-
-
## Fix the model test for `soft_actor_critic.py`
1. Set up the environment according to [Run a model under torch_xla2](https://github.com/pytorch/xla/blob/master/experimental/torch_xla2/docs/support_a_new_model…
-
scripts/config/main/webrl.yaml:
defaults:
- default
- _self_
save_path: /workspace/WebRL/scripts/output
run_name: "webrl"
critic_lm# training
policy_lm: /workspace/WebRL/webrl-glm-4-9…
-
This bug seems interesting.
It occurs in Clothing Simulation > Mesh 1/2 > Inner radius + Top / Bottom Anchor Lock.
The Inner Radius seems to change the bug randomly: at some higher values the B…
-
**Describe the bug**
After loading the save for the initial D-Day section, during the landing craft ride, the audio hisses on the left channel, then on the right channel, before looping back to the left channel. …
-
# Use Case 579
## 3D model to store volumetric survey plans
As a user of GeoSPARQL data, I need a 3D model to accurately store information from volumetric survey plans and to conduct analyses fo…
-
TL;DR: The use of `context.performAndWait` can introduce deadlocks into programs which would not deadlock with normal actors.
## Background
I was excited to use this library to solve some threadin…
-
The model I use is GPT-2 124M. When resizing model embeddings during the training of SFT and RW, I often encounter issues where the generated answers consist entirely of zeros. This causes both the lo…
-
For example, the reward model runs on 8 GPUs with a TP and DP configuration, while the actor model might use TP, PP, and DP (just as an example) across 64 GPUs.
How do you connect the output of the actor's last stage to reward m…