-
I'm working on a DRL framework using the PPO agent with Torch and noticed a difference in how observation spaces are handled. The example in the [documentation](https://xuance.readthedocs.io/en/late…
-
Hello everyone, I am deploying "SERL" in the real world and have fully followed the instructions on the webpage for hardware and software deployment.
I first want to reproduce task 1: peg insertion.
Here …
-
```
deepspeed ./train_ppo.py \
--pretrain OpenLLMAI/Llama-2-7b-sft-model-ocra-500k \
--reward_pretrain OpenLLMAI/Llama-2-7b-rm-anthropic_hh-lmsys-oasst-webgpt \
--save_path ./ckpt/7b_l…
```
-
Hi,
I notice you cite "70B+ Full Tuning with 16 A100"; however, this is also something that trlX supports via NeMo (and that we worked very hard to add ;) ). Similarly, this is something that …
-
# When I modify and run this code, new problems appear. It seems to be because the code is incomplete. Did I miss something?
Run `python train/grasp_decision.py`
## error1
```
03/06/2024 10:39:0…
```
-
Fifty thousand words, huh? I do fear that the plot will begin to suffer partway through no matter _how_ cleverly I code, but I'll give it a whirl.
-
**Describe the bug**
'ValueError: Expected parameter loc (Tensor of shape (32880, 12)) of distribution Normal(loc: torch.Size([32880, 12]), scale: torch.Size([32880, 12])) to satisfy the constraint R…
-
### ❓ Question
Hi, everyone
I would like to set a learning rate and scheduler for the feature extractor that differs from those of the actor-critic networks. Is there a way to do this?
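
One general way to do this in plain PyTorch is to use optimizer parameter groups together with a scheduler that takes one function per group. The sketch below is only an illustration under stated assumptions: the `feature_extractor`, `actor`, and `critic` modules, the layer sizes, the learning rates, and `total_updates` are all hypothetical, and how a particular RL library builds or exposes its optimizer is not covered here.

```python
import torch
from torch import nn

# Hypothetical stand-ins for the feature extractor and the actor-critic heads.
feature_extractor = nn.Sequential(nn.Linear(8, 32), nn.ReLU())
actor = nn.Linear(32, 4)
critic = nn.Linear(32, 1)

# One optimizer, two parameter groups, each with its own base learning rate.
optimizer = torch.optim.Adam(
    [
        {"params": feature_extractor.parameters(), "lr": 1e-4},
        {"params": list(actor.parameters()) + list(critic.parameters()), "lr": 3e-4},
    ]
)

total_updates = 100  # assumed training length, for illustration only

# LambdaLR accepts a list with one multiplier function per parameter group.
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer,
    lr_lambda=[
        lambda step: 1.0 - step / total_updates,  # linear decay for the extractor
        lambda step: 1.0,                         # constant rate for actor-critic
    ],
)

# Dummy training loop on random data, just to advance both schedules.
for _ in range(10):
    optimizer.zero_grad()
    features = feature_extractor(torch.randn(16, 8))
    loss = actor(features).pow(2).mean() + critic(features).pow(2).mean()
    loss.backward()
    optimizer.step()
    scheduler.step()

extractor_lr, actor_critic_lr = (group["lr"] for group in optimizer.param_groups)
```

After the loop, `extractor_lr` has decayed from its base value while `actor_critic_lr` is unchanged, since each group follows its own multiplier function.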
### Chec…
-
The depth data is not included in the data set you shared, but this error message appeared during the training process.
Training started from: 2020-09-03 10:23:59
Scene Data Exists!
initialized o…
-
- [x] I have marked all applicable categories:
    + [x] exception-raising bug
    + [x] RL algorithm bug
    + [ ] documentation request (i.e. "X is missing from the documentation.")
    + [ ] ne…