-
This project is great and the dataset is unique. To help the community, it would be a great idea to support PEFT training on this dataset. Also, there's a chance to increase the training to …
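As a rough illustration of why PEFT support matters, here is a minimal sketch of the LoRA-style parameter-count arithmetic (numbers are illustrative, not taken from this project): instead of training a full weight matrix, only a low-rank factored update is trained.

```python
# Hedged sketch: why PEFT methods such as LoRA shrink the trainable-parameter
# count. The dimensions and rank below are illustrative assumptions.

def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable params when W (d_out x d_in) is frozen and the update
    is factored as B @ A, with B: (d_out x rank) and A: (rank x d_in)."""
    return d_out * rank + rank * d_in

full = 4096 * 4096                                  # full fine-tuning of one matrix
lora = lora_trainable_params(4096, 4096, rank=8)    # low-rank update only
print(full, lora, full // lora)                     # LoRA trains ~1/256 of the params here
```

In practice this is what libraries like Hugging Face `peft` automate across all target modules; the sketch only shows the per-matrix arithmetic.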
-
Here is the error I encountered. It seems that `self._total_batch_size` is `None`, but I don't know the reason:
```
File "/path/model_training/DeepSpeed-Chat/training/step3_rlhf_finetuning/main.py", lin…
```
-
Use the Llama 2 model and train on the latest, more efficient open datasets (e.g. SlimPajama vs. RedPajama)?
Just the base model; then maybe the Open-Assistant team can apply RLHF to it.
-
Hello! Has anyone encountered the following bug when using zero_stage3 for Llama 2?
```
step3_rlhf_finetuning/rlhf_engine.py:61 in __init__ │
│ …
```
-
Do you use LoRA for the step3 RLHF benchmarks in https://github.com/microsoft/DeepSpeedExamples/blob/master/applications/DeepSpeed-Chat/training/step3_rlhf_finetuning/BenckmarkSetting.md ? Or are you …
-
And how do I install alpaca-rlhf?
-
### Please ask your question
Will training methods for the PPO and reward models be developed later?
-
**System Info:**
Memory: 500 GB
GPU: 8 × A100 80 GB
Question:
**Why does initializing DeepSpeedRLHFEngine with multiple GPUs use much more memory than with a single GPU?**
**Reproduce:**
Copy mode…
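One plausible explanation (an assumption, not a confirmed diagnosis of this issue): if every rank on a node materializes a full copy of the model weights during engine init, before ZeRO-3 partitions them, peak memory during init scales with the number of ranks. A back-of-envelope sketch, with an assumed model size and fp16 weights:

```python
# Hedged back-of-envelope sketch (assumptions, not measurements): per-node
# peak memory during init if each rank holds a full weight copy before
# ZeRO-3 partitioning kicks in.

def init_peak_gib(n_params: float, n_ranks: int, bytes_per_param: int = 2) -> float:
    """Peak memory in GiB if each of n_ranks holds a full copy of the
    weights (bytes_per_param = 2 assumes fp16) during initialization."""
    return n_params * bytes_per_param * n_ranks / 2**30

# Illustrative 7B-parameter model (an assumed size, not from the report):
print(round(init_peak_gib(7e9, 1), 1))  # single-GPU init
print(round(init_peak_gib(7e9, 8), 1))  # 8 ranks on one node
```

If this is the cause, loading with a ZeRO-3-aware initialization path (so weights are partitioned as they are created, rather than replicated per rank) would be the usual mitigation.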
-
# URL
- https://arxiv.org/abs/2203.02155
# Affiliations
- Long Ouyang, N/A
- Jeff Wu, N/A
- Xu Jiang, N/A
- Diogo Almeida, N/A
- Carroll L. Wainwright, N/A
- Pamela Mishkin, N/A
- Chong …
-
The following error occurred while running cell 10 in **6. Tune language model using PPO with our preference model**.
After adding `__init__.py` to `/content/trlx/examples/summarize_rlhf/reward_model…
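For context on the `__init__.py` step, here is a self-contained sketch of what adding that file does: it marks a directory as a regular Python package (rather than an implicit namespace package), which some tools and relative imports require. The package and module names below are hypothetical, not the actual trlx layout.

```python
# Hedged sketch with hypothetical names: making a directory importable
# as a regular package by adding __init__.py.
import importlib
import sys
import tempfile
from pathlib import Path

pkg_root = Path(tempfile.mkdtemp())
pkg_dir = pkg_root / "reward_model_pkg"      # hypothetical package name
pkg_dir.mkdir()
(pkg_dir / "reward_model.py").write_text("SCORE = 42\n")
(pkg_dir / "__init__.py").touch()            # mark directory as a regular package

sys.path.insert(0, str(pkg_root))            # make pkg_root importable
mod = importlib.import_module("reward_model_pkg.reward_model")
print(mod.SCORE)                             # 42
```

Note that on Python 3 the import would also resolve without `__init__.py` via PEP 420 namespace packages, but adding the file is the conventional fix when a library or notebook expects a regular package.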