-
**Describe the bug**
DeepSpeed-Chat RLHF crashes on ROCm/AMD when I use bf16 (fp16 works on AMD, and both work on NVIDIA). This seems to be because `enable_bf16` is never set in op_builder/bui…
-
Hi, @xujz18 @Xiao9905
Thanks for this nice contribution. I noticed that we can load ImageReward data with:
`datasets.load_dataset("THUDM/ImageRewardDB", "8k")`
However, the loaded data seem to…
-
Type: Bug
I have a directory containing about 100,000 files, 3 TB in total.
When I right-click it and select "Delete Permanently", the VS Code terminal hangs. Later, the Remote SSH connection breaks. I then r…
-
### Describe the bug
When using `Dataset.to_json()`, an unexpected error occurs if `lines=False` is set. The stored data should be in the form of a list, but it actually tur…
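
For reference, here is a minimal sketch of the two layouts in question, written with only the standard library rather than `datasets` itself. The helper names `to_json_lines` and `to_json_array` are hypothetical and only illustrate what I would expect each mode to produce: one JSON object per line versus a single JSON list.

```python
import json

# Two example rows standing in for a small dataset (made-up data).
records = [{"id": 1, "text": "a"}, {"id": 2, "text": "b"}]


def to_json_lines(rows):
    """Serialize rows as JSON Lines: one JSON object per line."""
    return "".join(json.dumps(row) + "\n" for row in rows)


def to_json_array(rows):
    """Serialize rows as one JSON document whose top level is a list."""
    return json.dumps(rows)


lines_output = to_json_lines(records)
array_output = to_json_array(records)

# JSON Lines has to be parsed line by line.
parsed_lines = [json.loads(line) for line in lines_output.splitlines()]

# The array form parses in a single call and yields a Python list.
parsed_array = json.loads(array_output)
```

In the list form, `json.loads` on the whole file should succeed and return a list with one element per row.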
-
When training the PPO model, I turned on `gradient_checkpointing_enable`. If the ptx loss is computed, the actor runs two forward passes. In your code, these two losses are backpropagated once se…
-
> [rank0]: Traceback (most recent call last):
> [rank0]: File "/opt/tmp/nlp/wzh/LLM-Dojo/rlhf/rloo_train.py", line 167, in
> [rank0]: trainer.train()
> [rank0]: File "/home/nlp/miniconda3/…
-
`AutoModelForCausalLM` has no class for ChatGLM. How did you work around that?
-
- [ ] Why does the author compare RLAIF with RLHF only on the summarization task?
- [ ] How does it perform on other tasks?
- [ ] For 4.1 Datasets, what other methods does OpenAI use to filter the data?
- …
-
Can this approach be used to create a nano-sized `text-davinci-003`?
-
### 🚀 The feature, motivation, and pitch
I think trlx should support the XGLM model for PPO training, because XGLM supports 134 languages in the [XGLM-4.5B model](https://huggingface.co/facebook/xglm-4.5B)…