-
Hi, two quick questions:
1. From Algorithm 1 in the paper, I get the sense that the algorithm can work in an online divide-and-conquer manner with an updated model, and I am just curious when the self-feedback co…
-
**Describe the bug**
I get the following error just by changing the model from `llava1_6-mistral-7b-instruct` to `glm4v-9b-chat` in the first DPO example [here](https://github.com/modelscope/ms-swi…
-
## Why
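The Recommendation & Machine Learning Study Group is a gathering of people motivated to improve services through recommendation, machine learning, and related technologies. Let's share whatever catches our interest with each other, from news and blog posts to papers!
For outreach, this is kept **public**. If you would like to join from outside, either send a DM to Higuchi (https://twitter.com/zerebom_3) or …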
-
### Description
Figure 1 is badly rendered.
### (Optional:) Please add any files, screenshots, or other information here.
_No response_
### (Required) What is this issue most closely related to? Sele…
-
### Search before asking
- [X] I have searched the YOLOv8 [issues](https://github.com/ultralytics/ultralytics/issues) and found no similar bug report.
### YOLOv8 Component
_No response_
### Bug
…
-
### System Info
transformers version: 4.35.2
Platform: Linux-5.15.0-1050-aws-x86_64-with-glibc2.31
Python version: 3.10.12
Huggingface_hub version: 0.20.2
Safetensors versio…
-
Is it possible for validation metrics to return non-boolean answers?
This could bring a lot more control over the optimization strategy, e.g. loss weighting for different metrics.
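To make the request concrete, here is a minimal sketch of what a graded metric could look like. The names `exact_match`, `token_overlap`, and `weighted_metric`, as well as the weights, are hypothetical and not part of any existing API; the sketch only assumes the optimizer could consume a float score instead of a boolean.

```python
from typing import Dict

# Hypothetical sub-metrics: each returns a score in [0, 1] rather than a bool.
def exact_match(pred: str, gold: str) -> float:
    return 1.0 if pred.strip().lower() == gold.strip().lower() else 0.0

def token_overlap(pred: str, gold: str) -> float:
    # Fraction of gold tokens that also appear in the prediction.
    pred_tokens = set(pred.lower().split())
    gold_tokens = set(gold.lower().split())
    if not gold_tokens:
        return 0.0
    return len(pred_tokens & gold_tokens) / len(gold_tokens)

# Registry of the illustrative sub-metrics above.
SUB_METRICS = {"exact_match": exact_match, "token_overlap": token_overlap}

def weighted_metric(pred: str, gold: str, weights: Dict[str, float]) -> float:
    """Combine sub-metric scores into one float, normalized by total weight."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    score = sum(w * SUB_METRICS[name](pred, gold) for name, w in weights.items())
    return score / total
```

With something like this, a partially correct answer would receive partial credit instead of a flat `False`, and the weights could be tuned per metric.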
Another idea i…
-
Will the RLAIF training details be open-sourced? Thanks.
-
### 🐛 Describe the bug
I'm running the following deepspeed command for fine-tuning in my venv:
`deepspeed trainer_sft.py --configs llama2-7b-sft-RLAIF --wandb-entity tammosta --show_dataset_stats --dee…