-
### Required prerequisites
- [X] I have read the documentation.
- [X] I have searched the [Issue Tracker](https://github.com/PKU-Alignment/align-anything/issues) and [Discussions](https://github.com…
-
When creating a PEFT model and then trying to train it, we get an error:
```
File "/scratch/gpfs/ashwinee/unsloth/unsloth/kernels/fast_lora.py", line 106, in backward
d_do…
```
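For context, a minimal sketch of the setup that triggers this, assuming the standard unsloth LoRA workflow (the checkpoint name and LoRA hyperparameters below are placeholders, not taken from the report):

```python
# Hedged repro sketch of the usual unsloth PEFT setup; checkpoint name
# and LoRA hyperparameters are placeholders.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/llama-3-8b-bnb-4bit",  # placeholder checkpoint
    max_seq_length=2048,
    load_in_4bit=True,
)

# get_peft_model attaches LoRA adapters; the backward pass through the
# fused LoRA kernels (fast_lora.py) is where the traceback above points.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_alpha=16,
)
```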
-
It looks like llama.cpp now [supports openbmb/MiniCPM-Llama3-V-2_5](https://github.com/ggerganov/llama.cpp/pull/7599).
Here's the [official gguf.](https://huggingface.co/openbmb/MiniCPM-Llama3-V-2_…
-
Training DPO with a custom data format raises an error during the Map step:
File "ms-swift/swift/trainers/dpo_trainer.py", line 114, in tokenize_row
if len(answer_tokens['prompt_input_ids']) + longer_response_length > self.max_length:
KeyError: '…
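A KeyError in tokenize_row usually means the mapped rows are missing a column the trainer expects. As a hedged reference, this is the TRL-style preference layout that DPO trainers are typically built around (the field names are the TRL convention, not confirmed against ms-swift's custom-format docs):

```python
# Hedged sketch: the preference-pair row that TRL-style DPO tokenization
# typically expects. A custom format that does not map onto these fields
# can surface later as a missing-key error like the one above.
row = {
    "prompt":   "What is the capital of France?",
    "chosen":   "The capital of France is Paris.",
    "rejected": "France does not have a capital.",
}
```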
-
Hi, I am getting this error when loading the DPO dataset; does anyone know how to resolve it? Thank you!
I get this error even though my pandas version is 2.2.2:
> >>> pd.read_parquet("code/eagle-dev/R…
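One way to narrow this down (a hedged suggestion, and the path below is a placeholder): read the file with pyarrow directly, which usually gives a clearer error than the pandas wrapper when the parquet footer or schema is the problem:

```python
# Hedged diagnostic sketch: bypass pandas and read the parquet file with
# pyarrow directly to see whether the file or the pandas conversion fails.
import pyarrow.parquet as pq

table = pq.read_table("path/to/dataset.parquet")  # placeholder path
print(table.schema)      # inspect column names and types
df = table.to_pandas()   # convert only once the raw read succeeds
```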
-
RLAIF-V: Aligning MLLMs through Open-Source AI Feedback for Super GPT-4V Trustworthiness
https://arxiv.org/abs/2405.17220
-
https://github.com/RLHF-V/RLAIF-V/blob/main/muffin/data/data_processors.py#L97
The function at this line is never loaded or defined.
Also, the gather_data_files_by_glob function may not match the parquet format of o…
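For reference, a minimal sketch of what such a glob-based gatherer might look like; the function name comes from the linked file, but the body below is an assumption, not the repository's implementation:

```python
# Hedged sketch of a glob-based file gatherer; the name matches the one
# referenced in data_processors.py, the body is an assumption.
import glob
import os

def gather_data_files_by_glob(data_dir: str, pattern: str = "*.parquet"):
    """Return all files under data_dir matching pattern, sorted for
    deterministic ordering."""
    files = sorted(glob.glob(os.path.join(data_dir, pattern)))
    if not files:
        raise FileNotFoundError(f"no files matching {pattern} in {data_dir}")
    return files
```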
-
I did a DPO fine-tuning using the default MP command provided [here](https://github.com/modelscope/ms-swift/blob/main/docs/source_en/Multi-Modal/human-preference-alignment-training-documentation.md#dp…
-
Thank you very much for open-sourcing this. I have a question:
![image](https://github.com/RLHF-V/RLAIF-V/assets/30074778/e27abcdd-26a0-4938-9647-cf4f3dd53613)
Are fields like ref_win_logp precomputed values stored in the annotations? I don't seem to see them in RLAIF-V-Dataset. Is there any directly usable data to reference? Thanks!
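If they are not shipped with the dataset, a hedged sketch of how per-response reference log-probs like ref_win_logp are typically precomputed: sum the frozen reference model's token log-probs over the response positions only (the function below is an illustration, not RLAIF-V's code):

```python
# Hedged sketch: sequence log-prob under a frozen reference model, the
# usual way fields like ref_win_logp are precomputed. Illustrative only.
import torch
import torch.nn.functional as F

@torch.no_grad()
def sequence_logp(model, input_ids, response_mask):
    """Sum of log p(token | prefix) over response positions only."""
    logits = model(input_ids).logits[:, :-1]   # position t predicts token t+1
    labels = input_ids[:, 1:]
    logps = F.log_softmax(logits, dim=-1)
    token_logps = logps.gather(-1, labels.unsqueeze(-1)).squeeze(-1)
    return (token_logps * response_mask[:, 1:]).sum(-1)  # mask out the prompt
```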
-
Hi, two quick questions:
1. From Algorithm 1 in the paper, I get the sense that the algorithm can work in an online divide-and-conquer manner with the updated model, and I am just curious when the self-feedback co…