-
**Describe the bug**
When using Flash Attention (`--use-flash-attention true`) to train a Qwen2VL model on mixed data (both image and text data), the code yields the following error:
```
[rank0]: …
```
-
### Describe the bug
I encountered a `ValueError: optimizer got an empty parameter list` when training the projector of a LLaVA-architecture LMM with the pipeline parallel size set to **2 or hi…
-
Hi,
Thank you for your great work!
I've been trying to use the Phi-3-Instruct-4B VLM models, but encountered several issues:
- Incorrect LLM backbone choice in phi.py:
https://github.com/R…
-
Will the MMBench test set score drop after DPO? Does this repo support DPO without loading a separate reward model?
-
### What happened?
I can't use Docker + SYCL when `-ngl` is greater than 0.
With `-ngl 0` it works.
Error message:
No kernel named _ZTSZZL17rms_norm_f32_syclPKfPfiifPN4sycl3_V15queueEiENKUlRNS3_7handlerEE0_c…
-
Hi,
Thanks for sharing the code. I'm using it to fine-tune on videos by freezing the visual encoder and projector, and tuning the LLM. Initially, everything works well, but as training progresses, …
-
# URL
- https://arxiv.org/abs/2310.11716v1
# Affiliations
- Ming Li, N/A
- Lichang Chen, N/A
- Jiuhai Chen, N/A
- Shwai He, N/A
- Heng Huang, N/A
- Jiuxiang Gu, N/A
- Tianyi Zhou, N/A
# …
-
When running `sh train.sh pre_train.py` on 8 GPUs, I get the following output:
Saving model checkpoint to ./model_save/pre/tmp-checkpoint-50
Configuration saved in ./model_save/pre/tmp-checkpoint-50/config.json
Configuration saved …
-
This is a tentative roadmap for major improvements to fast LLM. It includes big features and potential breaking changes, but excludes minor features and additions.
It goes in several parts, with th…
-
I have been trying to integrate the DAC codes into LLM training. However, I encountered challenges in achieving satisfactory predictions with LLMs, such as VALLE. Has anyone, including the authors, su…