-
More context: https://github.com/kubeflow/training-operator/pull/2031#discussion_r1526533371.
Currently, we apply [HuggingFace Data Collator](https://huggingface.co/docs/transformers/en/main_classes/…
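For reference, a minimal sketch of what wiring up a HuggingFace data collator for causal-LM batches can look like (the model/tokenizer name here is a placeholder, not taken from the linked PR):

```python
# Minimal sketch of a HuggingFace data collator for causal-LM fine-tuning.
# The tokenizer name is a placeholder, not taken from the linked PR.
from transformers import AutoTokenizer, DataCollatorForLanguageModeling

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # gpt2 has no pad token by default

# mlm=False -> causal LM: labels are the input ids, shifted inside the model.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

batch = collator([tokenizer("hello world"), tokenizer("a longer example sentence")])
print(batch["input_ids"].shape, batch["labels"].shape)  # padded to a common length
```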
-
In the demos I’ve seen of Leon AI, it appeared rather slow. I have no idea whether this was a limitation of the hardware or whether there were inefficiencies that might be improved upon. [GPT4All](https://github.c…
-
I am trying to fine-tune llama3-70B on trn1.32xlarge using distributed training. It failed with the following error:
Container image: f"763104351884.dkr.ecr.{region}.amazonaws.com/pytorch-training-neur…
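The image URI pattern suggests an AWS Deep Learning Container launched through SageMaker; a minimal sketch of such a launch follows (the entry point, repository tag, and instance count are assumptions for illustration, not taken from the original report):

```python
# Sketch of launching distributed training on Trainium via SageMaker.
# Entry point, image tag, and instance count are assumptions for illustration.
import sagemaker
from sagemaker.pytorch import PyTorch

region = sagemaker.Session().boto_region_name
# Repository name and tag assumed; the original report truncates the URI.
image_uri = f"763104351884.dkr.ecr.{region}.amazonaws.com/pytorch-training-neuronx:latest"

estimator = PyTorch(
    entry_point="train.py",               # hypothetical training script
    role=sagemaker.get_execution_role(),
    image_uri=image_uri,
    instance_type="ml.trn1.32xlarge",
    instance_count=1,
    distribution={"torch_distributed": {"enabled": True}},  # torchrun launcher
)
estimator.fit()
```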
-
I followed the guide here and used the same arguments:
https://epfllm.github.io/Megatron-LLM/guide/getting_started.html
When training, I set:
LOG_ARGS="--log_interval 1 --save_interval 100 --eval_interval 50"
…
-
I noticed the agent was working on the code and the file got bigger and bigger, say 3k, then 6k, then 9k.
...Then at some point the file was 2 or 3k again. **Basically the agent just deleted everyth…
-
### System Info
- GPU: L40S
- TensorRT-LLM: 0.11.0.dev2024060400
- CUDA: cuda_12.4.r12.4/compiler.34097967_0
- Driver: 535.129.03
- OS: Ubuntu 22.04.4 LTS (Docker)
-
### Who…
-
Self-supervised pre-training is the mechanism through which LLMs are pre-trained; the same objective can also be used for fine-tuning when no instruction datasets or human-preference data are available, only raw text documents.
…
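Concretely, the self-supervised objective is just next-token prediction over raw text; a minimal sketch with HuggingFace (the model name is a placeholder):

```python
# Next-token prediction on raw text: the self-supervised objective used for
# both pre-training and instruction-free fine-tuning. Model name is a placeholder.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

text = "Self-supervised training needs only raw documents, no labels."
inputs = tokenizer(text, return_tensors="pt")

# Passing labels=input_ids makes the model compute the shifted cross-entropy
# loss internally (predict token t+1 from tokens <= t).
out = model(**inputs, labels=inputs["input_ids"])
out.loss.backward()  # one self-supervised training step (before optimizer.step())
```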
-
Are there any success stories of LLM training using GGML?
-
ROCm/triton, ROCm/flash-attention, or the fmha Composable Kernel (CK) implementation?
-
Ollama - local models on your machine
https://youtu.be/Ox8hhpgrUi0?si=LxpAd1n29InncB78
Open-weight models
- Llama3
- Mistral 7B v0.3
Use cases:
- interactive vs non-interactive
- local RAG…
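
Following up on these notes, a minimal sketch of querying a locally running Ollama server over its REST API (assumes the llama3 model has already been pulled, e.g. via `ollama run llama3`, and that Ollama is listening on its default port 11434):

```python
# Query a local Ollama server; assumes the llama3 model is already pulled
# and Ollama is listening on its default port (11434).
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "llama3", "prompt": "Why run models locally?", "stream": False},
)
print(resp.json()["response"])
```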