QwenLM / Qwen2-VL

Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
Apache License 2.0

qwen2_vl inference error #205

Open will-wiki opened 4 weeks ago

will-wiki commented 4 weeks ago

Setup: Qwen2-VL-7B model on an A800 with 80 GB of GPU memory. Images are resized with LLaMA-Factory's aspect-ratio-preserving resize to a longest edge of 448, and videos are sampled at up to 32 frames.

I am using the official inference code from https://huggingface.co/Qwen/Qwen2-VL-7B-Instruct.

Question 1: after switching from random sampling to num_beams=2, inference OOMs on some of the data. Is this expected?

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 32.00 GiB. GPU 0 has a total capacty of 79.33 GiB of which 19.10 GiB is free. Including non-PyTorch memory, this process has 60.22 GiB memory in use. Of the allocated memory 51.91 GiB is allocated by PyTorch, and 7.80 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
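One likely contributor to the OOM: beam search keeps a separate KV cache (plus attention buffers) per beam, so num_beams=2 roughly doubles that footprint on top of an already long multimodal sequence (32 video frames expand into many visual tokens). A back-of-envelope sketch of the KV-cache scaling, assuming the published Qwen2-7B backbone shape (28 layers, 4 KV heads via GQA, head dim 128, bf16) — treat these numbers and the 30,000-token sequence length as illustrative assumptions, not measurements:

```python
# Rough KV-cache size estimate. Beam search holds one cache copy per beam,
# so num_beams=2 about doubles this number (activation/logit buffers grow too,
# which this sketch does not model).
# Assumed architecture constants (Qwen2-7B backbone; verify against the config):
NUM_LAYERS = 28
NUM_KV_HEADS = 4   # grouped-query attention
HEAD_DIM = 128
BYTES = 2          # bf16

def kv_cache_gib(seq_len: int, num_beams: int = 1, batch: int = 1) -> float:
    """KV-cache memory in GiB for one forward pass at the given sequence length."""
    per_token = 2 * NUM_LAYERS * NUM_KV_HEADS * HEAD_DIM * BYTES  # K and V
    return batch * num_beams * seq_len * per_token / 2**30

# A long video prompt (assumed ~30k tokens after visual-token expansion):
print(kv_cache_gib(30_000, num_beams=1))
print(kv_cache_gib(30_000, num_beams=2))
```

Besides reducing num_beams, the error message itself suggests setting max_split_size_mb via PYTORCH_CUDA_ALLOC_CONF to reduce fragmentation; lowering the frame count or the processor's max_pixels limit also shrinks the sequence and thus every per-beam buffer.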

Question 2: for the model after SFT, some of the num_beams=2 outputs are quite absurd, worse than plain sampling. I don't know whether this is a training problem or something else. If sampling works fine, shouldn't num_beams give better results?
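For context on question 2: beam search approximately maximizes sequence likelihood, and after SFT the highest-likelihood continuation is not guaranteed to be the best-quality one, so sampling can genuinely read better. A minimal toy sketch of the beam-search mechanics (the vocabulary and probabilities below are made up purely for illustration):

```python
import math

def next_probs(prefix):
    # Made-up next-token distribution: the last token biases the next one.
    if prefix and prefix[-1] == "a":
        return {"a": 0.5, "b": 0.3, "c": 0.2}
    return {"a": 0.4, "b": 0.4, "c": 0.2}

def beam_search(steps, num_beams):
    beams = [((), 0.0)]  # list of (token sequence, cumulative log-prob)
    for _ in range(steps):
        candidates = []
        for seq, lp in beams:
            for tok, p in next_probs(seq).items():
                candidates.append((seq + (tok,), lp + math.log(p)))
        # Keep only the num_beams highest-scoring partial sequences.
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:num_beams]
    return beams[0]

best_seq, best_lp = beam_search(steps=3, num_beams=2)
print(best_seq, math.exp(best_lp))
```

Because beam search deterministically commits to the likelihood arg-max, a model that assigns high probability to degenerate continuations (e.g. repetition) will surface them on every run with num_beams=2, whereas sampling escapes them by chance. This is a known property of beam decoding, not necessarily a bug in your training.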

will-wiki commented 3 weeks ago

Hello, could anyone help with the two questions above?

will-wiki commented 3 weeks ago

@simonJJJ @ShuaiBai623 Could you take a look at these questions? Thanks 🌹

will-wiki commented 2 weeks ago

@simonJJJ @ShuaiBai623 Following the configuration described in https://github.com/hiyouga/LLaMA-Factory/issues/5477, I changed 1D-RoPE to MRoPE and trained another model. After SFT, the num_beams=2 outputs are still worse than sampling, and some are even quite absurd. Why is that?

will-wiki commented 2 weeks ago

I also tried the official Qwen2-VL-7B-Instruct model, and num_beams=2 is still worse than sampling there as well, so it looks like a model issue.