junwenxiong opened this issue 4 weeks ago
Possible causes: the images are too large, the batch size is too large, and so on.
I am using the data that comes with the code and haven't changed anything. Could this be a flash-attention problem? I am on version 2.3.1.
I am also getting an out-of-memory error when trying to finetune Qwen2-VL-2B-Instruct on six A6000s (6 × 48 GB). I am using flash attention 2, with batch_size=1, min_pixels=256x28x28, and max_pixels=512x28x28. However, I am training on eight videos, each 1920 × 1080 pixels and eight seconds long.
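For reference, here is a minimal sketch of how those pixel limits and flash attention 2 are typically wired up with the Hugging Face transformers API. The dtype, gradient checkpointing, and exact loading code are assumptions on my side, not something taken from the actual training script in this repo; also note that for video inputs the number of sampled frames multiplies the visual-token count on top of the per-frame pixel budget.

```python
import torch
from transformers import AutoProcessor, Qwen2VLForConditionalGeneration

# Cap the visual token budget: each frame is resized so its pixel count
# lands between min_pixels and max_pixels (counted in 28x28 patches).
min_pixels = 256 * 28 * 28
max_pixels = 512 * 28 * 28
processor = AutoProcessor.from_pretrained(
    "Qwen/Qwen2-VL-2B-Instruct",
    min_pixels=min_pixels,
    max_pixels=max_pixels,
)

model = Qwen2VLForConditionalGeneration.from_pretrained(
    "Qwen/Qwen2-VL-2B-Instruct",
    torch_dtype=torch.bfloat16,              # half-precision weights (assumption)
    attn_implementation="flash_attention_2",  # flash attention 2, as in the report
)
model.gradient_checkpointing_enable()         # trade compute for activation memory
```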
Replace `from torch.optim import AdamW` with `from torch.optim import SGD`.
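The point of the swap is optimizer state: AdamW keeps two fp32 moment buffers per parameter, while plain SGD (without momentum) keeps none. A minimal sketch, where `model` and the learning rate are just placeholders for whatever the finetuning script actually uses:

```python
from torch.optim import SGD  # was: from torch.optim import AdamW

# AdamW stores two extra fp32 tensors per parameter (exp_avg, exp_avg_sq);
# momentum-free SGD stores none, so optimizer-state memory drops to ~zero.
optimizer = SGD(model.parameters(), lr=1e-5)
# optimizer = AdamW(model.parameters(), lr=1e-5, weight_decay=0.01)  # original
```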
That helped by allowing more batches to run than before, but I still eventually ran out of memory. Is there a way to lower GPU memory utilization? See https://github.com/vllm-project/vllm/issues/2554
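Note that the `gpu_memory_utilization` knob in the linked vLLM issue is a vLLM inference-engine argument and does not apply to Hugging Face finetuning. For training, one common (and here merely suggested) mitigation is tuning the PyTorch CUDA caching allocator to reduce fragmentation, which often surfaces as OOM even when `nvidia-smi` shows free memory:

```python
import os

# Must be set before torch initializes CUDA; expandable_segments requires PyTorch 2.x.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"

import torch  # noqa: E402  (import after setting the allocator config)
```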
Perhaps setting gradient_accumulation_steps=1 will help?
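With the per-device batch size already at 1, gradient accumulation mostly changes the effective batch size rather than peak memory; gradient checkpointing is what actually cuts activation memory. An illustrative sketch, assuming the script uses the Hugging Face `TrainingArguments` / Trainer stack (the thread does not say which entry point it uses):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="qwen2vl-finetune",   # placeholder path
    per_device_train_batch_size=1,
    gradient_accumulation_steps=1,   # accumulation trades steps for effective batch size
    gradient_checkpointing=True,     # recompute activations to lower peak memory
    bf16=True,
    # recent transformers versions also accept optim="sgd", matching the swap above
)
```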
Hello, I am fine-tuning Qwen2-VL-7B-Instruct on 4 A100s, but I still hit OOM. I haven't changed anything in the code repository, so why does this happen?