modelscope / ms-swift

Use PEFT or Full-parameter to finetune 400+ LLMs or 100+ MLLMs. (LLM: Qwen2.5, Llama3.2, GLM4, Internlm2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, Gemma2, ...; MLLM: Qwen2-VL, Qwen2-Audio, Llama3.2-Vision, Llava, InternVL2, MiniCPM-V-2.6, GLM4v, Xcomposer2.5, Yi-VL, DeepSeek-VL, Phi3.5-Vision, ...)
https://swift.readthedocs.io/zh-cn/latest/Instruction/index.html
Apache License 2.0

V100 finetune and infer qwen2-vl-2b-instruct error #2113

Closed · TristaCheng2018 closed this issue 1 month ago

TristaCheng2018 commented 1 month ago

V100 finetune and infer qwen2-vl-2b-instruct error; even with 32 GB of GPU memory it still reports insufficient resources.

[INFO:swift] Using environment variable `MAX_PIXELS`, Setting max_pixels: 102400.
  0%|          | 0/497 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/home/powerop/envs/swift_lst/lib/python3.11/site-packages/swift/cli/infer.py", line 5, in <module>
    infer_main()
  File "/home/powerop/envs/swift_lst/lib/python3.11/site-packages/swift/utils/run_utils.py", line 32, in x_main
    result = llm_x(args, **kwargs)
  File "/home/powerop/envs/swift_lst/lib/python3.11/site-packages/swift/llm/infer.py", line 549, in llm_infer
    response, _ = inference(
  File "/home/powerop/envs/swift_lst/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/home/powerop/envs/swift_lst/lib/python3.11/site-packages/swift/llm/utils/utils.py", line 802, in inference
    generate_ids = model.generate(streamer=streamer, generation_config=generation_config, **inputs)
  File "/home/powerop/envs/swift_lst/lib/python3.11/site-packages/peft/peft_model.py", line 1638, in generate
    outputs = self.base_model.generate(*args, **kwargs)
  File "/home/powerop/envs/swift_lst/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/home/powerop/envs/swift_lst/lib/python3.11/site-packages/transformers/generation/utils.py", line 2015, in generate
    result = self._sample(
  File "/home/powerop/envs/swift_lst/lib/python3.11/site-packages/transformers/generation/utils.py", line 2965, in _sample
    outputs = self(**model_inputs, return_dict=True)
  File "/home/powerop/envs/swift_lst/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/powerop/envs/swift_lst/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/powerop/envs/swift_lst/lib/python3.11/site-packages/transformers/models/qwen2_vl/modeling_qwen2_vl.py", line 1594, in forward
    image_embeds = self.visual(pixel_values, grid_thw=image_grid_thw).to(inputs_embeds.device)
  File "/home/powerop/envs/swift_lst/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/powerop/envs/swift_lst/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/powerop/envs/swift_lst/lib/python3.11/site-packages/transformers/models/qwen2_vl/modeling_qwen2_vl.py", line 1031, in forward
    hidden_states = self.patch_embed(hidden_states)
  File "/home/powerop/envs/swift_lst/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/powerop/envs/swift_lst/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/powerop/envs/swift_lst/lib/python3.11/site-packages/transformers/models/qwen2_vl/modeling_qwen2_vl.py", line 243, in forward
    hidden_states = self.proj(hidden_states.to(dtype=target_dtype)).view(-1, self.embed_dim)
  File "/home/powerop/envs/swift_lst/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/powerop/envs/swift_lst/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/powerop/envs/swift_lst/lib/python3.11/site-packages/torch/nn/modules/conv.py", line 608, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/home/powerop/envs/swift_lst/lib/python3.11/site-packages/torch/nn/modules/conv.py", line 603, in _conv_forward
    return F.conv3d(
RuntimeError: CUDA error: too many resources requested for launch
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
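The failure is raised from `F.conv3d`, reached through Qwen2-VL's `PatchEmbed.forward` (`modeling_qwen2_vl.py`, line 243 above). Below is a minimal sketch that isolates just that op on the GPU; the kernel, stride, and width values are assumptions based on Qwen2-VL's published vision-tower configuration (patch_size 14, temporal_patch_size 2, embed_dim 1280), not values read from this 2B checkpoint:

```python
import torch

# Isolate the op that fails in the traceback: Qwen2-VL's PatchEmbed is a
# Conv3d over (temporal, height, width) image patches. Shapes below are
# assumptions from the published Qwen2-VL ViT config, used for illustration.
embed_dim, patch, t_patch = 1280, 14, 2
proj = torch.nn.Conv3d(3, embed_dim,
                       kernel_size=(t_patch, patch, patch),
                       stride=(t_patch, patch, patch),
                       bias=False).cuda().half()  # fp16 here; V100 has no native bf16

# A hypothetical grid of 512 patches, laid out (N, C, T, H, W) as the model feeds them.
x = torch.randn(512, 3, t_patch, patch, patch, device="cuda", dtype=torch.half)
out = proj(x).view(-1, embed_dim)
print(out.shape)  # expected: torch.Size([512, 1280])
```

If the isolated op runs, re-running the original command with `CUDA_LAUNCH_BLOCKING=1`, as the error text itself suggests, forces synchronous kernel launches so the reported stack trace points at the actual failing call.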

Jintao-Huang commented 1 month ago

https://github.com/modelscope/ms-swift/issues/1867
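For readers hitting the same trace: #1867 is the issue the maintainer links for this `too many resources requested for launch` error. Since V100 is compute capability 7.0 with no native bf16 support, one hedged workaround sketch is to force fp16 and keep the vision input capped; `--dtype` and `--ckpt_dir` are existing ms-swift 2.x CLI arguments, but the values below are placeholders to adapt (the log above shows `MAX_PIXELS=102400` was already in effect, so lowering it further may or may not help on its own):

```shell
# Sketch of a possible mitigation, not a confirmed fix.
# V100 (sm_70) lacks bf16, so run inference in fp16; checkpoint path is a placeholder.
MAX_PIXELS=65536 CUDA_VISIBLE_DEVICES=0 swift infer \
    --ckpt_dir output/qwen2-vl-2b-instruct/vx-xxx/checkpoint-xxx \
    --dtype fp16
```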