$ python3 qwen2_vl_2b.py
[INFO:swift] Successfully registered `/home/ps/Github/swift/swift/llm/data/dataset_info.json`
[INFO:swift] No LMDeploy installed, if you are using LMDeploy, you will get `ImportError: cannot import name 'prepare_lmdeploy_engine_template' from 'swift.llm'`
template_type: qwen2-vl
[INFO:swift] Downloading the model from ModelScope Hub, model_id: qwen/Qwen2-VL-2B-Instruct
[WARNING:modelscope] Using branch: master as version is unstable, use with caution
[INFO:swift] Loading the model using model_dir: /home/ps/.cache/modelscope/hub/qwen/Qwen2-VL-2B-Instruct
Unrecognized keys in `rope_scaling` for 'rope_type'='default': {'mrope_section'}
[INFO:swift] model_kwargs: {'device_map': 'auto'}
The argument `trust_remote_code` is to be used with Auto classes. It has no effect here and is ignored.
`Qwen2VLRotaryEmbedding` can now be fully parameterized by passing the model config through the `config` argument. All other arguments will be removed in v4.46
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:02<00:00, 1.08s/it]
[INFO:swift] model.max_model_len: 32768
[INFO:swift] Global seed set to 42
[INFO:swift] Using environment variable `SIZE_FACTOR`, Setting size_factor: 8.
[INFO:swift] Setting resized_height: None. You can adjust this hyperparameter through the environment variable: `RESIZED_HEIGHT`.
[INFO:swift] Setting resized_width: None. You can adjust this hyperparameter through the environment variable: `RESIZED_WIDTH`.
[INFO:swift] Setting min_pixels: 3136. You can adjust this hyperparameter through the environment variable: `MIN_PIXELS`.
[INFO:swift] Using environment variable `MAX_PIXELS`, Setting max_pixels: 602112.
query: <img>http://modelscope-open.oss-cn-hangzhou.aliyuncs.com/images/road.png</img>距离各城市多远?
response: 这张图片显示了从马踏到阳江的距离是14公里,从阳江到广州的距离是62公里,从广州到马踏的距离是293公里。
query: 距离最远的城市是哪?
response: 距离最远的城市是广州,从马踏到广州的距离是293公里。
history: [['<img>http://modelscope-open.oss-cn-hangzhou.aliyuncs.com/images/road.png</img>距离各城市多远?', '这张图片显示了从马踏到阳江的距离是14公里,从阳江到广州的距离是62公里,从广州到马踏的距离是293公里。'], ['距离最远的城市是哪?', '距离最远的城市是广州,从马踏到广州的距离是293公里。']]
Dual-GPU test results
Set os.environ['CUDA_VISIBLE_DEVICES'] in the test script to 0,1.
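A minimal sketch of how the device selection could be parameterized instead of edited by hand for each run. The helper name `select_gpus` is hypothetical and not part of the original test script. The variable has to be set before `torch` (or any library that initializes CUDA) is imported, otherwise the process may already have enumerated all GPUs.

```python
import os


def select_gpus(device_ids):
    """Restrict this process to the given GPU indices (hypothetical helper)."""
    value = ",".join(str(i) for i in device_ids)
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value


# Call this before importing torch / swift so CUDA only sees these devices.
select_gpus([0, 1])  # dual-GPU case; use [0] for the single-GPU case
```

With this in place, the single-, dual-, three- and four-GPU runs differ only in the list passed to the helper.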
$ python3 qwen2_vl_2b.py
[INFO:swift] Successfully registered `/home/ps/Github/swift/swift/llm/data/dataset_info.json`
[INFO:swift] No LMDeploy installed, if you are using LMDeploy, you will get `ImportError: cannot import name 'prepare_lmdeploy_engine_template' from 'swift.llm'`
template_type: qwen2-vl
[INFO:swift] Downloading the model from ModelScope Hub, model_id: qwen/Qwen2-VL-2B-Instruct
[WARNING:modelscope] Using branch: master as version is unstable, use with caution
[INFO:swift] Loading the model using model_dir: /home/ps/.cache/modelscope/hub/qwen/Qwen2-VL-2B-Instruct
Unrecognized keys in `rope_scaling` for 'rope_type'='default': {'mrope_section'}
[INFO:swift] model_kwargs: {'device_map': 'auto'}
The argument `trust_remote_code` is to be used with Auto classes. It has no effect here and is ignored.
`Qwen2VLRotaryEmbedding` can now be fully parameterized by passing the model config through the `config` argument. All other arguments will be removed in v4.46
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:02<00:00, 1.10s/it]
[INFO:swift] model.max_model_len: 32768
[INFO:swift] Global seed set to 42
[INFO:swift] Using environment variable `SIZE_FACTOR`, Setting size_factor: 8.
[INFO:swift] Setting resized_height: None. You can adjust this hyperparameter through the environment variable: `RESIZED_HEIGHT`.
[INFO:swift] Setting resized_width: None. You can adjust this hyperparameter through the environment variable: `RESIZED_WIDTH`.
[INFO:swift] Setting min_pixels: 3136. You can adjust this hyperparameter through the environment variable: `MIN_PIXELS`.
[INFO:swift] Using environment variable `MAX_PIXELS`, Setting max_pixels: 602112.
Traceback (most recent call last):
File "/home/ps/Github/AiVl/scripts/qwen2_vl_2b.py", line 24, in <module>
response, history = inference(model, template, query)
File "/home/ps/.virtualenvs/aivl/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
File "/home/ps/Github/swift/swift/llm/utils/utils.py", line 864, in inference
generate_ids = model.generate(streamer=streamer, generation_config=generation_config, **inputs)
File "/home/ps/.virtualenvs/aivl/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
File "/home/ps/.virtualenvs/aivl/lib/python3.10/site-packages/transformers/generation/utils.py", line 2053, in generate
result = self._sample(
File "/home/ps/.virtualenvs/aivl/lib/python3.10/site-packages/transformers/generation/utils.py", line 3040, in _sample
next_tokens = torch.multinomial(probs, num_samples=1).squeeze(1)
RuntimeError: probability tensor contains either `inf`, `nan` or element < 0
$ python3 qwen2_vl_2b.py
[INFO:swift] Successfully registered `/home/ps/Github/swift/swift/llm/data/dataset_info.json`
[INFO:swift] No LMDeploy installed, if you are using LMDeploy, you will get `ImportError: cannot import name 'prepare_lmdeploy_engine_template' from 'swift.llm'`
template_type: qwen2-vl
[INFO:swift] Downloading the model from ModelScope Hub, model_id: qwen/Qwen2-VL-2B-Instruct
[WARNING:modelscope] Using branch: master as version is unstable, use with caution
[INFO:swift] Loading the model using model_dir: /home/ps/.cache/modelscope/hub/qwen/Qwen2-VL-2B-Instruct
Unrecognized keys in `rope_scaling` for 'rope_type'='default': {'mrope_section'}
[INFO:swift] model_kwargs: {'device_map': 'auto'}
The argument `trust_remote_code` is to be used with Auto classes. It has no effect here and is ignored.
`Qwen2VLRotaryEmbedding` can now be fully parameterized by passing the model config through the `config` argument. All other arguments will be removed in v4.46
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:02<00:00, 1.07s/it]
[INFO:swift] model.max_model_len: 32768
[INFO:swift] Global seed set to 42
[INFO:swift] Using environment variable `SIZE_FACTOR`, Setting size_factor: 8.
[INFO:swift] Setting resized_height: None. You can adjust this hyperparameter through the environment variable: `RESIZED_HEIGHT`.
[INFO:swift] Setting resized_width: None. You can adjust this hyperparameter through the environment variable: `RESIZED_WIDTH`.
[INFO:swift] Setting min_pixels: 3136. You can adjust this hyperparameter through the environment variable: `MIN_PIXELS`.
[INFO:swift] Using environment variable `MAX_PIXELS`, Setting max_pixels: 602112.
../aten/src/ATen/native/cuda/Indexing.cu:1231: indexSelectSmallIndex: block: [4,0,0], thread: [0,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
......
Traceback (most recent call last):
File "/home/ps/Github/AiVl/scripts/qwen2_vl_2b.py", line 24, in <module>
response, history = inference(model, template, query)
File "/home/ps/.virtualenvs/aivl/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
File "/home/ps/Github/swift/swift/llm/utils/utils.py", line 864, in inference
generate_ids = model.generate(streamer=streamer, generation_config=generation_config, **inputs)
File "/home/ps/.virtualenvs/aivl/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
File "/home/ps/.virtualenvs/aivl/lib/python3.10/site-packages/transformers/generation/utils.py", line 2053, in generate
result = self._sample(
File "/home/ps/.virtualenvs/aivl/lib/python3.10/site-packages/transformers/generation/utils.py", line 3003, in _sample
outputs = self(**model_inputs, return_dict=True)
File "/home/ps/.virtualenvs/aivl/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/ps/.virtualenvs/aivl/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
return forward_call(*args, **kwargs)
File "/home/ps/.virtualenvs/aivl/lib/python3.10/site-packages/accelerate/hooks.py", line 170, in new_forward
output = module._old_forward(*args, **kwargs)
File "/home/ps/.virtualenvs/aivl/lib/python3.10/site-packages/transformers/models/qwen2_vl/modeling_qwen2_vl.py", line 1680, in forward
inputs_embeds = self.model.embed_tokens(input_ids)
File "/home/ps/.virtualenvs/aivl/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/ps/.virtualenvs/aivl/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1603, in _call_impl
result = forward_call(*args, **kwargs)
File "/home/ps/.virtualenvs/aivl/lib/python3.10/site-packages/accelerate/hooks.py", line 170, in new_forward
output = module._old_forward(*args, **kwargs)
File "/home/ps/.virtualenvs/aivl/lib/python3.10/site-packages/torch/nn/modules/sparse.py", line 164, in forward
return F.embedding(
File "/home/ps/.virtualenvs/aivl/lib/python3.10/site-packages/torch/nn/functional.py", line 2267, in embedding
return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: CUDA error: device-side assert triggered
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
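The assertion `srcIndex < srcSelectDimSize` raised from `F.embedding` means that at least one index passed to the embedding lookup was outside the embedding table. Below is a hedged sketch of a CPU-side check that could be run on the token ids before `model.generate` is called. The helper name `find_out_of_range_ids` and the example values are hypothetical; in a real run the ids would come from the encoded inputs and the table size from the model's embedding layer. `CUDA_LAUNCH_BLOCKING=1` is a standard CUDA environment variable that makes kernel launches synchronous, so the Python traceback points at the operation that actually failed.

```python
import os

# Standard CUDA debugging switch: synchronous launches give accurate tracebacks.
# It must be set before torch initializes CUDA.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"


def find_out_of_range_ids(input_ids, num_embeddings):
    """Return (position, token_id) pairs that a table of num_embeddings rows cannot index."""
    return [
        (pos, tok)
        for pos, tok in enumerate(input_ids)
        if tok < 0 or tok >= num_embeddings
    ]


# Hypothetical values for illustration only.
example_ids = [5, 42, 1000, -1]
print(find_out_of_range_ids(example_ids, num_embeddings=1000))
# prints [(2, 1000), (3, -1)]
```

An empty result would suggest the ids are valid on entry and that the bad index is produced later, for example during sampling.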
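I followed the Qwen2-VL best-practice guide ( https://github.com/modelscope/ms-swift/blob/main/docs/source/Multi-Modal/qwen2-vl%E6%9C%80%E4%BD%B3%E5%AE%9E%E8%B7%B5.md ) to set up the environment on a host with four 16 GB V100 GPUs. When testing the single-sample inference script, I found that it runs correctly only on a single GPU. With two, three, or four GPUs it fails.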
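Environment setup
Test script
qwen2_vl_2b.py
Single-GPU test results
Set os.environ['CUDA_VISIBLE_DEVICES'] in the test script to 0.
Dual-GPU test results
Set os.environ['CUDA_VISIBLE_DEVICES'] in the test script to 0,1.
Three- and four-GPU test results
Set os.environ['CUDA_VISIBLE_DEVICES'] in the test script to 0,1,2 and 0,1,2,3, respectively.
Is there a solution to this problem?
Reference links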