QwenLM / Qwen2-VL

Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
Apache License 2.0

Running the demo fails after upgrading accelerate #30

Closed arashStone closed 2 months ago

arashStone commented 3 months ago

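I tried running the demo from Hugging Face and was told that accelerate>=0.26.0 is required. After upgrading, running the demo again fails with the following error: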

Traceback (most recent call last):
  File "/usr/local/app/.local/lib/python3.10/site-packages/transformers/utils/import_utils.py", line 1659, in _get_module
    return importlib.import_module("." + module_name, self.__name__)
  File "/usr/lib64/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/usr/local/app/.local/lib/python3.10/site-packages/transformers/generation/utils.py", line 42, in <module>
    from ..integrations.deepspeed import is_deepspeed_zero3_enabled
  File "/usr/local/app/.local/lib/python3.10/site-packages/transformers/integrations/deepspeed.py", line 52, in <module>
    from accelerate.utils.deepspeed import HfDeepSpeedConfig as DeepSpeedConfig
  File "/usr/local/app/.local/lib/python3.10/site-packages/accelerate/__init__.py", line 3, in <module>
    from .accelerator import Accelerator
  File "/usr/local/app/.local/lib/python3.10/site-packages/accelerate/accelerator.py", line 35, in <module>
    from .checkpointing import load_accelerator_state, load_custom_state, save_accelerator_state, save_custom_state
  File "/usr/local/app/.local/lib/python3.10/site-packages/accelerate/checkpointing.py", line 24, in <module>
    from .utils import (
  File "/usr/local/app/.local/lib/python3.10/site-packages/accelerate/utils/__init__.py", line 153, in <module>
    from .launch import (
  File "/usr/local/app/.local/lib/python3.10/site-packages/accelerate/utils/launch.py", line 33, in <module>
    from ..utils.other import is_port_in_use, merge_dicts
  File "/usr/local/app/.local/lib/python3.10/site-packages/accelerate/utils/other.py", line 36, in <module>
    from .transformer_engine import convert_model
  File "/usr/local/app/.local/lib/python3.10/site-packages/accelerate/utils/transformer_engine.py", line 21, in <module>
    import transformer_engine.pytorch as te
  File "/usr/local/lib64/python3.10/site-packages/transformer_engine/pytorch/__init__.py", line 6, in <module>
    from .module import LayerNormLinear
  File "/usr/local/lib64/python3.10/site-packages/transformer_engine/pytorch/module/__init__.py", line 6, in <module>
    from .layernorm_linear import LayerNormLinear
  File "/usr/local/lib64/python3.10/site-packages/transformer_engine/pytorch/module/layernorm_linear.py", line 15, in <module>
    from .. import cpp_extensions as tex
  File "/usr/local/lib64/python3.10/site-packages/transformer_engine/pytorch/cpp_extensions/__init__.py", line 6, in <module>
    from transformer_engine_extensions import *
ImportError: /usr/local/lib64/python3.10/site-packages/transformer_engine_extensions.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN2at4_ops5zeros4callEN3c108ArrayRefINS2_6SymIntEEENS2_8optionalINS2_10ScalarTypeEEENS6_INS2_6LayoutEEENS6_INS2_6DeviceEEENS6_IbEE

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/app/.local/lib/python3.10/site-packages/transformers/utils/import_utils.py", line 1659, in _get_module
    return importlib.import_module("." + module_name, self.__name__)
  File "/usr/lib64/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/usr/local/app/.local/lib/python3.10/site-packages/transformers/models/qwen2_vl/modeling_qwen2_vl.py", line 41, in <module>
    from ...modeling_utils import PreTrainedModel
  File "/usr/local/app/.local/lib/python3.10/site-packages/transformers/modeling_utils.py", line 46, in <module>
    from .generation import GenerationConfig, GenerationMixin
  File "<frozen importlib._bootstrap>", line 1075, in _handle_fromlist
  File "/usr/local/app/.local/lib/python3.10/site-packages/transformers/utils/import_utils.py", line 1649, in __getattr__
    module = self._get_module(self._class_to_module[name])
  File "/usr/local/app/.local/lib/python3.10/site-packages/transformers/utils/import_utils.py", line 1661, in _get_module
    raise RuntimeError(
RuntimeError: Failed to import transformers.generation.utils because of the following error (look up to see its traceback):
/usr/local/lib64/python3.10/site-packages/transformer_engine_extensions.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN2at4_ops5zeros4callEN3c108ArrayRefINS2_6SymIntEEENS2_8optionalINS2_10ScalarTypeEEENS6_INS2_6LayoutEEENS6_INS2_6DeviceEEENS6_IbEE

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/app/aigc/model_repo/test.py", line 2, in <module>
    from transformers import AutoProcessor, Qwen2VLForConditionalGeneration
  File "<frozen importlib._bootstrap>", line 1075, in _handle_fromlist
  File "/usr/local/app/.local/lib/python3.10/site-packages/transformers/utils/import_utils.py", line 1650, in __getattr__
    value = getattr(module, name)
  File "/usr/local/app/.local/lib/python3.10/site-packages/transformers/utils/import_utils.py", line 1649, in __getattr__
    module = self._get_module(self._class_to_module[name])
  File "/usr/local/app/.local/lib/python3.10/site-packages/transformers/utils/import_utils.py", line 1661, in _get_module
    raise RuntimeError(
RuntimeError: Failed to import transformers.models.qwen2_vl.modeling_qwen2_vl because of the following error (look up to see its traceback):
Failed to import transformers.generation.utils because of the following error (look up to see its traceback):
/usr/local/lib64/python3.10/site-packages/transformer_engine_extensions.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN2at4_ops5zeros4callEN3c108ArrayRefINS2_6SymIntEEENS2_8optionalINS2_10ScalarTypeEEENS6_INS2_6LayoutEEENS6_INS2_6DeviceEEENS6_IbEE
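
The innermost ImportError is raised while accelerate imports transformer_engine, and the undefined symbol looks like a mangled PyTorch name (at::_ops::zeros::call). That may point to a transformer_engine build that does not match the installed PyTorch, though the thread does not confirm this. As a rough diagnostic sketch, the script below lists the installed versions of the packages named in the traceback and tries the failing import in isolation. The distribution names in PACKAGES and the helper names installed_versions and try_import are assumptions made for illustration, not part of any of these libraries.

```python
from importlib import import_module
from importlib.metadata import PackageNotFoundError, version

# Distribution names are assumptions; adjust them to match `pip list` output.
PACKAGES = ["torch", "transformers", "accelerate", "transformer_engine", "flash_attn"]


def installed_versions(names=PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    report = {}
    for name in names:
        try:
            report[name] = version(name)
        except PackageNotFoundError:
            report[name] = None
    return report


def try_import(module_name):
    """Return None if the module imports cleanly, else a short error string."""
    try:
        import_module(module_name)
        return None
    except Exception as exc:  # covers missing modules and undefined-symbol ImportErrors
        return f"{type(exc).__name__}: {exc}"


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
    # The module that fails at the bottom of the first traceback.
    status = try_import("transformer_engine.pytorch")
    print("transformer_engine.pytorch import:", status or "ok")
```

If transformer_engine.pytorch fails here with the same undefined symbol, the problem can be reproduced without transformers or the demo, which narrows the search to the transformer_engine and torch installations.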

Mrzhang1999 commented 3 months ago

Run pip uninstall flash-attn, or update flash-attn.

arashStone commented 3 months ago

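Run pip uninstall flash-attn, or update flash-attn.

When the error above occurred, I had already updated flash-attn to 2.6.1 as specified in the requirements. I tried upgrading flash-attn to the latest version, 2.6.3, and I also tried pip uninstall flash-attn. Neither helped; the same error still occurs.

After downgrading accelerate to 0.25.0, the error no longer occurs, but parameters such as device_map can no longer be used when loading the model.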