modelscope / ms-swift

Use PEFT or Full-parameter to finetune 400+ LLMs or 100+ MLLMs. (LLM: Qwen2.5, Llama3.2, GLM4, Internlm2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, Gemma2, ...; MLLM: Qwen2-VL, Qwen2-Audio, Llama3.2-Vision, Llava, InternVL2, MiniCPM-V-2.6, GLM4v, Xcomposer2.5, Yi-VL, DeepSeek-VL, Phi3.5-Vision, ...)
https://swift.readthedocs.io/zh-cn/latest/Instruction/index.html
Apache License 2.0

ValueError: model_type: 'internvl2-8b-awq' is not registered. #1847

Closed: radna0 closed this issue 2 months ago

radna0 commented 2 months ago

Describe the bug: what the bug is and how to reproduce it, preferably with screenshots.

swift infer --model_type internvl2-8b-awq --infer_backend lmdeploy
WARNING:root:libtpu.so and TPU device found. Setting PJRT_DEVICE=TPU.
run sh: `/usr/bin/python3.10 /home/kojoe/swift/swift/cli/infer.py --model_type internvl2-8b-awq --infer_backend lmdeploy`
/home/kojoe/.local/lib/python3.10/site-packages/torchvision/io/image.py:14: UserWarning: Failed to load image Python extension: 'libc10_cuda.so: cannot open shared object file: No such file or directory'If you don't plan on using image functionality from `torchvision.io`, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have `libjpeg` or `libpng` installed before building `torchvision` from source?
  warn(
[INFO:swift] Successfully registered `/home/kojoe/swift/swift/llm/data/dataset_info.json`
[INFO:swift] No vLLM installed, if you are using vLLM, you will get `ImportError: cannot import name 'get_vllm_engine' from 'swift.llm'`
[INFO:swift] No LMDeploy installed, if you are using LMDeploy, you will get `ImportError: cannot import name 'prepare_lmdeploy_engine_template' from 'swift.llm'`
[INFO:swift] Start time of running main: 2024-08-29 08:35:57.968721
[INFO:swift] ckpt_dir: None
[INFO:swift] Due to `ckpt_dir` being `None`, `load_args_from_ckpt_dir` is set to `False`.
Traceback (most recent call last):
  File "/home/kojoe/swift/swift/cli/infer.py", line 5, in <module>
    infer_main()
  File "/home/kojoe/swift/swift/utils/run_utils.py", line 22, in x_main
    args, remaining_argv = parse_args(args_class, argv)
  File "/home/kojoe/swift/swift/utils/utils.py", line 131, in parse_args
    args, remaining_args = parser.parse_args_into_dataclasses(argv, return_remaining_strings=True)
  File "/usr/local/lib/python3.10/dist-packages/transformers/hf_argparser.py", line 339, in parse_args_into_dataclasses
    obj = dtype(**inputs)
  File "<string>", line 83, in __init__
  File "/home/kojoe/swift/swift/llm/utils/argument.py", line 1412, in __post_init__
    self.set_model_type()
  File "/home/kojoe/swift/swift/llm/utils/argument.py", line 485, in set_model_type
    raise ValueError(f"model_type: '{self.model_type}' is not registered. " + error_msg)
ValueError: model_type: 'internvl2-8b-awq' is not registered. The model_type you can choose: ['c4ai-command-r-plus', 'c4ai-command-r-v01', 'baichuan-7b', 'baichuan-13b-chat', 'xverse-moe-a4_2b', 'xverse-7b', 'xverse-7b-chat', 'xverse-13b-256k', 'xverse-65b-chat', 'xverse-65b-v2', 'xverse-65b', 'xverse-13b', 'xverse-13b-chat', 'seqgpt-560m', 'bluelm-7b', 'bluelm-7b-32k', 'bluelm-7b-chat', 'bluelm-7b-chat-32k', 'internlm-7b', 'internlm-20b', 'cogvlm2-19b-chat', 'cogvlm2-en-19b-chat', 'cogvlm2-video-13b-chat', 'llava-llama-3-8b-v1_1', 'grok-1', 'mamba-2.8b', 'mamba-1.4b', 'mamba-790m', 'mamba-390m', 'mamba-370m', 'mamba-130m', 'cogagent-18b-instruct', 'cogagent-18b-chat', 'cogvlm-17b-chat', 'internlm-7b-chat', 'internlm-7b-chat-8k', 'internlm-20b-chat', 'baichuan-13b', 'paligemma-3b-mix-448', 'paligemma-3b-mix-224', 'paligemma-3b-pt-896', 'paligemma-3b-pt-448', 'paligemma-3b-pt-224', 'phi3_5-vision-instruct', 'phi3-vision-128k-instruct', 'baichuan2-13b', 'baichuan2-13b-chat', 'baichuan2-7b', 'baichuan2-7b-chat', 'baichuan2-7b-chat-int4', 'baichuan2-13b-chat-int4', 'codegeex2-6b', 'chatglm2-6b', 'chatglm2-6b-32k', 'chatglm3-6b-base', 'chatglm3-6b', 'chatglm3-6b-128k', 'chatglm3-6b-32k', 'codefuse-codegeex2-6b-chat', 'glm4-9b-chat-1m', 'glm4-9b-chat', 'glm4-9b', 'codegeex4-9b-chat', 'longwriter-glm4-9b', 'glm4v-9b-chat', 'dbrx-instruct', 'dbrx-base', 'mistral-nemo-base-2407', 'mistral-nemo-instruct-2407', 'mistral-large-instruct-2407', 'mixtral-moe-8x22b-v1', 'mixtral-moe-7b-instruct', 'mixtral-moe-7b', 'mistral-7b-v2', 'codestral-22b', 'mistral-7b', 'mistral-7b-instruct-v3', 'mistral-7b-instruct-v2', 'mistral-7b-instruct', 'openbuddy-llama2-13b-chat', 'openbuddy-llama3-8b-chat', 'openbuddy-llama3-70b-chat', 'openbuddy-llama-65b-chat', 'openbuddy-llama2-70b-chat', 'openbuddy-mistral-7b-chat', 'openbuddy-mixtral-moe-7b-chat', 'ziya2-13b', 'ziya2-13b-chat', 'yi-6b', 'yi-9b-200k', 'yi-9b', 'yi-6b-200k', 'yi-34b', 'yi-34b-200k', 'yi-34b-chat-int8', 'yi-34b-chat-awq', 'yi-34b-chat', 'yi-6b-chat-int8', 'yi-6b-chat-awq', 'yi-6b-chat', 'zephyr-7b-beta-chat', 'openbuddy-zephyr-7b-chat', 'sus-34b-chat', 'deepseek-7b', 'deepseek-7b-chat', 'deepseek-67b', 'deepseek-67b-chat', 'openbuddy-deepseek-67b-chat', 'deepseek-coder-33b-instruct', 'deepseek-coder-6_7b-instruct', 'deepseek-coder-1_3b-instruct', 'deepseek-coder-33b', 'deepseek-coder-6_7b', 'deepseek-coder-1_3b', 'qwen1half-moe-a2_7b', 'codeqwen1half-7b', 'qwen1half-110b', 'qwen1half-72b', 'qwen1half-32b', 'qwen1half-14b', 'qwen1half-7b', 'qwen1half-4b', 'qwen1half-1_8b', 'qwen1half-0_5b', 'deepseek-math-7b', 'deepseek-math-7b-chat', 'numina-math-7b', 'deepseek-math-7b-instruct', 'gemma-7b-instruct', 'gemma-2b-instruct', 'gemma-7b', 'gemma-2b', 'wizardlm2-7b-awq', 'wizardlm2-8x22b', 'phi3_5-mini-instruct', 'phi3_5-moe-instruct', 'phi3-4b-4k-instruct', 'phi3-medium-128k-instruct', 'phi3-medium-4k-instruct', 'phi3-4b-128k-instruct', 'minicpm-2b-128k', 'minicpm-1b-sft-chat', 'minicpm-2b-chat', 'minicpm-2b-sft-chat', 'qwen2-72b', 'qwen2-7b', 'qwen2-1_5b', 'qwen2-0_5b', 'qwen2-57b-a14b', 'gemma2-27b-instruct', 'gemma2-9b-instruct', 'gemma2-2b-instruct', 'gemma2-27b', 'gemma2-9b', 'gemma2-2b', 'yi-1_5-34b-chat-16k', 'yi-1_5-34b-chat', 'yi-1_5-34b', 'yi-1_5-9b-chat-16k', 'yi-1_5-9b-chat', 'yi-1_5-9b', 'yi-1_5-34b-chat-gptq-int4', 'yi-1_5-34b-chat-awq-int4', 'yi-1_5-9b-chat-gptq-int4', 'yi-1_5-9b-chat-awq-int4', 'yi-1_5-6b-chat-gptq-int4', 'yi-1_5-6b-chat-awq-int4', 'yi-1_5-6b-chat', 'yi-1_5-6b', 'florence-2-large-ft', 'florence-2-large', 'florence-2-base-ft', 
'florence-2-base', 'phi3-small-128k-instruct', 'phi3-small-8k-instruct', 'codeqwen1half-7b-chat', 'qwen1half-moe-a2_7b-chat', 'qwen1half-110b-chat', 'qwen1half-72b-chat', 'qwen1half-32b-chat', 'qwen1half-14b-chat', 'qwen1half-7b-chat', 'qwen1half-4b-chat', 'qwen1half-1_8b-chat', 'qwen1half-0_5b-chat', 'codeqwen1half-7b-chat-awq', 'qwen1half-110b-chat-awq', 'qwen1half-72b-chat-awq', 'qwen1half-32b-chat-awq', 'qwen1half-14b-chat-awq', 'qwen1half-7b-chat-awq', 'qwen1half-4b-chat-awq', 'qwen1half-1_8b-chat-awq', 'qwen1half-0_5b-chat-awq', 'qwen2-72b-instruct', 'qwen2-7b-instruct', 'qwen2-1_5b-instruct', 'qwen2-0_5b-instruct', 'qwen2-72b-instruct-awq', 'qwen2-7b-instruct-awq', 'qwen2-1_5b-instruct-awq', 'qwen2-0_5b-instruct-awq', 'qwen2-72b-instruct-int8', 'qwen2-72b-instruct-int4', 'qwen2-7b-instruct-int8', 'qwen2-7b-instruct-int4', 'qwen2-1_5b-instruct-int8', 'qwen2-1_5b-instruct-int4', 'qwen2-0_5b-instruct-int8', 'qwen2-0_5b-instruct-int4', 'qwen2-57b-a14b-instruct', 'qwen2-57b-a14b-instruct-int4', 'qwen2-math-72b', 'qwen2-math-72b-instruct', 'qwen2-math-7b', 'qwen2-math-7b-instruct', 'qwen2-math-1_5b', 'qwen2-math-1_5b-instruct', 'qwen2-audio-7b', 'qwen2-audio-7b-instruct', 'qwen1half-moe-a2_7b-chat-int4', 'qwen1half-72b-chat-int8', 'qwen1half-110b-chat-int4', 'qwen1half-72b-chat-int4', 'qwen1half-32b-chat-int4', 'qwen1half-14b-chat-int8', 'qwen1half-14b-chat-int4', 'qwen1half-7b-chat-int8', 'qwen1half-7b-chat-int4', 'qwen1half-4b-chat-int8', 'qwen1half-4b-chat-int4', 'qwen1half-1_8b-chat-int8', 'qwen1half-1_8b-chat-int4', 'qwen1half-0_5b-chat-int8', 'qwen1half-0_5b-chat-int4', 'internlm2-20b-base', 'internlm2-20b', 'internlm2-7b-base', 'internlm2-7b', 'internlm2-20b-chat', 'internlm2-20b-sft-chat', 'internlm2-7b-chat', 'internlm2-7b-sft-chat', 'internlm2-math-20b-chat', 'internlm2-math-7b-chat', 'internlm2-math-20b', 'internlm2-math-7b', 'internlm2-1_8b-chat', 'internlm2-1_8b-sft-chat', 'internlm2-1_8b', 'internlm2_5-20b-chat', 'internlm2_5-20b', 'internlm2_5-7b-chat-1m', 'internlm2_5-7b-chat', 'internlm2_5-7b', 'internlm2_5-1_8b-chat', 'internlm2_5-1_8b', 'deepseek-v2-chat', 'deepseek-v2', 'deepseek-v2-lite-chat', 'deepseek-v2-lite', 'deepseek-coder-v2-lite-instruct', 'deepseek-coder-v2-instruct', 'deepseek-coder-v2-lite', 'deepseek-coder-v2', 'internvl2-llama3-76b', 'internvl2-40b', 'internvl2-26b', 'internvl2-8b', 'internvl2-4b', 'internvl2-2b', 'internvl2-1b', 'mini-internvl-chat-4b-v1_5', 'mini-internvl-chat-2b-v1_5', 'internvl-chat-v1_5-int8', 'internvl-chat-v1_5', 'internlm-xcomposer2-4khd-7b-chat', 'internlm-xcomposer2-7b-chat', 'internlm-xcomposer2_5-7b-chat', 'deepseek-vl-1_3b-chat', 'deepseek-vl-7b-chat', 'longwriter-llama3_1-8b', 'mengzi3-13b-base', 'atom-7b-chat', 'atom-7b', 'chinese-alpaca-2-13b-16k', 'chinese-alpaca-2-13b', 'chinese-alpaca-2-7b-64k', 'chinese-alpaca-2-7b-16k', 'chinese-alpaca-2-7b', 'chinese-alpaca-2-1_3b', 'chinese-llama-2-13b-16k', 'chinese-llama-2-13b', 'chinese-llama-2-7b-64k', 'chinese-llama-2-7b-16k', 'chinese-llama-2-7b', 'chinese-llama-2-1_3b', 'llama2-70b-chat', 'llama2-13b-chat', 'llama2-7b-chat', 'llama2-70b', 'llama2-13b', 'llama2-7b', 'mixtral-moe-7b-aqlm-2bit-1x16', 'llama2-7b-aqlm-2bit-1x16', 'llama-3-chinese-8b-instruct', 'llama-3-chinese-8b', 'llama3-8b', 'llama3-8b-instruct', 'llama3-70b', 'llama3-70b-instruct', 'llama3-8b-instruct-int4', 'llama3-8b-instruct-int8', 'llama3-8b-instruct-awq', 'llama3-70b-instruct-int4', 'llama3-70b-instruct-int8', 'llama3-70b-instruct-awq', 'llama3_1-8b', 'llama3_1-8b-instruct', 'llama3_1-8b-instruct-awq', 
'llama3_1-8b-instruct-gptq-int4', 'llama3_1-8b-instruct-bnb', 'llama3_1-70b', 'llama3_1-70b-instruct', 'llama3_1-70b-instruct-fp8', 'llama3_1-70b-instruct-awq', 'llama3_1-70b-instruct-gptq-int4', 'llama3_1-70b-instruct-bnb', 'llama3_1-405b', 'llama3_1-405b-instruct', 'llama3_1-405b-instruct-fp8', 'llama3_1-405b-instruct-awq', 'llama3_1-405b-instruct-gptq-int4', 'llama3_1-405b-instruct-bnb', 'openbuddy-llama3_1-8b-chat', 'polylm-13b', 'qwen-7b', 'qwen-14b', 'tongyi-finance-14b', 'qwen-72b', 'qwen-1_8b', 'codefuse-qwen-14b-chat', 'modelscope-agent-14b', 'modelscope-agent-7b', 'qwen-7b-chat', 'qwen-14b-chat', 'tongyi-finance-14b-chat', 'qwen-72b-chat', 'qwen-1_8b-chat', 'qwen-vl', 'qwen-vl-chat', 'qwen-audio', 'qwen-audio-chat', 'qwen-7b-chat-int4', 'qwen-14b-chat-int4', 'qwen-7b-chat-int8', 'qwen-14b-chat-int8', 'qwen-vl-chat-int4', 'tongyi-finance-14b-chat-int4', 'qwen-72b-chat-int4', 'qwen-72b-chat-int8', 'qwen-1_8b-chat-int4', 'qwen-1_8b-chat-int8', 'skywork-13b', 'skywork-13b-chat', 'codefuse-codellama-34b-chat', 'telechat-12b-v2-gptq-int4', 'telechat-12b-v2', 'telechat-12b', 'phi2-3b', 'telechat-7b', 'minicpm-moe-8x2b', 'deepseek-moe-16b', 'deepseek-moe-16b-chat', 'yuan2-m32', 'yuan2-2b-janus-instruct', 'yuan2-102b-instruct', 'yuan2-51b-instruct', 'yuan2-2b-instruct', 'orion-14b-chat', 'orion-14b', 'yi-vl-6b-chat', 'yi-vl-34b-chat', 'minicpm-v-v2-chat', 'minicpm-v-3b-chat', 'minicpm-v-v2_5-chat', 'minicpm-v-v2_6-chat', 'llava1_5-7b-instruct', 'llava1_5-13b-instruct', 'llava-onevision-qwen2-72b-ov', 'llava-onevision-qwen2-7b-ov', 'llava-onevision-qwen2-0_5b-ov', 'llava1_6-mistral-7b-instruct', 'llava1_6-vicuna-13b-instruct', 'llava1_6-vicuna-7b-instruct', 'llama3-llava-next-8b-hf', 'llava-next-110b-hf', 'llava-next-72b-hf', 'llava1_6-yi-34b-instruct', 'llava-next-video-7b-instruct', 'llava-next-video-7b-32k-instruct', 'llava-next-video-7b-dpo-instruct', 'llava-next-video-34b-instruct', 'llava-next-110b', 'llava-next-72b', 'llama3-llava-next-8b', 'idefics3-8b-llama3', 'mplug-owl2_1-chat', 'mplug-owl2-chat']
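A quick check against the installed build confirms the model type is missing from its registry (a minimal sketch, assuming swift.llm exposes the MODEL_MAPPING dict, as the 2.x releases do):

# Print whether 'internvl2-8b-awq' is present in swift's model registry
python -c "from swift.llm import MODEL_MAPPING; print('internvl2-8b-awq' in MODEL_MAPPING)"

If the installed release predates the internvl2 AWQ entries, this prints False, which matches the error above.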

Your hardware and system info: write your system info here, such as CUDA version, OS, GPU model, and torch version.

Additional context: add any other context about the problem here.

radna0 commented 2 months ago

@tastelikefeet Can you take a look at this?

tastelikefeet commented 2 months ago

Please use the main branch.

tastelikefeet commented 2 months ago

pip install git+https://github.com/modelscope/ms-swift.git should work.
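If installing from the git URL is inconvenient, an editable install from a clone of the main branch should behave the same (a sketch; the [llm] extra mirrors the documented source install and may not be strictly required):

# Install swift's main branch from a local clone (editable install)
git clone https://github.com/modelscope/ms-swift.git
cd ms-swift
pip install -e '.[llm]'

# Re-run the original command to confirm the model type is now registered
swift infer --model_type internvl2-8b-awq --infer_backend lmdeploy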

tastelikefeet commented 2 months ago

Is it OK now? I will close this issue; if there is any problem, please feel free to reply to this issue and I will reopen it.

radna0 commented 2 months ago

Yes @tastelikefeet, the model seems to register now. The only remaining problem is running on TPUs/XLA devices, for which I have already opened another issue.