hiyouga / LLaMA-Factory

Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
https://arxiv.org/abs/2403.13372
Apache License 2.0

LLaMA-Factory: error when deploying and running Llama-3.2-11B-Vision-Instruct #5549

Closed caijx168 closed 1 month ago

caijx168 commented 1 month ago

Reminder

System Info

Reproduction

The command used:

```bash
CUDA_VISIBLE_DEVICES=0 API_PORT=8005 nohup python src/api.py \
    --model_name_or_path /home/Llama-3.2/Llama-3.2-11B-Vision-Instruct \
    --template llama3 \
    --infer_backend vllm \
    --vllm_maxlen 8000 \
    --vllm_gpu_util 0.8 \
    --vllm_enforce_eager true &
```

The error:

```
[INFO|configuration_utils.py:731] 2024-09-26 13:55:31,084 >> loading configuration file /home/Llama-3.2/Llama-3.2-11B-Vision-Instruct/config.json
Traceback (most recent call last):
  File "/root/anaconda3/envs/LLaMA-Factory-main/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 982, in from_pretrained
    config_class = CONFIG_MAPPING[config_dict["model_type"]]
  File "/root/anaconda3/envs/LLaMA-Factory-main/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 684, in __getitem__
    raise KeyError(key)
KeyError: 'mllama'
```

During handling of the above exception, another exception occurred:

```
Traceback (most recent call last):
  File "/home/LLaMA-Factory-main/src/api.py", line 33, in <module>
    main()
  File "/home/LLaMA-Factory-main/src/api.py", line 24, in main
    chat_model = ChatModel()
  File "/home/LLaMA-Factory-main/src/llamafactory/chat/chat_model.py", line 45, in __init__
    self.engine: "BaseEngine" = VllmEngine(model_args, data_args, finetuning_args, generating_args)
  File "/home/LLaMA-Factory-main/src/llamafactory/chat/vllm_engine.py", line 55, in __init__
    config = load_config(model_args)  # may download model from ms hub
  File "/home/LLaMA-Factory-main/src/llamafactory/model/loader.py", line 117, in load_config
    return AutoConfig.from_pretrained(model_args.model_name_or_path, **init_kwargs)
  File "/root/anaconda3/envs/LLaMA-Factory-main/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 984, in from_pretrained
    raise ValueError(
ValueError: The checkpoint you are trying to load has model type mllama but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date.
```
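
A quick check of whether the installed Transformers registers the `mllama` model type, using the same CONFIG_MAPPING the traceback goes through (a minimal sketch; mllama support requires transformers >= 4.45.0):

```bash
# Prints the installed transformers version and whether it knows the mllama architecture.
python -c "import transformers; from transformers.models.auto.configuration_auto import CONFIG_MAPPING; print(transformers.__version__, 'mllama' in CONFIG_MAPPING.keys())"
```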

Expected behavior

Llama-3.2-11B-Vision-Instruct runs correctly.

Others

No response

gudo7208 commented 1 month ago

"Transformers version: 4.42.3" The Transformers version is too low. You should install the latest development version(4.46.0dev maybe) from GitHub to make it work. However, I directly ran the official example code from Meta on Hugging Face. Llama Factory might need to wait for an update.

caijx168 commented 1 month ago

At the moment, only the following Llama-3.2 models appear to be supported:

```python
register_model_group(
    models={
        "LLaMA3.2-1B": {
            DownloadSource.DEFAULT: "meta-llama/Llama-3.2-1B",
            DownloadSource.MODELSCOPE: "LLM-Research/Llama-3.2-1B",
        },
        "LLaMA3.2-3B": {
            DownloadSource.DEFAULT: "meta-llama/Llama-3.2-3B",
            DownloadSource.MODELSCOPE: "LLM-Research/Llama-3.2-3B",
        },
        "LLaMA3.2-1B-Instruct": {
            DownloadSource.DEFAULT: "meta-llama/Llama-3.2-1B-Instruct",
            DownloadSource.MODELSCOPE: "LLM-Research/Llama-3.2-1B-Instruct",
        },
        "LLaMA3.2-3B-Instruct": {
            DownloadSource.DEFAULT: "meta-llama/Llama-3.2-3B-Instruct",
            DownloadSource.MODELSCOPE: "LLM-Research/Llama-3.2-3B-Instruct",
        },
    },
    template="llama3",
)
```

It looks like the 11B model is not supported yet. I hope LLaMA-Factory adds support soon.

caijx168 commented 1 month ago

"Transformers version: 4.42.3" The Transformers version is too low. You should install the latest development version(4.46.0dev maybe) from GitHub to make it work. However, I directly ran the official example code from Meta on Hugging Face. Llama Factory might need to wait for an update.

目前看到支持 以下LLAMA-3.2的模型,register_model_group( models={ "LLaMA3.2-1B": { DownloadSource.DEFAULT: "meta-llama/Llama-3.2-1B", DownloadSource.MODELSCOPE: "LLM-Research/Llama-3.2-1B", }, "LLaMA3.2-3B": { DownloadSource.DEFAULT: "meta-llama/Llama-3.2-3B", DownloadSource.MODELSCOPE: "LLM-Research/Llama-3.2-3B", }, "LLaMA3.2-1B-Instruct": { DownloadSource.DEFAULT: "meta-llama/Llama-3.2-1B-Instruct", DownloadSource.MODELSCOPE: "LLM-Research/Llama-3.2-1B-Instruct", }, "LLaMA3.2-3B-Instruct": { DownloadSource.DEFAULT: "meta-llama/Llama-3.2-3B-Instruct", DownloadSource.MODELSCOPE: "LLM-Research/Llama-3.2-3B-Instruct", }, }, template="llama3", )看来是还没有支持11B模型,希望尽快更新LLaMA-Factory

gudo7208 commented 1 month ago

No, that only adds download sources for those models. To actually run Llama 3.2 you still need to update to the latest Transformers, as the official instructions require.

caijx168 commented 1 month ago

> No, that only adds download sources for those models. To actually run Llama 3.2 you still need to update to the latest Transformers, as the official instructions require.

OK, I'll give it a try. Thanks!

marko1616 commented 1 month ago

Right, it isn't supported yet. I have a PR you can try; please make sure every library is on its latest version. I have tested both training and inference on an A100, but bitsandbytes 8-bit quantization still has problems, while 4-bit works. It looks like a bitsandbytes issue and should get fixed later.
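
A sketch of the "update all libraries" step, assuming a pip environment; the package list is inferred from the environment report later in this thread:

```bash
# Upgrade the libraries LLaMA-Factory depends on for this model.
pip install -U transformers accelerate peft trl datasets bitsandbytes
```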

caijx168 commented 1 month ago

> Right, it isn't supported yet. I have a PR you can try; please make sure every library is on its latest version. I have tested both training and inference on an A100, but bitsandbytes 8-bit quantization still has problems, while 4-bit works. It looks like a bitsandbytes issue and should get fixed later.

So when will LLaMA-Factory be updated to support Llama-3.2-11B-Vision-Instruct? What is the expected timeline?

marko1616 commented 1 month ago

> So when will LLaMA-Factory be updated to support Llama-3.2-11B-Vision-Instruct? What is the expected timeline?

The Instruct model already works on my branch; the only part I haven't handled is which tokens should attend to the image, since that involves batch processing and I don't have much time.

caijx168 commented 1 month ago

> The Instruct model already works on my branch; the only part I haven't handled is which tokens should attend to the image, since that involves batch processing and I don't have much time.

Hi, do I just download the code from this branch (https://github.com/marko1616/LLaMA-Factory/tree/feat/llama3.2vl) and run it?

marko1616 commented 1 month ago

That's right.

caijx168 commented 1 month ago

Running that branch gives the following error:

```
Traceback (most recent call last):
  File "/home/LLaMA-Factory/LLaMA-Factory-feat-llama3.2vl/src/api.py", line 33, in <module>
    main()
  File "/home/LLaMA-Factory/LLaMA-Factory-feat-llama3.2vl/src/api.py", line 24, in main
    chat_model = ChatModel()
                 ^^^^^^^^^^^
  File "/home/LLaMA-Factory/LLaMA-Factory-feat-llama3.2vl/src/llamafactory/chat/chat_model.py", line 54, in __init__
    self.engine: "BaseEngine" = VllmEngine(model_args, data_args, finetuning_args, generating_args)
                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/LLaMA-Factory/LLaMA-Factory-feat-llama3.2vl/src/llamafactory/chat/vllm_engine.py", line 93, in __init__
    self.model = AsyncLLMEngine.from_engine_args(AsyncEngineArgs(**engine_args))
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/anaconda3/envs/LLaMA-Factory-feat-llama3.2vl/lib/python3.11/site-packages/vllm/engine/async_llm_engine.py", line 362, in from_engine_args
    engine_config = engine_args.create_engine_config()
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/anaconda3/envs/LLaMA-Factory-feat-llama3.2vl/lib/python3.11/site-packages/vllm/engine/arg_utils.py", line 559, in create_engine_config
    model_config = ModelConfig(
                   ^^^^^^^^^^^^
  File "/root/anaconda3/envs/LLaMA-Factory-feat-llama3.2vl/lib/python3.11/site-packages/vllm/config.py", line 133, in __init__
    self.max_model_len = _get_and_verify_max_len(
                         ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/anaconda3/envs/LLaMA-Factory-feat-llama3.2vl/lib/python3.11/site-packages/vllm/config.py", line 1216, in _get_and_verify_max_len
    if rope_scaling is not None and rope_scaling["type"] != "su":
KeyError: 'type'
```

What is causing this problem? System environment (`(LLaMA-Factory-feat-llama3.2vl) root@root1-System-Product-Name:/home/LLaMA-Factory/LLaMA-Factory-feat-llama3.2vl# llamafactory-cli env`):

- `llamafactory` version: 0.9.1.dev0
- Platform: Linux-6.8.0-40-generic-x86_64-with-glibc2.35
- Python version: 3.11.10
- PyTorch version: 2.3.0+cu121 (GPU)
- Transformers version: 4.45.0
- Datasets version: 2.21.0
- Accelerate version: 0.34.2
- PEFT version: 0.12.0
- TRL version: 0.9.6
- GPU type: NVIDIA GeForce RTX 4090 D
- vLLM version: 0.4.3
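
The `KeyError: 'type'` most likely comes from the old vLLM build rather than from the branch itself: vLLM 0.4.3 indexes `rope_scaling["type"]`, while the Llama 3.2 config ships a `rope_scaling` dict keyed by `rope_type` (the "llama3" scaling introduced with Llama 3.1). A minimal diagnostic sketch; the config path is taken from the command at the top of the issue, and the `text_config` nesting is an assumption:

```bash
# Show the rope_scaling block that vLLM 0.4.3 trips over; expect a "rope_type"
# key ("llama3") and no "type" key, which explains the KeyError above.
python - <<'EOF'
import json
cfg = json.load(open("/home/Llama-3.2/Llama-3.2-11B-Vision-Instruct/config.json"))
text_cfg = cfg.get("text_config", cfg)  # mllama nests the language-model config
print(text_cfg.get("rope_scaling"))
EOF
```
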
hiyouga commented 1 month ago

@caijx168 update transformers

caijx168 commented 1 month ago

> @caijx168 update transformers

After upgrading, I get the following error:

```
(base) root@root1-System-Product-Name:/home/LLaMA-Factory/LLaMA-Factory-feat-llama3.2vl# tail -f nohup.out
  File "/root/anaconda3/envs/LLaMA-Factory-feat-llama3.2vl/lib/python3.11/site-packages/vllm/engine/arg_utils.py", line 559, in create_engine_config
    model_config = ModelConfig(
                   ^^^^^^^^^^^^
  File "/root/anaconda3/envs/LLaMA-Factory-feat-llama3.2vl/lib/python3.11/site-packages/vllm/config.py", line 133, in __init__
    self.max_model_len = _get_and_verify_max_len(
                         ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/anaconda3/envs/LLaMA-Factory-feat-llama3.2vl/lib/python3.11/site-packages/vllm/config.py", line 1216, in _get_and_verify_max_len
    if rope_scaling is not None and rope_scaling["type"] != "su":
KeyError: 'type'
Traceback (most recent call last):
  File "/home/LLaMA-Factory/LLaMA-Factory-feat-llama3.2vl/src/api.py", line 19, in <module>
    from llamafactory.api.app import create_app
  File "/home/LLaMA-Factory/LLaMA-Factory-feat-llama3.2vl/src/llamafactory/api/app.py", line 23, in <module>
    from ..chat import ChatModel
  File "/home/LLaMA-Factory/LLaMA-Factory-feat-llama3.2vl/src/llamafactory/chat/__init__.py", line 16, in <module>
    from .chat_model import ChatModel
  File "/home/LLaMA-Factory/LLaMA-Factory-feat-llama3.2vl/src/llamafactory/chat/chat_model.py", line 24, in <module>
    from ..hparams import get_infer_args
  File "/home/LLaMA-Factory/LLaMA-Factory-feat-llama3.2vl/src/llamafactory/hparams/__init__.py", line 20, in <module>
    from .parser import get_eval_args, get_infer_args, get_train_args
  File "/home/LLaMA-Factory/LLaMA-Factory-feat-llama3.2vl/src/llamafactory/hparams/parser.py", line 45, in <module>
    check_dependencies()
  File "/home/LLaMA-Factory/LLaMA-Factory-feat-llama3.2vl/src/llamafactory/extras/misc.py", line 82, in check_dependencies
    require_version("transformers>=4.41.2,<=4.45.0", "To fix: pip install transformers>=4.41.2,<=4.45.0")
  File "/root/anaconda3/envs/LLaMA-Factory-feat-llama3.2vl/lib/python3.11/site-packages/transformers/utils/versions.py", line 111, in require_version
    _compare_versions(op, got_ver, want_ver, requirement, pkg, hint)
  File "/root/anaconda3/envs/LLaMA-Factory-feat-llama3.2vl/lib/python3.11/site-packages/transformers/utils/versions.py", line 44, in _compare_versions
    raise ImportError(
ImportError: transformers>=4.41.2,<=4.45.0 is required for a normal functioning of this module, but found transformers==4.45.2.
To fix: pip install transformers>=4.41.2,<=4.45.0
```

hiyouga commented 1 month ago

@caijx168 update llamafactory

caijx168 commented 1 month ago

> @caijx168 update llamafactory

The branch I downloaded is already the latest one: https://github.com/marko1616/LLaMA-Factory/tree/feat/llama3.2vl

marko1616 commented 1 month ago

> The branch I downloaded is already the latest one: https://github.com/marko1616/LLaMA-Factory/tree/feat/llama3.2vl

Edit the code and change the transformers version constraint in `src/llamafactory/extras/misc.py`, or downgrade your transformers version.
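
Both options as concrete commands, as a sketch: the upper bound shown in the traceback above (`<=4.45.0`) is assumed here, and the exact string in misc.py may differ by commit.

```bash
# Option 1: relax the upper bound in the branch's dependency check.
sed -i 's/<=4.45.0/<=4.45.2/g' src/llamafactory/extras/misc.py
# Option 2: pin transformers to a version the existing check already accepts.
pip install "transformers==4.45.0"
```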

caijx168 commented 1 month ago

> Edit the code and change the transformers version constraint in `src/llamafactory/extras/misc.py`, or downgrade your transformers version.

After downgrading transformers to 4.45.0, running it gives this error:

```
  File "/home/LLaMA-Factory/LLaMA-Factory-feat-llama3.2vl/src/api.py", line 33, in <module>
    main()
  File "/home/LLaMA-Factory/LLaMA-Factory-feat-llama3.2vl/src/api.py", line 24, in main
    chat_model = ChatModel()
                 ^^^^^^^^^^^
  File "/home/LLaMA-Factory/LLaMA-Factory-feat-llama3.2vl/src/llamafactory/chat/chat_model.py", line 54, in __init__
    self.engine: "BaseEngine" = VllmEngine(model_args, data_args, finetuning_args, generating_args)
                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/LLaMA-Factory/LLaMA-Factory-feat-llama3.2vl/src/llamafactory/chat/vllm_engine.py", line 93, in __init__
    self.model = AsyncLLMEngine.from_engine_args(AsyncEngineArgs(**engine_args))
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/anaconda3/envs/LLaMA-Factory-feat-llama3.2vl/lib/python3.11/site-packages/vllm/engine/async_llm_engine.py", line 362, in from_engine_args
    engine_config = engine_args.create_engine_config()
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/anaconda3/envs/LLaMA-Factory-feat-llama3.2vl/lib/python3.11/site-packages/vllm/engine/arg_utils.py", line 559, in create_engine_config
    model_config = ModelConfig(
                   ^^^^^^^^^^^^
  File "/root/anaconda3/envs/LLaMA-Factory-feat-llama3.2vl/lib/python3.11/site-packages/vllm/config.py", line 133, in __init__
    self.max_model_len = _get_and_verify_max_len(
                         ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/anaconda3/envs/LLaMA-Factory-feat-llama3.2vl/lib/python3.11/site-packages/vllm/config.py", line 1216, in _get_and_verify_max_len
    if rope_scaling is not None and rope_scaling["type"] != "su":
KeyError: 'type'
```

marko1616 commented 1 month ago

The vLLM backend isn't supported for this model.

caijx168 commented 1 month ago

> The vLLM backend isn't supported for this model.

Then how should I deploy the OpenAI-compatible API?

caijx168 commented 1 month ago

I have always deployed the OpenAI-compatible API with vLLM like this: `API_PORT=8000 llamafactory-cli api examples/inference/llama3_vllm.yaml`

Slipstream-Max commented 2 weeks ago

Just remove the vLLM settings, e.g. something like:

```yaml
model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct
template: llama3
```
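
A sketch of what that could look like end to end; the file name and its exact contents are illustrative, not an official example:

```bash
# Hypothetical config mirroring the snippet above, but pointing at the local model.
# With no vllm options, LLaMA-Factory falls back to the default Hugging Face
# inference backend instead of vLLM.
cat > llama3_2_vision.yaml <<'EOF'
model_name_or_path: /home/Llama-3.2/Llama-3.2-11B-Vision-Instruct
template: llama3
EOF
API_PORT=8005 llamafactory-cli api llama3_2_vision.yaml
```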