InternLM / lmdeploy

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
https://lmdeploy.readthedocs.io/en/latest/
Apache License 2.0

[Bug] Error deploying HuggingFace model llava-v1.6-mistral-7b with lmdeploy: Unrecognized model type llava_mistral #1573

Closed: enogalaxies closed this issue 3 months ago

enogalaxies commented 3 months ago


Describe the bug

When attempting to deploy the llava-v1.6-mistral-7b model from HuggingFace using lmdeploy, I encountered an error indicating that the model type llava_mistral is not recognized by the Transformers library. This occurred despite the lmdeploy documentation stating that any model in HuggingFace format should be supported for inference. The same process worked previously for a model of type vicuna, but it fails for mistral. Local inference with the model works correctly.

Reproduction

Attempt to deploy the llava-v1.6-mistral-7b model using lmdeploy. Observe the error message in the logs.
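Concretely, a launch along these lines triggers the failure (the exact command, with a local model path, appears later in this thread):

lmdeploy serve api_server liuhaotian/llava-v1.6-mistral-7b --server-port 3333 --model-name mistral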

Environment

sys.platform: linux
Python: 3.10.10 (main, Mar 21 2023, 18:45:11) [GCC 11.2.0]
CUDA available: True
MUSA available: False
numpy_random_seed: 2147483648
GPU 0,1: Tesla T4
CUDA_HOME: /usr/local/cuda-11.7:/usr/local/cuda-11.6
GCC: gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
PyTorch: 2.1.2+cu121
PyTorch compiling details: PyTorch built with:
  - GCC 9.3
  - C++ Version: 201703
  - Intel(R) oneAPI Math Kernel Library Version 2022.2-Product Build 20220804 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v3.1.1 (Git Hash 64f6bcbcbab628e96f33a62c3e975f8535a7bde4)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: AVX512
  - CUDA Runtime 12.1
  - NVCC architecture flags: -gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_90,code=sm_90
  - CuDNN 8.6  (built against CUDA 11.8)
    - Built with CuDNN 8.9.2
  - Magma 2.6.1
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=12.1, CUDNN_VERSION=8.9.2, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=0 -fabi-version=11 -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=old-style-cast -Wno-invalid-partial-specialization -Wno-unused-private-field -Wno-aligned-allocation-unavailable -Wno-missing-braces -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_DISABLE_GPU_ASSERTS=ON, TORCH_VERSION=2.1.2, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=1, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF,

TorchVision: 0.16.2+cu121
LMDeploy: 0.3.0+
transformers: 4.40.1
gradio: 4.16.0
fastapi: 0.110.1
pydantic: 2.6.4
triton: 2.1.0

Error traceback

2024-05-09 11:58:07,078 - lmdeploy - ERROR - ValueError: The checkpoint you are trying to load has model type `llava_mistral` but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date.
2024-05-09 11:58:07,078 - lmdeploy - ERROR - <transformers> test failed!
Load model config with transformers==4.40.1 failed. Please make sure model can be loaded with transformers API.
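The failing check can be reproduced standalone (a minimal sketch of what the log describes, not lmdeploy's exact code): transformers 4.40.1 has no built-in `llava_mistral` model type, so loading the config raises the ValueError quoted above.

```python
# Minimal standalone reproduction (a sketch of the check the log
# describes, not lmdeploy's exact code): with transformers==4.40.1,
# AutoConfig does not know the custom `llava_mistral` model type and
# raises the "Transformers does not recognize this architecture" ValueError.
from transformers import AutoConfig

config = AutoConfig.from_pretrained('liuhaotian/llava-v1.6-mistral-7b')
```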
irexyc commented 3 months ago

Currently, only the turbomind backend supports vision-language models, and the turbomind backend doesn't support the MoE architecture, so you can't deploy the mistral model.

https://github.com/InternLM/lmdeploy/blob/main/docs/en/supported_models/supported_models.md#models-supported-by-turbomind

enogalaxies commented 3 months ago

Sorry, we are not using an MoE-architecture Mistral; we are using llava-v1.6-mistral-7b from Hugging Face: https://huggingface.co/liuhaotian/llava-v1.6-mistral-7b

enogalaxies commented 3 months ago

When I use the pytorch backend, the same error occurs, as shown below:

lmdeploy serve api_server /data/kai.qiao/model_repo/llava/liuhaotian/llava-v1.6-mistral-7b --server-port 3333 --backend pytorch --model-name mistral

ValueError: The checkpoint you are trying to load has model type `llava_mistral` but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date.
2024-05-10 17:58:49,668 - lmdeploy - ERROR - <transformers> test failed!
Load model config with transformers==4.40.1 failed. Please make sure model can be loaded with transformers API.

irexyc commented 3 months ago

Currently, we only support llava-llama. To support llava-mistral, LlavaMistralConfig and LlavaMistralForCausalLM should be used here, but I am not sure if that modification is enough (even together with the changes in https://github.com/InternLM/lmdeploy/pull/1579).
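For illustration, a minimal sketch of that substitution (class paths assumed from the upstream haotian-liu/LLaVA repo; this is not lmdeploy's actual code):

```python
# Sketch only: register LLaVA-Mistral's custom classes with transformers'
# Auto* machinery so checkpoints whose config declares
# `model_type: llava_mistral` become loadable. Class paths are assumed
# from the upstream LLaVA repo, not taken from lmdeploy.
from transformers import AutoConfig, AutoModelForCausalLM
from llava.model.language_model.llava_mistral import (
    LlavaMistralConfig,
    LlavaMistralForCausalLM,
)

AutoConfig.register('llava_mistral', LlavaMistralConfig)
AutoModelForCausalLM.register(LlavaMistralConfig, LlavaMistralForCausalLM)
```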

We will check and support the model later, provided it doesn't use MoE or sliding-window attention, which the turbomind backend doesn't currently support.

enogalaxies commented 3 months ago

Thank you very much for your patience

lvhan028 commented 3 months ago

#1579 works for llava_mistral now. You may give it a try.

enogalaxies commented 3 months ago

I compiled from source, but execution always fails with the error below:

python mistra_7b.py
Fetching 15 files: 100%|██████████| 15/15 [00:00<00:00, 194180.74it/s]
2024-05-13 18:55:03,879 - lmdeploy - INFO - Using turbomind engine
Traceback (most recent call last):
  File "/home/kai.qiao/.local/lib/python3.8/site-packages/lmdeploy/vl/model/llava.py", line 21, in check_llava_install
    import llava  # noqa: F401
  File "/home/kai.qiao/.local/lib/python3.8/site-packages/llava/__init__.py", line 1, in <module>
    from .model import LlavaLlamaForCausalLM
ImportError: cannot import name 'LlavaLlamaForCausalLM' from 'llava.model' (/home/kai.qiao/.local/lib/python3.8/site-packages/llava/model/__init__.py)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "mistra_7b.py", line 4, in <module>
    pipe = pipeline('liuhaotian/llava-v1.6-mistral-7b', model_name='llava_v1.6_mistral_7b',
  File "/home/kai.qiao/.local/lib/python3.8/site-packages/lmdeploy/api.py", line 94, in pipeline
    return pipeline_class(model_path,
  File "/home/kai.qiao/.local/lib/python3.8/site-packages/lmdeploy/serve/vl_async_engine.py", line 16, in __init__
    self.vl_encoder = ImageEncoder(model_path)
  File "/home/kai.qiao/.local/lib/python3.8/site-packages/lmdeploy/vl/engine.py", line 68, in __init__
    self.model = load_vl_model(model_path)
  File "/home/kai.qiao/.local/lib/python3.8/site-packages/lmdeploy/vl/model/builder.py", line 36, in load_vl_model
    return LlavaVisionModel(model_path, arch=arch)
  File "/home/kai.qiao/.local/lib/python3.8/site-packages/lmdeploy/vl/model/llava.py", line 66, in __init__
    self.build_model()
  File "/home/kai.qiao/.local/lib/python3.8/site-packages/lmdeploy/vl/model/llava.py", line 71, in build_model
    check_llava_install()
  File "/home/kai.qiao/.local/lib/python3.8/site-packages/lmdeploy/vl/model/llava.py", line 23, in check_llava_install
    raise ImportError(
ImportError: To use LlavaVLModel, please install llava by pip install git+https://github.com/haotian-liu/LLaVA.git --no-deps
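Per the traceback, mistra_7b.py is essentially lmdeploy's standard VLM pipeline usage; a hypothetical reconstruction (the prompt and image URL are illustrative, not from the thread):

```python
# Hypothetical reconstruction of mistra_7b.py from the traceback above;
# the prompt text and image URL are illustrative.
from lmdeploy import pipeline
from lmdeploy.vl import load_image

pipe = pipeline('liuhaotian/llava-v1.6-mistral-7b',
                model_name='llava_v1.6_mistral_7b')
image = load_image('https://example.com/some_image.jpg')
response = pipe(('describe this image', image))
print(response)
```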

lvhan028 commented 3 months ago

"ImportError: To use LlavaVLModel, please install llava by pip install git+https://github.com/haotian-liu/LLaVA.git --no-deps"

As guided by the log, please install llava:

pip install git+https://github.com/haotian-liu/LLaVA.git --no-deps

enogalaxies commented 3 months ago

After installing it, the error is the same.
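A plausible diagnostic for this situation (hypothetical, not from the thread): the traceback above shows llava itself failing to re-export LlavaLlamaForCausalLM, which usually points to a version mismatch between the installed llava and transformers packages rather than a missing install. Importing the class from its defining module surfaces the real error:

```python
# Hypothetical diagnostic, not from the thread. llava/model/__init__.py
# appears to swallow the underlying import error (hence the bare
# "cannot import name" above); importing from the defining module shows
# the real failure. Note llava was installed with --no-deps, so its
# transformers version pin is not enforced.
from llava.model.language_model.llava_llama import LlavaLlamaForCausalLM
```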

github-actions[bot] commented 3 months ago

This issue is marked as stale because it has been marked as invalid or awaiting response for 7 days without any further response. It will be closed in 5 days if the stale label is not removed or if there is no further response.

github-actions[bot] commented 3 months ago

This issue is closed because it has been stale for 5 days. Please open a new issue if you have similar issues or you have any new updates now.