jianliao closed this issue 3 months ago.
The related PRs #1984 and #1913 haven't been merged yet.
What is your lmdeploy version? @jianliao The latest lmdeploy can run the model with the default turbomind backend.
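Upgrading is a one-liner with pip (assuming a pip-managed install, which matches the pip show output below):
> pip install -U lmdeploy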
@AllentDan I upgraded to the latest version (0.5.2.post1), but I am still encountering the same error with the following command:
lmdeploy serve api_server OpenGVLab/InternVL2-2B-AWQ --model-name InternVL2-2B-AWQ
Here are the details of my lmdeploy version:
(lmdeploy) jianliao@jianliao-ubuntu:~$ pip show lmdeploy
Name: lmdeploy
Version: 0.5.2.post1
Summary: A toolset for compressing, deploying and serving LLM
Home-page:
Author: OpenMMLab
Author-email: openmmlab@gmail.com
License:
Location: /home/jianliao/anaconda3/envs/lmdeploy/lib/python3.12/site-packages
Requires: accelerate, einops, fastapi, fire, mmengine-lite, numpy, nvidia-cublas-cu12, nvidia-cuda-runtime-cu12, nvidia-curand-cu12, nvidia-nccl-cu12, peft, pillow, protobuf, pydantic, pynvml, safetensors, sentencepiece, shortuuid, tiktoken, torch, torchvision, transformers, triton, uvicorn
Required-by:
Can you try adding --model-format awq?
@AllentDan @lvhan028 The issue was resolved after adding the --model-format awq option. Thanks, bro.
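For reference, the full working command combines the original invocation with the suggested flag:
> lmdeploy serve api_server OpenGVLab/InternVL2-2B-AWQ --model-name InternVL2-2B-AWQ --model-format awq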
Describe the bug
> lmdeploy serve api_server OpenGVLab/InternVL2-2B-AWQ --model-name InternVL2-2B-AWQ
If I switch the backend, it runs, but it outputs a huge volume of logs; see the attached bug.log for details.
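If the pytorch backend floods the console, one thing worth trying is raising the log threshold (a sketch assuming lmdeploy's --log-level option; check lmdeploy serve api_server --help for the exact choices):
> lmdeploy serve api_server OpenGVLab/InternVL2-2B-AWQ --model-name InternVL2-2B-AWQ --backend pytorch --log-level ERROR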
Reproduction
> lmdeploy serve api_server OpenGVLab/InternVL2-2B-AWQ --model-name InternVL2-2B-AWQ
or
> lmdeploy serve api_server OpenGVLab/InternVL2-2B-AWQ --model-name InternVL2-2B-AWQ --backend pytorch
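Once the server is up, a quick way to verify it responds is to hit its OpenAI-compatible endpoints (a minimal check, assuming the default port 23333; adjust if you set --server-port):
> curl http://localhost:23333/v1/models
> curl http://localhost:23333/v1/chat/completions -H "Content-Type: application/json" -d '{"model": "InternVL2-2B-AWQ", "messages": [{"role": "user", "content": "Hello"}]}'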