Describe the bug
I finetuned the Qwen/Qwen2-VL-2B-Instruct-GPTQ-Int4 model with `swift sft`, and the resulting checkpoint can be loaded for inference with `swift infer` without issue. However, loading the model with PEFT raises the following error:
```
ValueError: Target module Qwen2VLModel(
  (embed_tokens): Embedding(151936, 1536)
  (layers): ModuleList(
    (0-27): 28 x Qwen2VLDecoderLayer(
      (self_attn): Qwen2VLSdpaAttention(
        (rotary_emb): Qwen2VLRotaryEmbedding()
        (k_proj): QuantLinear()
        (o_proj): QuantLinear()
        (q_proj): QuantLinear()
        (v_proj): QuantLinear()
      )
      (mlp): Qwen2MLP(
        (act_fn): SiLU()
        (down_proj): QuantLinear()
        (gate_proj): QuantLinear()
        (up_proj): QuantLinear()
      )
      (input_layernorm): Qwen2RMSNorm((1536,), eps=1e-06)
      (post_attention_layernorm): Qwen2RMSNorm((1536,), eps=1e-06)
    )
  )
  (norm): Qwen2RMSNorm((1536,), eps=1e-06)
  (rotary_emb): Qwen2VLRotaryEmbedding()
) is not supported. Currently, only the following modules are supported: `torch.nn.Linear`, `torch.nn.Embedding`, `torch.nn.Conv2d`, `transformers.pytorch_utils.Conv1D`.
```
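For reference, this is roughly the PEFT-only loading code that triggers the error; the checkpoint directory is a placeholder for my actual `swift sft` output path:

```python
from transformers import Qwen2VLForConditionalGeneration
from peft import PeftModel

# Load the GPTQ-Int4 base model (the quantized kernels are picked up
# automatically, provided the GPTQ backend is installed).
base = Qwen2VLForConditionalGeneration.from_pretrained(
    "Qwen/Qwen2-VL-2B-Instruct-GPTQ-Int4",
    device_map="auto",
    torch_dtype="auto",
)

# Attach the LoRA adapter saved by `swift sft`; the path below is a
# placeholder for my actual output directory.
model = PeftModel.from_pretrained(
    base, "output/qwen2-vl-2b-instruct-gptq-int4/vx-xxx/checkpoint-xxx"
)
# -> ValueError: Target module Qwen2VLModel(...) is not supported.
```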
In the meantime, `swift export` with `merge_lora` enabled does not help either, since `merge_lora` cannot be used when the base model is quantized. Is there any way to load the finetuned, quantized model for inference without the swift library (e.g., using peft alone)?
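For completeness, the export invocation I tried was roughly the following (checkpoint path again a placeholder); it refuses to merge because the base model is GPTQ-quantized:

```bash
swift export \
    --ckpt_dir output/qwen2-vl-2b-instruct-gptq-int4/vx-xxx/checkpoint-xxx \
    --merge_lora true
```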
Your hardware and system info
CUDA version: 12.1
GPU: 2x RTX 4090