InternLM / lmdeploy

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
https://lmdeploy.readthedocs.io/en/latest/
Apache License 2.0

[Bug] Error when quantizing minicpm-v-2.6 #2643

Closed sph116 closed 1 month ago

sph116 commented 1 month ago

Checklist

Describe the bug

An error occurs when quantizing MiniCPM-V-2_6. The error message is as follows:

```
Move model.layers.0 to CPU.
Move model.layers.1 to CPU.
Move model.layers.2 to CPU.
Move model.layers.3 to CPU.
Move model.layers.4 to CPU.
Move model.layers.5 to CPU.
Move model.layers.6 to CPU.
Move model.layers.7 to CPU.
Move model.layers.8 to CPU.
Move model.layers.9 to CPU.
Move model.layers.10 to CPU.
Move model.layers.11 to CPU.
Move model.layers.12 to CPU.
Move model.layers.13 to CPU.
Move model.layers.14 to CPU.
Move model.layers.15 to CPU.
Move model.layers.16 to CPU.
Move model.layers.17 to CPU.
Move model.layers.18 to CPU.
Move model.layers.19 to CPU.
Move model.layers.20 to CPU.
Move model.layers.21 to CPU.
Move model.layers.22 to CPU.
Move model.layers.23 to CPU.
Move model.layers.24 to CPU.
Move model.layers.25 to CPU.
Move model.layers.26 to CPU.
Move model.layers.27 to CPU.
Move model.norm to GPU.
Move model.rotary_emb to GPU.
Move lm_head to CPU.
Loading calibrate dataset ...
Traceback (most recent call last):
  File "/opt/py3/bin/lmdeploy", line 33, in <module>
    sys.exit(load_entry_point('lmdeploy', 'console_scripts', 'lmdeploy')())
  File "/workdir/lmdeploy-main/lmdeploy/cli/entrypoint.py", line 42, in run
    args.run(args)
  File "/workdir/lmdeploy-main/lmdeploy/cli/lite.py", line 131, in auto_awq
    auto_awq(**kwargs)
  File "/workdir/lmdeploy-main/lmdeploy/lite/apis/auto_awq.py", line 90, in auto_awq
    vl_model, model, tokenizer, work_dir = calibrate(model,
  File "/workdir/lmdeploy-main/lmdeploy/lite/apis/calibrate.py", line 235, in calibrate
    calibloader, = get_calib_loaders(calib_dataset,
  File "/workdir/lmdeploy-main/lmdeploy/lite/utils/calib_dataloader.py", line 323, in get_calib_loaders
    return get_ptb(tokenizer, nsamples, seed, seqlen)
  File "/workdir/lmdeploy-main/lmdeploy/lite/utils/calib_dataloader.py", line 64, in get_ptb
    traindata = load_dataset('ptb_text_only',
  File "/opt/py3/lib/python3.10/site-packages/datasets/load.py", line 2074, in load_dataset
    builder_instance = load_dataset_builder(
  File "/opt/py3/lib/python3.10/site-packages/datasets/load.py", line 1832, in load_dataset_builder
    builder_instance: DatasetBuilder = builder_cls(
TypeError: 'NoneType' object is not callable
```
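For context on the cryptic final line: inside `datasets.load.load_dataset_builder`, presumably the builder class resolution for `ptb_text_only` fails (e.g. its loading script cannot be fetched through the mirror), leaving `builder_cls` as `None`, which is then called. A minimal sketch of that failure mode:

```python
# Sketch of the failure inside datasets' load_dataset_builder (assumption,
# reconstructed from the traceback above, not from the library source):
# the dataset's builder class could not be resolved, so builder_cls is None,
# and calling it raises the TypeError reported in the issue.
builder_cls = None  # what a failed script/builder resolution yields

try:
    builder_instance = builder_cls()  # mirrors: DatasetBuilder = builder_cls(...)
except TypeError as exc:
    print(exc)  # 'NoneType' object is not callable
```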

Reproduction

```
export HF_ENDPOINT=https://hf-mirror.com
lmdeploy lite auto_awq /workdir/rag_doc_parser/models/MiniCPM-V-2_6 --work-dir /workdir/rag_doc_parser/models/MiniCPM-V-2_6-lmdeploy-awq-int4
```

Environment

Official Docker image

Error traceback

No response

sph116 commented 1 month ago

I have already tried downloading the mit-han-lab/pile-val-backup dataset separately, and it loads successfully, but quantization still fails.

```
root@user-NF5288M5:/workdir/lmdeploy-main# python
Python 3.10.12 (main, Sep 11 2024, 15:47:36) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.environ["HF_ENDPOINT"] = "https://hf-mirror.com"
>>> from datasets import load_dataset
>>> ds = load_dataset("mit-han-lab/pile-val-backup")
Repo card metadata block was not found. Setting CardData to empty.
Generating validation split: 214670 examples [00:37, 5766.99 examples/s]
```

AllentDan commented 1 month ago

https://github.com/InternLM/lmdeploy/blob/v0.6.1/lmdeploy/lite/utils/calib_dataloader.py#L270 — replace this with your local path, and specify the pileval dataset when you run the quantization.
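The `lmdeploy lite auto_awq` CLI exposes a `--calib-dataset` option for choosing the calibration set; assuming that flag, the reporter's command with pileval selected explicitly would look something like:

```shell
# After pointing get_pileval at your local copy in calib_dataloader.py,
# select pileval instead of the default ptb calibration set:
lmdeploy lite auto_awq /workdir/rag_doc_parser/models/MiniCPM-V-2_6 \
  --calib-dataset pileval \
  --work-dir /workdir/rag_doc_parser/models/MiniCPM-V-2_6-lmdeploy-awq-int4
```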

sph116 commented 1 month ago

> https://github.com/InternLM/lmdeploy/blob/v0.6.1/lmdeploy/lite/utils/calib_dataloader.py#L270 — replace this with your local path, and specify the pileval dataset when you run the quantization.

(two screenshots of the modified dataset-loading code)

I have modified both places where the dataset is loaded. The error has changed:

```
Traceback (most recent call last):
  File "/opt/py3/bin/lmdeploy", line 33, in <module>
    sys.exit(load_entry_point('lmdeploy', 'console_scripts', 'lmdeploy')())
  File "/workdir/lmdeploy-main/lmdeploy/cli/entrypoint.py", line 42, in run
    args.run(args)
  File "/workdir/lmdeploy-main/lmdeploy/cli/lite.py", line 131, in auto_awq
    auto_awq(**kwargs)
  File "/workdir/lmdeploy-main/lmdeploy/lite/apis/auto_awq.py", line 90, in auto_awq
    vl_model, model, tokenizer, work_dir = calibrate(model,
  File "/workdir/lmdeploy-main/lmdeploy/lite/apis/calibrate.py", line 235, in calibrate
    calibloader, = get_calib_loaders(calib_dataset,
  File "/workdir/lmdeploy-main/lmdeploy/lite/utils/calib_dataloader.py", line 323, in get_calib_loaders
    return get_ptb(tokenizer, nsamples, seed, seqlen)
  File "/workdir/lmdeploy-main/lmdeploy/lite/utils/calib_dataloader.py", line 64, in get_ptb
    traindata = load_dataset('/root/.cache/huggingface/hub/datasets--ptb_text_only',
  File "/opt/py3/lib/python3.10/site-packages/datasets/load.py", line 2074, in load_dataset
    builder_instance = load_dataset_builder(
  File "/opt/py3/lib/python3.10/site-packages/datasets/load.py", line 1795, in load_dataset_builder
    dataset_module = dataset_module_factory(
  File "/opt/py3/lib/python3.10/site-packages/datasets/load.py", line 1573, in dataset_module_factory
    ).get_module()
  File "/opt/py3/lib/python3.10/site-packages/datasets/load.py", line 833, in get_module
    module_name, default_builder_kwargs = infer_module_for_data_files(
  File "/opt/py3/lib/python3.10/site-packages/datasets/load.py", line 594, in infer_module_for_data_files
    raise DataFilesNotFoundError("No (supported) data files found" + (f" in {path}" if path else ""))
datasets.exceptions.DataFilesNotFoundError: No (supported) data files found in /root/.cache/huggingface/hub/datasets--ptb_text_only
```

AllentDan commented 1 month ago

> raise DataFilesNotFoundError("No (supported) data files found" + (f" in {path}" if path else ""))
> datasets.exceptions.DataFilesNotFoundError: No (supported) data files found in /root/.cache/huggingface/hub/datasets--ptb_text_only

The API did not pick up the data. The file itself may be corrupted, or it may be a path problem. You can try verifying it yourself with the datasets API.
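One offline way to check exactly what `DataFilesNotFoundError` is complaining about: walk the local path and see whether any file carries an extension the generic builders can ingest. This is an illustrative sketch (the extension list is an assumption, not the library's exact set):

```python
import os

# Illustrative subset of extensions the datasets generic builders accept;
# the real supported set lives inside the datasets library.
SUPPORTED = {".json", ".jsonl", ".csv", ".parquet", ".txt"}

def has_supported_files(root: str) -> bool:
    """Return True if any file under `root` looks like loadable data."""
    for dirpath, _dirs, files in os.walk(root):
        if any(os.path.splitext(f)[1] in SUPPORTED for f in files):
            return True
    return False

# e.g. has_supported_files("/root/.cache/huggingface/hub/datasets--ptb_text_only")
```

If this returns False, the directory holds only cache metadata (or a loading script), not data files, which matches the error in the traceback.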

sph116 commented 1 month ago

> raise DataFilesNotFoundError("No (supported) data files found" + (f" in {path}" if path else ""))
> datasets.exceptions.DataFilesNotFoundError: No (supported) data files found in /root/.cache/huggingface/hub/datasets--ptb_text_only
>
> The API did not pick up the data. The file itself may be corrupted, or it may be a path problem. You can try verifying it yourself with the datasets API.

The problem is solved. This dataset only works when downloaded over a direct connection to Hugging Face. Thanks for the reply.

AllentDan commented 1 month ago

OK, closing.