OpenBMB / MiniCPM-V

MiniCPM-Llama3-V 2.5: A GPT-4V Level Multimodal LLM on Your Phone

Does the MiniCPM-Llama3-V 2.5 int4 version support fine-tuning? #266

Open myBigbug opened 3 weeks ago

myBigbug commented 3 weeks ago

Is there an existing issue / discussion for this?

Is there an existing answer for this in the FAQ?

Current Behavior

MiniCPM-Llama3-V 2.5 supports fine-tuning, but my GPU only has 24 GB of memory, which is not enough. So does MiniCPM-Llama3-V 2.5 int4 support fine-tuning? When I try to fine-tune it I get this error: ValueError: You cannot perform fine-tuning on purely quantized models. Please attach trainable adapters on top of the quantized model to correctly perform fine-tuning. Please see: https://huggingface.co/docs/transformers/peft for more details
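If I read the error right, it wants trainable adapters attached on top of the frozen int4 weights rather than tuning the quantized weights directly. A minimal sketch of that with PEFT (the target_modules names here are a guess and would need to be checked against the model's actual module names):

```python
import torch
from transformers import AutoModel
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Load the pre-quantized int4 checkpoint; its base weights stay frozen.
model = AutoModel.from_pretrained(
    "openbmb/MiniCPM-Llama3-V-2_5-int4",
    trust_remote_code=True,
    torch_dtype=torch.float16,
)

# Standard prep for training on top of k-bit weights (norm casting,
# gradient checkpointing hooks).
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # hypothetical: verify against this model
    task_type="CAUSAL_LM",
)

# Only the small LoRA matrices receive gradients; the int4 weights do not.
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```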

Expected Behavior

No response

Steps To Reproduce

No response

Environment

- OS: CentOS 7
- Python: 3.10
- Transformers: 4.40.0
- PyTorch: 2.1.2
- CUDA (`python -c 'import torch; print(torch.version.cuda)'`): 12.1

pip packages
accelerate 0.30.1
addict 2.4.0
aiofiles 23.2.1
altair 5.3.0
annotated-types 0.7.0
anyio 4.4.0
attrs 23.2.0
bitsandbytes 0.42.0
blis 0.7.11
catalogue 2.0.10
certifi 2024.6.2
charset-normalizer 3.3.2
click 8.1.7
cloudpathlib 0.16.0
colorama 0.4.6
confection 0.1.5
contourpy 1.2.1
cycler 0.12.1
cymem 2.0.8
editdistance 0.6.2
einops 0.7.0
et-xmlfile 1.1.0
exceptiongroup 1.2.1
fairscale 0.4.0
fastapi 0.110.3
ffmpy 0.3.2
filelock 3.14.0
fonttools 4.53.0
fsspec 2024.6.0
gradio 4.26.0
gradio_client 0.15.1
h11 0.14.0
httpcore 1.0.5
httpx 0.27.0
huggingface-hub 0.23.3
idna 3.7
importlib_resources 6.4.0
Jinja2 3.1.4
joblib 1.4.2
jsonlines 4.0.0
jsonschema 4.22.0
jsonschema-specifications 2023.12.1
kiwisolver 1.4.5
langcodes 3.4.0
language_data 1.2.0
lxml 5.2.2
marisa-trie 1.2.0
markdown-it-py 3.0.0
markdown2 2.4.10
MarkupSafe 2.1.5
matplotlib 3.7.4
mdurl 0.1.2
more-itertools 10.1.0
mpmath 1.3.0
murmurhash 1.0.10
networkx 3.3
nltk 3.8.1
numpy 1.24.4
nvidia-cublas-cu12 12.1.3.1
nvidia-cuda-cupti-cu12 12.1.105
nvidia-cuda-nvrtc-cu12 12.1.105
nvidia-cuda-runtime-cu12 12.1.105
nvidia-cudnn-cu12 8.9.2.26
nvidia-cufft-cu12 11.0.2.54
nvidia-curand-cu12 10.3.2.106
nvidia-cusolver-cu12 11.4.5.107
nvidia-cusparse-cu12 12.1.0.106
nvidia-nccl-cu12 2.18.1
nvidia-nvjitlink-cu12 12.5.40
nvidia-nvtx-cu12 12.1.105
opencv-python-headless 4.5.5.64
openpyxl 3.1.2
orjson 3.10.4
packaging 23.2
pandas 2.2.2
Pillow 10.1.0
pip 24.0
portalocker 2.8.2
preshed 3.0.9
protobuf 4.25.0
psutil 5.9.8
pydantic 2.7.3
pydantic_core 2.18.4
pydub 0.25.1
Pygments 2.18.0
pyparsing 3.1.2
python-dateutil 2.9.0.post0
python-multipart 0.0.9
pytz 2024.1
PyYAML 6.0.1
referencing 0.35.1
regex 2024.5.15
requests 2.32.3
rich 13.7.1
rpds-py 0.18.1
ruff 0.4.8
sacrebleu 2.3.2
safetensors 0.4.3
scipy 1.13.1
seaborn 0.13.0
semantic-version 2.10.0
sentencepiece 0.1.99
setuptools 70.0.0
shellingham 1.5.4
shortuuid 1.0.11
six 1.16.0
smart-open 6.4.0
sniffio 1.3.1
socksio 1.0.0
spacy 3.7.2
spacy-legacy 3.0.12
spacy-loggers 1.0.5
srsly 2.4.8
starlette 0.37.2
sympy 1.12.1
tabulate 0.9.0
thinc 8.2.4
timm 0.9.10
tokenizers 0.19.1
tomlkit 0.12.0
toolz 0.12.1
torch 2.1.2
torchvision 0.16.2
tqdm 4.66.1
transformers 4.40.0
triton 2.1.0
typer 0.9.4
typing_extensions 4.8.0
tzdata 2024.1
urllib3 2.2.1
uvicorn 0.24.0.post1
wasabi 1.1.3
weasel 0.3.4
websockets 11.0.3
wheel 0.43.0

Anything else?

No response

whyiug commented 3 weeks ago

Another question: can you (I mean the authors) share the quantization script? We need it to quantize the model after SFT.
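In the meantime, something like the stock bitsandbytes 4-bit path might work, though I don't know whether it matches how the authors produced the released int4 weights (the checkpoint path below is a placeholder):

```python
import torch
from transformers import AutoModel, BitsAndBytesConfig

# 4-bit NF4 quantization via bitsandbytes.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

# Reload the fine-tuned fp16 checkpoint with on-the-fly 4-bit quantization.
model = AutoModel.from_pretrained(
    "path/to/sft-checkpoint",  # placeholder for your SFT output dir
    trust_remote_code=True,
    quantization_config=bnb_config,
)

# Recent transformers releases can serialize bitsandbytes 4-bit weights.
model.save_pretrained("path/to/sft-checkpoint-int4")
```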

1SingleFeng commented 2 weeks ago

> Another question: can you (I mean the authors) share the quantization script? We need it to quantize the model after SFT.

I'd also like to know how the quantization was done. Did you ever get an answer?

myBigbug commented 1 week ago

> I'd also like to know how the quantization was done. Did you ever get an answer?

No, I'm still waiting.

nickyisadog commented 4 days ago

Yes, it works.

In finetune_lora.sh, change MODEL="openbmb/MiniCPM-Llama3-V-2_5-int4"

and set --tune_vision false --deepspeed ds_config_zero3.json

That's all it takes.

myBigbug commented 4 days ago

> Yes, it works. In finetune_lora.sh, change MODEL="openbmb/MiniCPM-Llama3-V-2_5-int4" and set --tune_vision false --deepspeed ds_config_zero3.json.

@nickyisadog I got that error while fine-tuning the int4 model with the finetune_ds.sh script, not the LoRA script: ValueError: You cannot perform fine-tuning on purely quantized models. Please attach trainable adapters on top of the quantized model to correctly perform fine-tuning. Please see: https://huggingface.co/docs/transformers/peft for more details. Could you help me figure out what is causing this?

shreyanshu09 commented 2 days ago

@nickyisadog

I am facing this error: RuntimeError: Expected is_sm80 || is_sm90 to be true, but got false. (Could this error message be improved? If so, please report an enhancement request to PyTorch.)

I ran with these changes:

In finetune_lora.sh, change MODEL="openbmb/MiniCPM-Llama3-V-2_5-int4"

--tune_vision false --deepspeed ds_config_zero3.json
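From what I can find, the is_sm80 || is_sm90 check comes from PyTorch's FlashAttention backward kernel, which only runs on A100/H100-class GPUs. A possible workaround (assuming attention is routed through torch's SDPA here, which I have not verified) is to force the math backend before training:

```python
import torch

# Disable the FlashAttention and memory-efficient SDPA kernels and fall
# back to the math implementation, which works on any GPU (slower and
# more memory-hungry, but it avoids the sm80/sm90 requirement).
torch.backends.cuda.enable_flash_sdp(False)
torch.backends.cuda.enable_mem_efficient_sdp(False)
torch.backends.cuda.enable_math_sdp(True)
```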