QLoRA fine-tuning fails to run [BUG] #1132

Closed xiaohaiqing closed 6 months ago

xiaohaiqing commented 6 months ago

Is there an existing issue / discussion for this?

Is there an existing answer for this in FAQ?

Current Behavior

Expected Behavior

Steps To Reproduce

from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

model = AutoPeftModelForCausalLM.from_pretrained("output_qwen", device_map="auto", trust_remote_code=True).eval()
tokenizer = AutoTokenizer.from_pretrained("output_qwen", trust_remote_code=True)
response, history = model.chat(tokenizer, "类型#裤风格#英伦风格#简约", history=None)
print(response)

Environment

- OS:
- Python:
- Transformers:
- PyTorch:
- CUDA (`python -c 'import torch; print(torch.version.cuda)'`):
Anything else?


jklj077 commented 6 months ago

First, something seems wrong with the vocab_size (which is the size of the embedding, not the actual vocabulary size) in config.json and the pad_to_multiple_of setting.
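
For context, the embedding size stored in config.json is commonly the tokenizer's vocabulary size rounded up to a multiple, which is what a pad_to_multiple_of setting controls, so the two numbers can legitimately differ. A minimal sketch of that rounding follows; the helper name and the numbers are made up for illustration and are not Qwen's actual values.

```python
def padded_vocab_size(actual_vocab_size: int, pad_to_multiple_of: int) -> int:
    """Round a vocabulary size up to the next multiple of pad_to_multiple_of.

    Hypothetical helper for illustration only; it is not part of
    transformers, peft, or the Qwen code base.
    """
    if actual_vocab_size < 0 or pad_to_multiple_of <= 0:
        raise ValueError("sizes must be non-negative and the multiple positive")
    # Ceiling division written with integer arithmetic only.
    blocks = -(-actual_vocab_size // pad_to_multiple_of)
    return blocks * pad_to_multiple_of


if __name__ == "__main__":
    # Illustrative numbers: 1000 real tokens padded to a multiple of 64.
    print(padded_vocab_size(1000, 64))
```

If the vocab_size in config.json is not what this kind of rounding would give for the tokenizer in use, that mismatch is worth reporting along with the config files.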


Second, the error `Target module QuantLinear() is not supported` can occur when peft fails to determine the actual AutoGPTQQuantLinear class type, e.g., because auto-gptq is not correctly installed or quantization_config is not set accordingly.

I would suggest providing the contents of config.json, lora_config.json, and tokenizer_config.json first, then downgrading to peft<0.8.0 and reinstalling auto-gptq according to their documentation at https://github.com/AutoGPTQ/AutoGPTQ/blob/main/docs/INSTALLATION.md.


xiaohaiqing commented 6 months ago



/home/user/anaconda3/envs/qwen-7b-single/lib/python3.11/site-packages/transformers/utils/generic.py:260: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
Try importing flash-attention for faster inference...
Warning: import flash_attn rotary fail, please install FlashAttention rotary to get higher efficiency https://github.com/Dao-AILab/flash-attention/tree/main/csrc/rotary
Warning: import flash_attn rms_norm fail, please install FlashAttention layer_norm to get higher efficiency https://github.com/Dao-AILab/flash-attention/tree/main/csrc/layer_norm
Warning: import flash_attn fail, please install FlashAttention to get higher efficiency https://github.com/Dao-AILab/flash-attention
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:03<00:00,  1.21s/it]
Traceback (most recent call last):
  File "/home/xhq/Qwen/finetune/testQloRA.py", line 4, in <module>
    model = AutoPeftModelForCausalLM.from_pretrained("output_qwen", device_map="auto", trust_remote_code=True).eval()
  File "/home/user/anaconda3/envs/qwen-7b-single/lib/python3.11/site-packages/peft/auto.py", line 124, in from_pretrained
    tokenizer = AutoTokenizer.from_pretrained(pretrained_model_name_or_path)
  File "/home/user/anaconda3/envs/qwen-7b-single/lib/python3.11/site-packages/transformers/models/auto/tokenization_auto.py", line 724, in from_pretrained
    raise ValueError(
ValueError: Tokenizer class QWenTokenizer does not exist or is not currently imported.
(qwen-7b-single) [root@adsl-172-10-0-187 finetune]#

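I also tried downgrading peft to 0.7.0, and the error message is still the same as in the issue. My CUDA version is 12.1, and the corresponding auto-gptq version is 0.7.1, which looks fine to me. What could be the cause? Below are my config.json, adapter_config.json, and tokenizer_config.json.

Attachments: adapter_config.json, config.json, tokenizer_config.json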

jklj077 commented 6 months ago

> ValueError: Tokenizer class QWenTokenizer does not exist or is not currently imported.

There is a bug in peft 0.8.0 that affects models requiring trust_remote_code=True; please make sure you are using peft<0.8.0.

(In AutoPeftModelForCausalLM.from_pretrained, AutoTokenizer.from_pretrained is called without passing trust_remote_code.)
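
One way to sidestep that code path, sketched here under assumptions rather than as a confirmed fix, is to load the base model, the adapter, and the tokenizer separately so that trust_remote_code is passed explicitly at each step. The function name `load_adapter_manually` is hypothetical; the peft and transformers calls are their standard loading entry points, but whether this works for a GPTQ-quantized Qwen checkpoint in a given environment has not been verified here.

```python
def load_adapter_manually(adapter_dir: str = "output_qwen"):
    """Load base model, LoRA adapter, and tokenizer with explicit trust_remote_code.

    Hypothetical helper; it bypasses AutoPeftModelForCausalLM, so any extra
    steps that class performs are not replicated here.
    """
    # Imported lazily so that defining this function does not require
    # peft or transformers to be installed.
    from peft import PeftConfig, PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # The adapter config records which base model the adapter was trained on.
    peft_config = PeftConfig.from_pretrained(adapter_dir)

    base_model = AutoModelForCausalLM.from_pretrained(
        peft_config.base_model_name_or_path,
        device_map="auto",
        trust_remote_code=True,
    )

    # Attach the fine-tuned adapter weights to the base model.
    model = PeftModel.from_pretrained(base_model, adapter_dir).eval()

    # Load the tokenizer ourselves, passing trust_remote_code explicitly.
    tokenizer = AutoTokenizer.from_pretrained(adapter_dir, trust_remote_code=True)
    return model, tokenizer
```

Calling `model, tokenizer = load_adapter_manually("output_qwen")` would then stand in for the two loading lines of the reproduction script.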

> The same error for peft 0.7.0.

Please double-check which version of peft is actually installed; with peft 0.7.0 the error should not occur at the same line. If it does, something is very wrong with your environment.
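
A quick way to confirm which peft the active interpreter really sees is to query the installed package metadata. The sketch below uses only the standard library; the helper names are hypothetical and the version parsing is deliberately simplified (`packaging.version` is the more robust choice when it is available).

```python
from importlib.metadata import PackageNotFoundError, version


def release_tuple(version_string: str) -> tuple:
    """Turn '0.7.0' or '2.2.1+cu121' into a tuple of its leading numeric parts.

    Simplified on purpose: pre-release and dev suffixes are ignored.
    """
    parts = []
    for piece in version_string.split("+")[0].split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def is_before(version_string: str, bound: str = "0.8.0") -> bool:
    """Return True when version_string is older than bound."""
    return release_tuple(version_string) < release_tuple(bound)


def installed_peft_is_before_0_8():
    """Return True/False for the installed peft, or None if peft is missing."""
    try:
        return is_before(version("peft"))
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("peft older than 0.8.0:", installed_peft_is_before_0_8())
```

Running this with the same Python executable that runs the test script matters, since a different conda environment may have a different peft installed.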

> auto-gptq 0.7.1

The auto-gptq package from PyPI is built against PyTorch 2.2.1+cu121, and I think you have PyTorch 2.2.0 (please try `pip install auto-gptq==0.7.0`). In addition, your transformers version is too old to support this version of auto-gptq.
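
Since the Environment section of the report was left empty, a small script that lists the installed versions of the packages discussed in this thread may help when comparing against the versions above. This is a sketch using only the standard library; the function name and the package list are assumptions chosen for illustration.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names of the packages mentioned in this thread.
PACKAGES = ("torch", "transformers", "peft", "auto-gptq")


def collect_versions(names=PACKAGES) -> dict:
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, installed in collect_versions().items():
        print(f"{name}: {installed or 'not installed'}")
```

Pasting this output into the issue would make it easier to check the PyTorch build against the one auto-gptq expects.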

The configuration JSON files actually look good to me. To rule out a misconfigured environment, which is my best guess, please try using the provided docker image.

Did you use the same environment to finetune the model?