artidoro / qlora

QLoRA: Efficient Finetuning of Quantized LLMs
https://arxiv.org/abs/2305.14314
MIT License

Garbage output from Llama-2-13B-chat model after QLoRA finetuning #274

Open cywsg opened 11 months ago

cywsg commented 11 months ago

I finetuned the Llama-2-13B-chat model with LoRA for a document summarization task. The source documents are much longer than the model's 4K-token context length, so I split each document into segments of under 3K tokens. After finetuning, I merged the adapter into the base model and ran inference. Some segments produce garbage output: duplicated (or near-duplicate) sentences and paragraphs, and odd patterns such as the same two or three words repeated back to back before a full stop. Any ideas or thoughts?
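For context, a minimal sketch of the merge-and-generate flow described above, assuming the adapter was trained with PEFT. The adapter path is a placeholder, and the decoding parameters (`repetition_penalty`, `no_repeat_ngram_size`, which often curb exactly this kind of repeated-phrase output) are illustrative assumptions, not taken from the report:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE_MODEL = "meta-llama/Llama-2-13b-chat-hf"
ADAPTER_DIR = "./qlora-summarization-adapter"  # hypothetical adapter path

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
base = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL, torch_dtype=torch.float16, device_map="auto"
)

# Attach the finetuned adapter and fold its weights into the base model.
model = PeftModel.from_pretrained(base, ADAPTER_DIR)
model = model.merge_and_unload()

prompt = "Summarize the following document:\n..."  # one <3K-token segment
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# repetition_penalty and no_repeat_ngram_size are common mitigations for
# the duplicated-sentence / repeated-word artifacts described above.
output = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=False,
    repetition_penalty=1.15,
    no_repeat_ngram_size=4,
)
print(tokenizer.decode(
    output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
))
```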