artidoro / qlora

QLoRA: Efficient Finetuning of Quantized LLMs
https://arxiv.org/abs/2305.14314
MIT License

Garbage output of Llama-2-13B-chat model after qlora finetuning #274

Open cywsg opened 1 year ago

cywsg commented 1 year ago

I have finetuned the Llama-2-13B-chat model using lora for a document summarization task. The original text is much longer than the model's context length of 4k. I segmented the text into multiple segments with each less than 3K tokens. I performed model inference after model finetuning (adapter was merged to the base model). There are garbage outputs in some segments such as duplicated (similar) sentences or paragraphs. There are some strange patterns too such as 2 or 3 words repeated sequentially before a full stop. Any idea or thought?