unslothai / unsloth

Finetune Llama 3.2, Mistral, Phi, Qwen 2.5 & Gemma LLMs 2-5x faster with 80% less memory
https://unsloth.ai
Apache License 2.0

Finetuned Llama 3.1 8B (base) gets stuck in a loop #1275

Open skerit opened 1 week ago

skerit commented 1 week ago

I fine-tuned Llama 3.1 8B for 1 epoch on 36,000 samples, with sample lengths ranging from 1,000 to 20,000 tokens. The average sample is only around 2,000 tokens, though, and 1,600 samples are over 5,000 tokens long.

I'm training on completions only, and I'm teaching it my own custom prompt format. Over 10,000 samples have completions longer than 1,000 tokens.

I'm using LoRA rank 128 with alpha 256. My batch size is 1 and my gradient accumulation is 8.
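For reference, my setup is roughly the sketch below (the model name, max_seq_length, target modules, and the response template are placeholders, not my exact values):

```python
from unsloth import FastLanguageModel
from trl import SFTTrainer, DataCollatorForCompletionOnlyLM
from transformers import TrainingArguments

# Load the base model (model name and max_seq_length are placeholders).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Meta-Llama-3.1-8B",
    max_seq_length=20480,   # long enough for the ~20,000-token samples
    load_in_4bit=True,
)

# LoRA adapter: rank 128, alpha 256 as described above.
model = FastLanguageModel.get_peft_model(
    model,
    r=128,
    lora_alpha=256,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# Completions-only training: mask everything before the response marker.
# "### Response:" is a placeholder for my custom prompt format's marker.
collator = DataCollatorForCompletionOnlyLM(
    response_template="### Response:", tokenizer=tokenizer
)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=train_dataset,     # the 36,000-sample dataset (not shown here)
    dataset_text_field="text",
    data_collator=collator,
    packing=False,
    args=TrainingArguments(
        per_device_train_batch_size=1,   # batch size 1
        gradient_accumulation_steps=8,   # gradient accumulation 8
        num_train_epochs=1,              # single epoch
        output_dir="outputs",
    ),
)
trainer.train()
```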

Loss

The train loss and eval loss seemed to do OK: on average, train loss went from over 1.4 down to 1.23, and eval loss went from 1.18 to 0.96.

[Screenshots of the train and eval loss curves]

Testing it

But when I actually run inference on something (even a sample that was in the training data), it very quickly starts to repeat itself:

For example:

I woke up with a start. I was sweating. I looked at the clock. It was 3:00 AM. I looked at the phone. I had 100 notifications.
I looked at the first one. It read "DO NOT LOOK AT THE MOON".
I looked at the second one. It read "It's a beautiful night tonight. Look outside."
I looked at the third one. It read "It's a beautiful night tonight. Look outside."
I looked at the fourth one. It read "It's a beautiful night tonight. Look outside."
I looked at the fifth one. It read "It's a beautiful night tonight. Look outside."
...

And it goes on and on. I can easily make it write other stories that seem fine for a few sentences, but then they start to repeat themselves in some way after a while.
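For context, the generation call is roughly the standard Unsloth pattern below; the prompt and generation settings are placeholders rather than my exact values:

```python
from unsloth import FastLanguageModel

# Switch the finetuned model into Unsloth's fast inference mode.
FastLanguageModel.for_inference(model)

# Placeholder prompt in my custom format; the real prompts are much longer.
prompt = "<custom prompt format here>"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generation settings here are illustrative, not my exact ones.
outputs = model.generate(**inputs, max_new_tokens=512, use_cache=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```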

So is something wrong with finetuning on longer outputs? Or do I still not have enough data? Or does finetuning a base model just require a lot more data?

danielhanchen commented 1 week ago

@skerit Apologies on the delay - it's best to ask on our Discord server :)

skerit commented 1 week ago

> @skerit Apologies on the delay - it's best to ask on our Discord server :)

No worries!

I actually did, but nobody seems to know there either :sweat_smile: