unslothai / unsloth

Finetune Llama 3, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory
https://unsloth.ai
Apache License 2.0

Prevent overfitting the model #713

Open whranclast opened 2 weeks ago

whranclast commented 2 weeks ago

Hello,

I am currently playing with the unsloth library and it's performing amazingly, even on my local machine. Unfortunately, I have an issue with the model kind of "forgetting" its generic purpose as I've been training it on a "custom task" dataset, e.g.

### Instruction:
I have an issue with order 3124
### Response:
Apologies for the issue with order 3124, the reason for the issue is {"id": 3124}

Currently my dataset is around 30k entries and I've been using r=8 and lora_alpha=16 to try to prevent overfitting.

However, now when I ask "What is the capital of France?", I get hallucinations along the lines of "Apologies for the issue with order 3124, the reason for the issue is {"id": 3124}".

Correct me if I am wrong, but running model.push_to_hub_gguf("hf/model", tokenizer, quantization_method = "f16", token = "") (the part guarded by if False:) merges the LoRA adapters with the pre-trained weights, so in a sense we should keep the model's "old" knowledge.

Maybe reducing the number of target modules could help?
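
For reference, a rough sketch of the kind of LoRA setup I mean (the base model name and the trimmed target_modules list below are illustrative, not my exact config):

```python
from unsloth import FastLanguageModel

# Base model name is illustrative; I'm running 4-bit locally.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/llama-3-8b-bnb-4bit",
    max_seq_length = 2048,
    load_in_4bit = True,
)

# r=8, lora_alpha=16 as mentioned above; the trimmed target_modules list
# (attention projections only, no MLP projections) is what I mean by
# reducing the number of target modules.
model = FastLanguageModel.get_peft_model(
    model,
    r = 8,
    lora_alpha = 16,
    lora_dropout = 0,
    bias = "none",
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj"],
    use_gradient_checkpointing = True,
    random_state = 3407,
)
```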

danielhanchen commented 2 weeks ago

So to maintain the model's "old" knowledge, I would try adding some generic datasets to the finetune (concatenate them with your custom data), or reduce the number of training steps. Another "trick" is, before you merge the LoRA adapters, to simply scale all of them by some fraction (say 0.1) to reduce the effect of the adapters. It might not work very well, but it should function (hopefully)
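
A rough sketch of the dataset mixing (the file path, dataset name, and column names below are placeholders, assuming a Hugging Face datasets pipeline that feeds a single text field to the trainer):

```python
from datasets import load_dataset, concatenate_datasets

# Placeholders: "orders.jsonl" stands in for your 30k-entry custom dataset
# (assumed to have "instruction" / "response" columns), and yahma/alpaca-cleaned
# is just one example of a generic instruction dataset.
custom_ds  = load_dataset("json", data_files="orders.jsonl", split="train")
generic_ds = load_dataset("yahma/alpaca-cleaned", split="train[:10000]")

prompt = "### Instruction:\n{}\n### Response:\n{}"

# Map both datasets to the same single "text" column / prompt format so they
# can be concatenated and passed to the trainer as one dataset.
custom_ds  = custom_ds.map(
    lambda x: {"text": prompt.format(x["instruction"], x["response"])},
    remove_columns=custom_ds.column_names,
)
generic_ds = generic_ds.map(
    lambda x: {"text": prompt.format(x["instruction"], x["output"])},
    remove_columns=generic_ds.column_names,
)

# Interleave generic examples with the custom ones.
mixed_ds = concatenate_datasets([custom_ds, generic_ds]).shuffle(seed=42)
```

The exact ratio is a judgment call; the point is just that the model keeps seeing generic instruction data during the finetune.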

whranclast commented 2 weeks ago

> So to maintain the model's "old" knowledge, I would try adding some generic datasets to the finetune (concatenate them with your custom data), or reduce the number of training steps. Another "trick" is, before you merge the LoRA adapters, to simply scale all of them by some fraction (say 0.1) to reduce the effect of the adapters. It might not work very well, but it should function (hopefully)

Thank you! For your second suggestion, how exactly do I scale them in code?

danielhanchen commented 2 weeks ago

Oh, in the adapter_config.json file, literally just change lora_alpha.
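
Something like this (the lora_model/ path is illustrative, it's just wherever you saved the adapter; since LoRA's output is scaled by lora_alpha / r, multiplying lora_alpha by 0.1 scales the adapter's effect by 0.1 at merge time):

```python
import json

# Illustrative path: wherever the adapter was saved with save_pretrained().
config_path = "lora_model/adapter_config.json"

with open(config_path) as f:
    config = json.load(f)

# LoRA's output is scaled by lora_alpha / r, so shrinking lora_alpha shrinks
# the adapter's contribution by the same factor when it gets merged.
config["lora_alpha"] = config["lora_alpha"] * 0.1

with open(config_path, "w") as f:
    json.dump(config, f, indent=2)
```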