unslothai / unsloth

Finetune Llama 3.2, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory
https://unsloth.ai
Apache License 2.0
16.49k stars · 1.14k forks

Feature request: Combining train_on_inputs: false + sample packing #735

Open williambarberjr opened 3 months ago

williambarberjr commented 3 months ago

In axolotl, there's a config parameter you can set: train_on_inputs: false

It changes how the loss is calculated when training a LoRA: it ignores the loss on input tokens and trains the model only on the completion-token loss. Essentially, the model concentrates entirely on learning to produce the output, at the cost of not learning to produce the input (which is what I want). If I understand correctly, the Hugging Face trainer doesn't allow combining this training strategy with sample packing. Kyle Corbitt from OpenPipe (a fine-tuning startup) shared this image benchmarking the difference it makes when fine-tuning for various tasks. I'd love to see this feature added to Unsloth, as I'm convinced it would help me train significantly better models.
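For context, the mechanics of `train_on_inputs: false` boil down to masking prompt tokens out of the loss. A minimal sketch (hypothetical token IDs; this uses the Hugging Face convention where labels equal to -100 are ignored by the cross-entropy loss):

```python
IGNORE_INDEX = -100  # Hugging Face loss functions skip positions labeled -100


def mask_input_tokens(input_ids, prompt_len):
    """Return labels that train only on completion tokens.

    Tokens belonging to the prompt (the first `prompt_len` positions)
    are set to IGNORE_INDEX so they contribute nothing to the loss.
    """
    labels = list(input_ids)
    for i in range(min(prompt_len, len(labels))):
        labels[i] = IGNORE_INDEX
    return labels


# Example: a 4-token prompt followed by a 3-token completion
ids = [11, 12, 13, 14, 201, 202, 203]
print(mask_input_tokens(ids, 4))  # [-100, -100, -100, -100, 201, 202, 203]
```

Combining this with sample packing is the hard part: once several examples are packed into one sequence, the mask has to track multiple prompt/completion boundaries instead of a single `prompt_len`.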

[image: OpenPipe benchmark results comparing the two loss-masking settings across tasks]

Hamel Husain's blog post on combining custom chat templates with this setting is probably relevant for thinking through the implementation: it explains how to set up a jsonl input file that defines what's input and what's output when the chat template varies across models or across your desired inference setup after training: https://hamel.dev/notes/llm/finetuning/09_template_free.html

danielhanchen commented 3 months ago

Oh yep, I also saw OpenPipe's experiments! Training on completions only is in TRL, but it fails to work on multi-turn conversations - so a first step is to support that. In theory I have some code to do it, but I'll need to test it more
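The multi-turn case that single-turn completion masking misses can be sketched as: mask everything by default, then unmask every assistant response between its delimiter tokens. A hypothetical sketch (the marker token IDs are made up; real chat templates define their own delimiters):

```python
IGNORE = -100  # label value skipped by the loss


def multi_turn_labels(input_ids, resp_start, resp_end):
    """Train only on assistant responses, handling every turn.

    `resp_start`/`resp_end` are token-ID subsequences delimiting each
    assistant response (hypothetical markers for illustration).
    """
    labels = [IGNORE] * len(input_ids)
    i, n = 0, len(input_ids)
    while i < n:
        if input_ids[i:i + len(resp_start)] == resp_start:
            j = i + len(resp_start)
            # Unmask tokens until the end-of-response marker.
            while j < n and input_ids[j:j + len(resp_end)] != resp_end:
                labels[j] = input_ids[j]
                j += 1
            i = j + len(resp_end)
        else:
            i += 1
    return labels
```

Scanning for every occurrence of the response markers, rather than just the first, is what makes this work across a whole conversation instead of a single turn.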