Lightning-AI / litgpt

Pretrain, finetune, deploy 20+ LLMs on your own data. Uses state-of-the-art techniques: flash attention, FSDP, 4-bit, LoRA, and more.
https://lightning.ai
Apache License 2.0

Add LongLora for both full and lora fine-tuning #1350

Open belerico opened 3 weeks ago

belerico commented 3 weeks ago

Follow up of #1346.

This PR introduces LongLoRA, as described in https://github.com/Lightning-AI/litgpt/issues/1237, for both LoRA and full fine-tuning, and also enables it during generation.
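
For context, a minimal PyTorch sketch of the shifted sparse attention (S²-Attn) idea that LongLoRA is built on. This is only an illustration of the paper's technique, not the code in this PR; the function name, tensor layout, and the omission of the paper's group-boundary masking are assumptions.

```python
import torch
import torch.nn.functional as F

def shifted_sparse_attention(q, k, v, n_groups: int):
    # q, k, v: (batch, n_heads, seq_len, head_dim); seq_len must divide evenly into n_groups
    B, H, T, D = q.shape
    group_size = T // n_groups
    half_heads = H // 2

    def shift(x, offset):
        # Roll half of the heads along the sequence dimension so that
        # information can flow between neighboring groups
        x = x.clone()
        x[:, half_heads:] = torch.roll(x[:, half_heads:], shifts=offset, dims=2)
        return x

    q, k, v = (shift(t, -group_size // 2) for t in (q, k, v))

    def to_groups(x):
        # Fold each group into the batch dimension so attention stays within a group
        return (x.reshape(B, H, n_groups, group_size, D)
                 .transpose(1, 2)
                 .reshape(B * n_groups, H, group_size, D))

    # Causal attention within each group (the paper's handling of the wrap-around
    # at group boundaries is omitted here for brevity)
    out = F.scaled_dot_product_attention(to_groups(q), to_groups(k), to_groups(v), is_causal=True)

    # Undo the grouping and shift the rolled heads back into place
    out = (out.reshape(B, n_groups, H, group_size, D)
              .transpose(1, 2)
              .reshape(B, H, T, D))
    return shift(out, group_size // 2)
```

The key point is that attention is computed only within groups during fine-tuning, which keeps the cost close to that of the original (shorter) context length, while shifting half of the heads lets information flow between adjacent groups.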

cc @rasbt

belerico commented 3 weeks ago

@rasbt to answer your previous question: LongLoRA is not enabled by default, since both longlora_context_length and longlora_n_groups are None, but I agree with you that there should be a simpler way to enable it. As you suggested, I can add a LongLoraArgs dataclass, as you have done in the GaLore branch: that way I can also validate those args in a separate function (like validate_train_args).
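
A rough sketch of what such a dataclass and its validation could look like, modeled on litgpt's existing TrainArgs/EvalArgs pattern; the field and function names here are assumptions, not the final API of this PR.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class LongLoraArgs:
    """LongLoRA settings; disabled unless both fields below are set."""
    # Extended context length to fine-tune for
    context_length: Optional[int] = None
    # Number of groups for shifted sparse attention
    n_groups: Optional[int] = None
    # Extra non-LoRA parameters to make trainable, as a comma-separated string
    trainable_params: str = "wte,norm,ln"

def validate_longlora_args(longlora: LongLoraArgs) -> None:
    # Mirrors the spirit of validate_train_args: fail early on inconsistent settings
    if (longlora.context_length is None) != (longlora.n_groups is None):
        raise ValueError("Set both LongLoRA context_length and n_groups, or neither.")
    if longlora.n_groups is not None and longlora.context_length % longlora.n_groups != 0:
        raise ValueError("LongLoRA context_length must be divisible by n_groups.")
```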

rasbt commented 3 weeks ago

Thanks! I think LongLoraArgs might be better, especially if it can be used in multiple approaches, e.g., full and LoRA fine-tuning.

belerico commented 3 weeks ago

I've just trained a model with

python litgpt/finetune/lora.py \
--config=/teamspace/studios/this_studio/litgpt/config_hub/finetune/mistral-7b/longlora.yaml \
--checkpoint_dir=/teamspace/studios/this_studio/litgpt/checkpoints/mistralai/Mistral-7B-Instruct-v0.1

One generation that I've obtained with

python litgpt/generate/base.py \
--checkpoint_dir ../out/finetune/lora-mistral-7b/final \
--prompt="Recommend a movie for me to watch during the weekend and explain the reason." \
--max_new_tokens=128

is the following:

Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
Recommend a movie for me to watch during the weekend and explain the reason.

### Response:
I recommend the movie "Inception" directed by Christopher Nolan. This is an excellent sci-fi thriller that will keep you on the edge of your seat throughout the entire film. The story follows a professional thief, Dom (played by Leonardo DiCaprio), who is able to steal information from someone's subconscious while they dream. Dom is offered a chance at clemance in exchange for performing the near-impossible task of planting an idea into someone's mind, an act known as Inception.

This movie is a fantastic choice for the weekend because it's not only entertaining, but it also challenges you to think critically about the concepts presented within the film. The plot is twisting and turning, keeping you engaged from beginning to end. Additionally, the special effects and visuals are stunning, making for a truly immersive viewing experience. Moreover, with its all-star cast, including Joseph Gordon-Levitt, Ellen Page, and Tom Hardy, you know you're in for a treat.

Overall, "Inception" is an outstanding choice for the weekend because it provides an exciting and thought-provoking movie experience that is sure to leave a lasting
Time for inference 2: 15.43 sec total, 16.59 tokens/sec
Memory used: 14.67 GB

rasbt commented 3 weeks ago

Nice, this is a good sign that things work!

rasbt commented 3 weeks ago

> What are the other options? Are "wte,norm,ln" the only allowed ones or are there more?

> In the paper the authors specify that, to effectively increase the context length while using LoRA, you also need to fine-tune the embedding layer and every norm layer (ref. Table 2), without mentioning anything else. I put those as the defaults for the LoRA fine-tuning and leave experimentation with other values to the user.

Oh sorry, I wasn't clear. I meant: what are the supported options here? What values can a user typically put in? E.g., something analogous to

https://github.com/Lightning-AI/litgpt/blob/b9ddd8bdd8e759702ddb5b624333f422b4e76b5e/litgpt/pretrain.py#L46

You probably can't use Literal here because of the various combinations within that string, but could you maybe mention in the comments which of the terms within that comma-separated string are supported?
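
One possible alternative to a Literal annotation is to document the supported terms and check the string at parse time. A hedged sketch; the allowed terms ("wte", "norm", "ln") are taken from the discussion above and may not be exhaustive in the final PR.

```python
# Terms a user may combine in the comma-separated trainable-params string.
ALLOWED_LONGLORA_TRAINABLE_PARAMS = {"wte", "norm", "ln"}

def parse_trainable_params(spec: str) -> list[str]:
    """Split e.g. "wte,norm,ln" and reject anything outside the supported set."""
    terms = [term.strip() for term in spec.split(",") if term.strip()]
    unknown = set(terms) - ALLOWED_LONGLORA_TRAINABLE_PARAMS
    if unknown:
        raise ValueError(
            f"Unsupported trainable params {sorted(unknown)}; "
            f"supported values are {sorted(ALLOWED_LONGLORA_TRAINABLE_PARAMS)}"
        )
    return terms
```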