tloen / alpaca-lora

Instruct-tune LLaMA on consumer hardware
Apache License 2.0

LoRA rank of the models & more metadata #226

Open AngainorDev opened 1 year ago

AngainorDev commented 1 year ago

The README lists several models, but with minimal info. Some are localized, but even the default Alpaca ones can use settings that differ from the repo defaults.

While the default LoRA settings in the repo are r=8, alpha=16, the default LoRA for 7B has r=16, alpha=16, and the other one is rank 4. There is no 7B rank-8 adapter.

The effect of rank can be significant, especially on training time (higher ranks need more epochs to reach the same loss) and likely on quality or overfitting, depending on the training-set size. Moreover, when fine-tuning from an existing LoRA, knowing its rank is critical.

Although not all training info is present on the HF repo (epochs? cleaned or original dataset?), the rank and alpha can be found in the adapter's config file.
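For instance, a minimal sketch of pulling those fields from a hosted adapter with `huggingface_hub` (the repo id below is just an example):

```python
# Sketch: read a LoRA adapter's rank/alpha straight from its adapter_config.json.
import json
from huggingface_hub import hf_hub_download

config_path = hf_hub_download("tloen/alpaca-lora-7b", "adapter_config.json")
with open(config_path) as f:
    cfg = json.load(f)

print(cfg["r"], cfg["lora_alpha"], cfg.get("target_modules"))
```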

tl;dr: Requiring and listing some minimal metadata for each LoRA could be beneficial in the long term. Maybe a small submission template.

ElleLeonne commented 1 year ago

Is there a good way to increase or decrease the LoRA rank of an existing adapter? Or would it make more sense to merge the weights and continue with a new adapter?

AngainorDev commented 1 year ago

Is there a good way to increase or decrease the LoRA rank of an existing adapter?

Not that I'm aware of. Given the relatively short time it takes to train a LoRA, I'd rather train the one I need myself if it doesn't already exist.

Merging the LoRA weights back, from what I understand, is not seamless because of precision issues.
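For reference, a sketch of what that merge looks like with peft's `merge_and_unload` (the model ids are placeholders, and the precision caveat above still applies when the base model is loaded in fp16/int8):

```python
# Sketch: fold LoRA deltas back into the base weights with peft.
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("decapoda-research/llama-7b-hf")
merged = PeftModel.from_pretrained(base, "tloen/alpaca-lora-7b").merge_and_unload()
merged.save_pretrained("./llama-7b-merged")  # plain HF checkpoint, no adapter needed
```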

That's why I think it's important to know how each LoRA was trained, since it determines whether it can be used as a basis for a further tune.

We could have "foundation LoRAs", trained with a larger-than-usual rank (and more epochs/data), to be used as a starting point for custom fine-tunes.
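A rough sketch of what resuming from such a LoRA could look like with peft (the model ids and the `is_trainable` flag are assumptions about the setup, not something in this repo's scripts):

```python
# Sketch: continue fine-tuning from an existing LoRA adapter.
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("decapoda-research/llama-7b-hf")
# Load the existing adapter in trainable mode; its r/alpha are fixed by its config.
model = PeftModel.from_pretrained(base, "tloen/alpaca-lora-7b", is_trainable=True)
# ...train as usual; the adapter keeps the rank it was originally created with.
```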

ElleLeonne commented 1 year ago

I trained a LoRA using rank 16, and now I get an error saying the model expected a scalar of type Half but got Float instead.

Does a rank-16 LoRA use mixed precision? Casting it to cuda seems to fix the problem.

AngainorDev commented 1 year ago

Does a rank-16 LoRA use mixed precision? Casting it to cuda seems to fix the problem.

Rank is unrelated to the model's floating-point precision.

generate.py now has an extra param, --load_8bit (which was the implicit default before).

Without that param, the model is loaded in 16-bit, which could lead to the dtype mismatch you saw.
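For context, a minimal sketch of the two loading paths (my guess at what the flag toggles under the hood, using the transformers API; the model id is a placeholder):

```python
# Sketch: 8-bit vs fp16 loading; inputs must match the resulting dtype.
import torch
from transformers import AutoModelForCausalLM

model_8bit = AutoModelForCausalLM.from_pretrained(
    "decapoda-research/llama-7b-hf", load_in_8bit=True, device_map="auto"
)
model_fp16 = AutoModelForCausalLM.from_pretrained(
    "decapoda-research/llama-7b-hf", torch_dtype=torch.float16, device_map="auto"
)
```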

roshan-gopalakrishnan commented 1 year ago

Can I use a larger value of r than lora_alpha? It says the scaling of the decomposed matrices depends on lora_alpha/r. How should I choose r and alpha values for fine-tuning? Does it depend on the model? Also, how should I choose task_type and target_modules?
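For reference, here's the kind of config I mean, a sketch mirroring this repo's finetune.py defaults (r=8, alpha=16), just to show where each knob goes:

```python
# Sketch of a peft LoraConfig mirroring this repo's finetune.py defaults.
from peft import LoraConfig

config = LoraConfig(
    r=8,                                   # rank of the low-rank update matrices
    lora_alpha=16,                         # updates are scaled by lora_alpha / r
    target_modules=["q_proj", "v_proj"],   # LLaMA attention projections to adapt
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
```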