⬆️ Bump peft from 0.6.0 to 0.9.0

Bumps peft from 0.6.0 to 0.9.0.

Release notes

v0.9.0: Merging LoRA weights, new quantization options, DoRA support, and more

Highlights

New methods for merging LoRA weights together

With PR #1364, we added new methods for merging LoRA weights together. This is not about merging LoRA weights into the base model. Instead, this is about merging the weights from different LoRA adapters into a single adapter by calling add_weighted_adapter. This allows you to combine the strength from multiple LoRA adapters into a single adapter, while being faster than activating each of these adapters individually.

Although this feature has already existed in PEFT for some time, we have added new merging methods that promise much better results. The first is based on TIES, the second on DARE, and the third, called Magnitude Prune, is inspired by both. If you haven't tried these new methods, or haven't touched the LoRA weight merging feature at all, you can find more information here:

Blog post

PEFT docs

Example notebook using diffusers

Example notebook using an LLM

AWQ and AQLM support for LoRA

Via #1394, we now support AutoAWQ in PEFT. This is a new method for 4-bit quantization of model weights.

Similarly, we now support AQLM via #1476. This method allows quantizing weights to as low as 2 bits. Both methods support quantizing nn.Linear layers. To find out more about all the quantization options that work with PEFT, check out our docs here.

Note that these integrations do not support merge_and_unload() yet, meaning that for inference, the adapter weights must always remain attached to the base model.

DoRA support

We now support Weight-Decomposed Low-Rank Adaptation, aka DoRA, via #1474. This new method builds on top of LoRA and has shown very promising results. Especially at lower ranks (e.g. r=8), it should perform much better than LoRA. Right now, only non-quantized nn.Linear layers are supported. If you'd like to give it a try, just pass use_dora=True to your LoraConfig and you're good to go.

Documentation

Thanks to @stevhliu and many other contributors, there have been big improvements to the documentation. You should find it more organized and more up-to-date. Our DeepSpeed and FSDP guides have also been much improved.

Check out our improved docs if you haven't already!

Development

If you're implementing custom adapter layers, for instance a custom LoraLayer, note that all subclasses should now implement update_layer, unless they want to use the default method of the parent class. In particular, this means you should no longer use different method names for the subclass, like update_layer_embedding. Also, we generally don't permit ranks (r) of 0 anymore. For more, see this PR.

Developers should have an easier time now since we fully embrace ruff. If you're the type of person who forgets to call make style before pushing to a PR, consider adding a pre-commit hook. Tests are now a bit less verbose by using plain asserts and generally embracing pytest features more fully. All of this comes thanks to @akx.

What's Changed

On top of these changes, we have added a lot of small changes since the last release; check out the full changes below. As always, we had a lot of support from many contributors. You're awesome!

Release patch version 0.8.2 by @pacman100 in huggingface/peft#1428

[docs] Polytropon API by @stevhliu in huggingface/peft#1422

... (truncated)

Commits

7e5335d Release: v0.9.0
096fe53 FEAT Implement DoRA (#1474)
90aa2c1 ENH: [Docker] Notify us when docker build pass or fail (#1503)
0173217 FIX Safe merging with LoHa and LoKr (#1505)
aa2ca83 add example and update deepspeed/FSDP docs (#1489)
1b3b7b5 FIX Bug in prompt learning after disabling adapter (#1502)
bc9426f Add default LoRA and IA3 target modules for Gemma (#1499)
3967fcc Allow trust_remote_code for tokenizers when loading AutoPeftModels (#1477)
23213ca AQLM support for LoRA (#1476)
2efc36c Raise error on wrong type for to modules_to_save (#1496)
Additional commits viewable in compare view

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.

Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- `@dependabot show ignore conditions` will show all of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

caikit / caikit-nlp

⬆️ Bump peft from 0.6.0 to 0.9.0 #330
