huggingface / peft

🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
https://huggingface.co/docs/peft
Apache License 2.0

exclude_modules to keep specific layers or other quirky components out of a target_modules selection #2044

Open bghira opened 2 weeks ago

bghira commented 2 weeks ago

Feature request

The LoraConfig class should accept an optional exclude_modules list of regular expressions, analogous to target_modules, that is consulted when module names are matched against target_modules so that matching modules can still be excluded.
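A hypothetical sketch of how this could look; exclude_modules is not an existing LoraConfig argument, and the module names in the patterns below are purely illustrative:

```python
from peft import LoraConfig

# Proposed usage (exclude_modules does not exist in LoraConfig yet):
# adapt every proj_out / proj_mlp module, except the ones in the final block.
config = LoraConfig(
    target_modules=["proj_out", "proj_mlp"],
    exclude_modules=[r".*final_layer\.proj_out", r".*final_layer\.proj_mlp"],  # illustrative names
)
```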

Motivation

Targeting proj_out and proj_mlp on some models greatly improves the results, but we find that the effect is more reliable and robust if the final proj_out and proj_mlp layers are excluded.

Your contribution

I can quickly integrate, test, and make use of this feature in bghira/simpletuner.

I'm not sure I'll have time to submit a PR in the next two weeks, but after that perhaps I will be able to. Someone who grabs it before then will be much appreciated 💌

BenjaminBossan commented 2 weeks ago

Thanks for opening this feature request. You're not the first one who would like to have this feature. If you or someone else wants to work on the PR, contributions are welcome.

That said, I just wanted to ask if you know that there are already ways to achieve this. One way is to use layers_pattern, which names the module list that holds the layers, together with layers_to_transform, which is a list of the layer indices you want to match. More in the docs. It's a bit of an obscure feature, but it sounds like exactly what you require.
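A minimal sketch of that approach; the module names, block count, and layers_pattern value below are illustrative and depend on the actual model:

```python
from peft import LoraConfig

# Apply LoRA to proj_out / proj_mlp only in blocks 0-10, leaving the last block untouched.
config = LoraConfig(
    target_modules=["proj_out", "proj_mlp"],
    layers_to_transform=list(range(11)),  # indices of the blocks to adapt
    layers_pattern="blocks",              # name of the ModuleList that contains those blocks
)
```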

The most flexible way to tackle this type of problem, however, is to pass a string to target_modules. In that case, a full regex match is applied against the module names. This allows all kinds of fancy patterns, such as matching all proj_out modules except for layers.31.proj_out, just as an example.
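For illustration, such a pattern could look like this (the ".layers.<idx>." naming is an assumption; adjust it to the model's actual module names):

```python
from peft import LoraConfig

# Match every "<...>.<idx>.proj_out" except index 31, using a negative lookahead.
config = LoraConfig(target_modules=r".*\.(?!31\.)\d+\.proj_out$")
```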

bghira commented 2 weeks ago

oh, i see. so you are suggesting i programmatically search the keys and build a list of targets that match my exclude and target patterns! that is pretty good. maybe it's enough

BenjaminBossan commented 2 weeks ago

What I meant is something like this: say I have facebook/opt-125m. This model has 12 decoder layers. Say I want to match fc1, but only in layers 0-10, not layer 11. I can achieve this by passing:

```python
from peft import LoraConfig

config = LoraConfig(target_modules=r".*\.(?!11)\d+\.fc1$")
```
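To double-check which modules such a pattern picks up, you can preview the match against the model's module names (a quick sketch; when target_modules is a string, PEFT applies a full regex match to each module name):

```python
import re

from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")
pattern = r".*\.(?!11)\d+\.fc1$"

# List every module name the pattern would select.
matched = [name for name, _ in model.named_modules() if re.fullmatch(pattern, name)]
print(matched)  # fc1 of decoder layers 0-10; model.decoder.layers.11.fc1 is not included
```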