Open bghira opened 2 weeks ago
Thanks for opening this feature request. You're not the first one who would like to have this feature. If you or someone else wants to work on the PR, contributions are welcome.
That said, I just wanted to ask if you know that there are already ways to achieve this. One way is to use `layers_pattern`, which is used to match the layer name, together with `layers_to_transform`, which is a list of indices of the layers you would like to match. More in the docs. It's a bit of an obscure feature, but it sounds like exactly what you require.
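If it helps, a minimal sketch of those two options together. This assumes a current `peft` release and a model whose transformer blocks live in an `nn.ModuleList` named `layers`; the actual container name should be checked against `model.named_modules()`:

```python
from peft import LoraConfig

# Assumption: the model keeps its blocks in a module list named "layers".
config = LoraConfig(
    target_modules=["fc1"],
    layers_pattern="layers",              # module-list name to index into
    layers_to_transform=list(range(11)),  # only layers 0-10 get adapters
)
```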
The most flexible way to tackle this type of problem, however, is to pass a string to `target_modules`. In this case a regex match is applied, which allows all kinds of fancy patterns, such as matching every `proj_out` except `layers.31.proj_out`, just as an example.
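To make that concrete, here is a stand-alone sketch of such a pattern. The module paths below are invented for illustration; real key paths depend on the architecture. Note that PEFT applies a full-string match (`re.fullmatch`) when `target_modules` is a string, so the pattern must cover the whole key:

```python
import re

# Match every proj_out except the one in layer 31. The literal dot before
# \d+ stops backtracking from slipping past the negative lookahead
# (e.g. matching only the "1" of "31").
pattern = r".*\.(?!31\.)\d+\.proj_out"

# Invented key paths; real ones come from model.named_modules().
keys = [
    "transformer.layers.0.proj_out",
    "transformer.layers.30.proj_out",
    "transformer.layers.31.proj_out",
    "transformer.layers.31.proj_mlp",
]

matched = [k for k in keys if re.fullmatch(pattern, k)]
print(matched)  # layers 0 and 30 match; layer 31 is excluded
```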
oh, i see. so you are suggesting i programmatically search the keys and build a list of targets that match my exclude and target patterns! that is pretty good. maybe it's enough
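For what it's worth, that programmatic route can be sketched roughly like this. The helper and the key names are hypothetical, not part of PEFT; real keys would come from `model.named_modules()`, and the resulting list can then be passed as `target_modules`:

```python
import re

def build_target_modules(keys, include_patterns, exclude_patterns):
    """Hypothetical helper: keep module names matching any include
    pattern, then drop those matching any exclude pattern."""
    targets = []
    for key in keys:
        if not any(re.search(p, key) for p in include_patterns):
            continue
        if any(re.search(p, key) for p in exclude_patterns):
            continue
        targets.append(key)
    return targets

# Toy module names standing in for model.named_modules() keys.
keys = [
    "layers.0.proj_out",
    "layers.31.proj_out",
    "layers.0.proj_mlp",
    "layers.31.proj_mlp",
]
targets = build_target_modules(keys, [r"proj_(out|mlp)$"], [r"^layers\.31\."])
print(targets)  # layer 0 modules only; layer 31 is filtered out
```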
What I meant is something like this: say I have `facebook/opt-125m`. This model has 12 attention layers. Say I want to match `fc1`, but only in layers 0-10, not 11. I can achieve this by passing:

```python
config = LoraConfig(target_modules=r".*\.(?!11)\d+\.fc1$")
```
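A quick way to sanity-check that pattern outside of PEFT. The key layout `model.decoder.layers.<i>.fc1` follows OPT's naming, which is an assumption worth verifying against `model.named_modules()`:

```python
import re

pattern = r".*\.(?!11)\d+\.fc1$"

# OPT-style module paths for the 12 decoder layers.
keys = [f"model.decoder.layers.{i}.fc1" for i in range(12)]
matched = [k for k in keys if re.fullmatch(pattern, k)]

print(len(matched))  # 11: layers 0-10 match, layer 11 does not
assert "model.decoder.layers.11.fc1" not in matched
```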
Feature request

The `LoraConfig` class should accept an optional `exclude_modules` list of regular expressions, analogous to `target_modules`, which would then be consulted when matching entries from `target_modules`.

Motivation

Targeting `proj_out` and `proj_mlp` on some models greatly improves the results, but we find that the effect is more reliable and robust if the final `proj_out` and `proj_mlp` layers are excluded.

Your contribution

I can quickly integrate, test, and make use of this feature in bghira/simpletuner. I'm not sure I'll have time to submit a PR in the next two weeks, but after that perhaps I will be able to. Someone who grabs it before then will be much appreciated 💌