sirluk opened this issue 2 weeks ago (Open)
Thanks a lot for investigating this bug, providing a reproducer, and even 2 suggestions to solve this. Option 1 reads like the simpler solution to me. LMK if you are interested in providing a PR to fix this.
thanks for the feedback! Sounds good, I will create a PR for option 1
System Info
peft==0.13.2
Who can help?
@BenjaminBossan
Information
Tasks
Reproduction
There is an issue in the `_create_and_replace` method of `LoraModel` when `rank_pattern` and `alpha_pattern` are both set in `LoraConfig`.

The issue relates to this line, which merges the keys from `rank_pattern` and `alpha_pattern` to get the `target_name_key` that is then used to look up values in both dicts.
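For context, the lookup in question looks roughly like this (paraphrased, so it may not match the source exactly; `current_key` is the full module name of the layer currently being processed, and `re`/`chain` come from the module-level imports):

```python
# Paraphrased from LoraModel._create_and_replace (peft 0.13.x); may differ slightly from the source.
pattern_keys = list(chain(lora_config.rank_pattern.keys(), lora_config.alpha_pattern.keys()))
target_name_key = next(filter(lambda key: re.match(rf".*\.{key}$", current_key), pattern_keys), current_key)
r = lora_config.rank_pattern.get(target_name_key, lora_config.r)
alpha = lora_config.alpha_pattern.get(target_name_key, lora_config.lora_alpha)
```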
If, for example, `rank_pattern` is defined with a more general substring (i.e. matching more layers) than `alpha_pattern`, the appropriate value from `alpha_pattern` is not retrieved. I assume the same happens the other way around as well.

You can find a minimal example to reproduce the issue here:
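A rough sketch of such a setup (assuming GPT-2; the concrete pattern keys and values below are illustrative, chosen so that the expected scaling is `lora_alpha / r = 2 / 8 = 0.25`):

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

def layer8_scaling(config):
    # return the LoRA scaling of GPT-2's layer 8 c_attn module
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    peft_model = get_peft_model(model, config)
    return peft_model.base_model.model.transformer.h[8].attn.c_attn.scaling["default"]

# alpha_pattern only: layer 8 gets scaling lora_alpha / r = 2 / 8 = 0.25
print(layer8_scaling(LoraConfig(
    target_modules=["c_attn"],
    alpha_pattern={"h.8.attn.c_attn": 2},
)))

# adding rank_pattern with a more general key ("c_attn" matches every layer):
# the merged-key lookup now resolves to "c_attn", so the alpha_pattern entry
# for layer 8 is never found and the default lora_alpha is used instead
print(layer8_scaling(LoraConfig(
    target_modules=["c_attn"],
    rank_pattern={"c_attn": 8},
    alpha_pattern={"h.8.attn.c_attn": 2},
)))
```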
Expected behavior
I would expect both print statements to print the same value, 2/8 = 0.25. However, we see that when `rank_pattern` is defined, layer "h.8.attn.c_attn" is not assigned the correct scaling value.

I have two suggestions to fix this:

1. Call `next(filter(lambda key: re.match(rf".*\.{key}$", current_key), pattern_keys), current_key)` twice, once with the `rank_pattern` keys and once with the `alpha_pattern` keys in place of `pattern_keys` (see the sketch below).
2. Add a check to the `__post_init__` method of `LoraConfig` to ensure that the keys of `rank_pattern` and `alpha_pattern` are defined at the same granularity.
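A rough sketch of what option 1 could look like inside `_create_and_replace` (the `rank_key`/`alpha_key` names are placeholders, not existing variables):

```python
# Sketch of option 1: resolve the matching key separately for each pattern dict
# instead of searching the merged key list.
rank_key = next(
    filter(lambda key: re.match(rf".*\.{key}$", current_key), lora_config.rank_pattern.keys()),
    current_key,
)
alpha_key = next(
    filter(lambda key: re.match(rf".*\.{key}$", current_key), lora_config.alpha_pattern.keys()),
    current_key,
)
r = lora_config.rank_pattern.get(rank_key, lora_config.r)
alpha = lora_config.alpha_pattern.get(alpha_key, lora_config.lora_alpha)
```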