Hey, thanks for the suggestion. We already have a method called LayerNorm tuning, which is specifically for fine-tuning via layer norms (other layer types would also work): https://huggingface.co/docs/peft/v0.12.0/en/package_reference/layernorm_tuning#layernorm-tuning. If you want to combine this with methods like LoRA, however, you should be able to do so by adding said layers to `modules_to_save` in the `LoraConfig`.
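For reference, a minimal sketch of what that combination might look like. It assumes a Llama-style model where the attention projections are named `q_proj`/`v_proj` and the norm layers `input_layernorm`; the checkpoint and module names are illustrative and depend on your architecture:

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Hypothetical checkpoint; substitute your own model.
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

config = LoraConfig(
    r=8,
    target_modules=["q_proj", "v_proj"],   # LoRA adapters are applied to these linear layers
    modules_to_save=["input_layernorm"],   # these modules are fully unfrozen and saved alongside the adapter
)

peft_model = get_peft_model(model, config)
peft_model.print_trainable_parameters()
```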
oh, okay :)
thanks, closing the issue then
Feature request
Add a convenient way to unfreeze and later save specific weights which are not `nn.Linear`. I wonder whether it would be relevant to implement this functionality in this library, or whether it is out of its scope.
Motivation
While transformers mostly consist of linear layers, they still have some other parameters (for example, the scaling weights in normalization layers), and one may benefit from training them. Some studies have highlighted the importance of the normalization layers' scaling weights. There are only a few of them, so it would be super cheap to tune them.
Your contribution
It looks like this requires one extra parameter in the config (similar to `target_modules`) and a loop over the model parameters to set `requires_grad_(True)`. I would be glad to implement it and open a PR, given a little guidance on design choices.
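As a rough sketch of what I have in mind (the parameter name `extra_trainable_modules` and the helper below are purely hypothetical, not existing PEFT API):

```python
import torch.nn as nn

def unfreeze_extra_modules(model: nn.Module, extra_trainable_modules: list[str]) -> None:
    """Unfreeze every parameter whose name contains one of the given substrings,
    mirroring how `target_modules` matches module names."""
    for name, param in model.named_parameters():
        if any(key in name for key in extra_trainable_modules):
            param.requires_grad_(True)

# Example: make the normalization scaling weights trainable alongside the adapter layers.
# unfreeze_extra_modules(model, ["layernorm", "norm.weight"])
```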