f-dangel / backpack

BackPACK - a backpropagation package built on top of PyTorch which efficiently computes quantities other than the gradient.
https://backpack.pt/
MIT License

Feature request: `BatchL2Grad` for `LayerNorm` #327

Open f-dangel opened 2 months ago

f-dangel commented 2 months ago

Documenting this feature request from @mf-silva. Supporting per-sample L2 gradient norms for `LayerNorm` would allow estimating importance scores for data points on LLM architectures, which often contain `LayerNorm` layers. A good starting point for implementing this is the custom first-order extension example in the docs.
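For reference, here is a minimal plain-PyTorch sketch of the quantity such an extension would need to compute. It is not BackPACK code: the helper name `layernorm_batch_l2_grad` is hypothetical, and the sketch assumes `elementwise_affine=True`, a leading batch axis, and that the result is the squared L2 norm per sample. For `y = x_hat * weight + bias`, the per-sample gradient w.r.t. `weight` is `grad_output * x_hat` summed over any axes between the batch axis and the normalized axes, and the per-sample gradient w.r.t. `bias` is the same sum over `grad_output`.

```python
import torch
from torch import nn


def layernorm_batch_l2_grad(module, x, grad_output):
    """Per-sample squared L2 gradient norms for a LayerNorm's weight and bias.

    Hypothetical helper, not part of BackPACK. Assumes elementwise_affine=True
    and that x and grad_output have shape [N, *, *module.normalized_shape].
    Returns two tensors of shape [N].
    """
    n_norm = len(module.normalized_shape)
    norm_dims = tuple(range(x.dim() - n_norm, x.dim()))

    # Recompute the normalized input x_hat (biased variance, as in LayerNorm).
    mean = x.mean(dim=norm_dims, keepdim=True)
    var = x.var(dim=norm_dims, unbiased=False, keepdim=True)
    x_hat = (x - mean) / torch.sqrt(var + module.eps)

    grad_weight = grad_output * x_hat
    grad_bias = grad_output

    # Sum over axes between the batch axis and the normalized axes
    # (e.g. the sequence axis), because the parameters are shared across them.
    extra_dims = tuple(range(1, x.dim() - n_norm))
    if extra_dims:
        grad_weight = grad_weight.sum(dim=extra_dims)
        grad_bias = grad_bias.sum(dim=extra_dims)

    # Squared L2 norm over the parameter axes, one value per sample.
    return grad_weight.flatten(1).pow(2).sum(1), grad_bias.flatten(1).pow(2).sum(1)


torch.manual_seed(0)
N, T, D = 4, 3, 5  # batch size, sequence length, feature dimension
layer = nn.LayerNorm(D)
x = torch.randn(N, T, D)

# Sum reduction keeps each sample's gradient free of a 1/N factor.
out = layer(x)
loss = out.pow(2).sum()
(grad_out,) = torch.autograd.grad(loss, out)
l2_weight, l2_bias = layernorm_batch_l2_grad(layer, x, grad_out)

# Reference: one backward pass per sample.
ref_weight, ref_bias = [], []
for n in range(N):
    loss_n = layer(x[n : n + 1]).pow(2).sum()
    g_w, g_b = torch.autograd.grad(loss_n, [layer.weight, layer.bias])
    ref_weight.append(g_w.pow(2).sum())
    ref_bias.append(g_b.pow(2).sum())
ref_weight, ref_bias = torch.stack(ref_weight), torch.stack(ref_bias)

print(torch.allclose(l2_weight, ref_weight, rtol=1e-4, atol=1e-5))
print(torch.allclose(l2_bias, ref_bias, rtol=1e-4, atol=1e-5))
```

In an actual extension, `x` and `grad_output` would come from the module's stored input and the backpropagated output gradient, and a loss with mean reduction would scale each per-sample gradient by `1/N`.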