tfjgeorge / nngeometry

{KFAC,EKFAC,Diagonal,Implicit} Fisher Matrices and finite width NTKs in PyTorch
https://nngeometry.readthedocs.io
MIT License
206 stars 20 forks

Scaling of parameter space representations #68

Closed ksnxr closed 1 year ago

ksnxr commented 1 year ago

Many thanks for this interesting library!

Comparing with analytical expressions, I think the provided dense representation of the Fisher information matrix is computed as the expectation over the data points in the train loader. Are the other representations, e.g. KFAC and EKFAC, on the same scale? Or is there a constant scaling, e.g. by the batch size, that we should be aware of?

tfjgeorge commented 1 year ago

Every representation is at the very same scale. I am curious: why are you asking?
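To make the convention concrete: under the expectation normalization the dense Fisher is the mean over data points of per-sample gradient outer products, and an approximation built under the same convention carries no hidden batch-size factor. A minimal numpy sketch (a stand-in for the library's internals, not nngeometry code; the per-sample "score" vectors here are synthetic) illustrating the scale relationship:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: N per-sample "score" vectors g_i, standing in for
# per-sample gradients of the log-likelihood.
N, d = 64, 5
G = rng.normal(size=(N, d))

# Dense Fisher under the expectation convention: F = (1/N) * sum_i g_i g_i^T.
F_dense = G.T @ G / N

# A diagonal approximation built under the same convention keeps the
# exact diagonal of F_dense -- same scale, no hidden batch-size factor.
F_diag = np.mean(G**2, axis=0)
assert np.allclose(F_diag, np.diag(F_dense))

# If a representation instead summed over samples, every entry would be
# off by a constant factor of N -- the mismatch the question is probing.
F_summed = G.T @ G
print(np.allclose(F_summed, N * F_dense))  # True
```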


ksnxr commented 1 year ago

Thanks for your quick response. I have a use case where the approximations need to be on the same scale as the analytical Fisher. Since I couldn't find anything about this in the project, I figured it would be better to verify.
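One way to verify this kind of scale agreement empirically is to compare quadratic forms of the same vector under two representations: if they share a normalization, the ratio is O(1), whereas a mismatched convention shows up as a constant factor such as the dataset or batch size. A numpy sketch of that check on explicit matrices (in nngeometry one would compare the representations' quadratic forms directly; the matrices and data here are synthetic):

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic per-sample gradients, as in the toy Fisher setup.
N, d = 128, 4
G = rng.normal(size=(N, d))

F_dense = G.T @ G / N               # expectation convention
F_diag = np.diag(np.diag(F_dense))  # diagonal approximation, same convention

# Probe both representations with the same random vector.
v = rng.normal(size=d)
q_dense = v @ F_dense @ v
q_diag = v @ F_diag @ v

# Same normalization: the ratio is of order 1, not of order N.
print(f"ratio = {q_diag / q_dense:.3f}")

# A representation that summed instead of averaging betrays itself
# by an exact factor of N in the quadratic form.
q_summed = v @ (G.T @ G) @ v
assert np.isclose(q_summed / q_dense, N)
```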