tfjgeorge / nngeometry

{KFAC,EKFAC,Diagonal,Implicit} Fisher Matrices and finite width NTKs in PyTorch
https://nngeometry.readthedocs.io
MIT License

compute KFAC matrix on big network #53

Closed fmaaf closed 1 year ago

fmaaf commented 1 year ago

Hi, have you tried computing the KFAC matrix on a somewhat larger network such as ResNet18? I tried replacing the network with resnet18 in your example Continual_learning_EWC.ipynb, but it seems that the KFAC matrix is too big to be computed. This is the error:

    File "/nngeometry/nngeometry/metrics.py", line 171, in FIM
      return representation(generator=generator, examples=loader)
    File "/nngeometry_/nngeometry/object/pspace.py", line 439, in __init__
      self.data = generator.get_kfac_blocks(examples)
    File "/nngeometry/nngeometry/generator/jacobian/__init__.py", line 247, in get_kfac_blocks
      output = self.function(*d).view(bs, self.noutput).sum(dim=0)
    RuntimeError: shape '[50, 30]' is invalid for input of size 50000

tfjgeorge commented 1 year ago

Hi,

I use a ResNet50 (though without batch norm layers) in this example: https://github.com/tfjgeorge/nngeometry-examples/blob/main/display_and_timings/Timings%20and%20display%20of%20FIM%20representations.ipynb

So network size is not the issue in your case. Could you share your code so that I can have a look at it?

fmaaf commented 1 year ago

OK, I will send it by email.

tfjgeorge commented 1 year ago

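It looks like you are using a 1000-way output layer, but your task requires only 30 classes. That is consistent with the error message: 50 examples × 1000 outputs = 50000 values, which cannot be reshaped to [50, 30].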
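
For readers hitting the same error, here is a minimal sketch of the mismatch and one possible fix. It is not the reporter's code, which was shared privately. `TinyBackbone` and `summed_output` are hypothetical stand-ins: the first mimics a network with a final `fc` layer, the second mirrors only the `view(bs, n_output).sum(dim=0)` step from the traceback. The batch size (50), task classes (30), and head width (1000) are taken from the thread.

```python
import torch
import torch.nn as nn

BATCH_SIZE = 50        # batch size seen in the error message
N_OUTPUT = 30          # number of classes the task expects
N_HEAD_OUTPUTS = 1000  # width of the mismatched output layer


class TinyBackbone(nn.Module):
    """Hypothetical stand-in for a ResNet: feature extractor plus an `fc` head."""

    def __init__(self, n_classes):
        super().__init__()
        self.features = nn.Sequential(
            nn.Flatten(), nn.Linear(3 * 8 * 8, 16), nn.ReLU()
        )
        self.fc = nn.Linear(16, n_classes)

    def forward(self, x):
        return self.fc(self.features(x))


def summed_output(model, x, n_output):
    # Mirrors the reshape from the traceback (illustration only, not the
    # library's code): the output must hold exactly bs * n_output values.
    return model(x).view(x.size(0), n_output).sum(dim=0)


x = torch.randn(BATCH_SIZE, 3, 8, 8)
model = TinyBackbone(N_HEAD_OUTPUTS)

# With a 1000-way head, 50 * 1000 = 50000 values cannot be viewed as [50, 30].
try:
    summed_output(model, x, N_OUTPUT)
    mismatch_error = None
except RuntimeError as err:
    mismatch_error = str(err)

# One possible fix: swap the head for one whose width matches the task.
model.fc = nn.Linear(model.fc.in_features, N_OUTPUT)
fixed = summed_output(model, x, N_OUTPUT)
```

With a torchvision `resnet18`, the same head replacement would be applied to its `fc` attribute. Whether the new head should then be fine-tuned before computing the Fisher matrix depends on the experiment and is not covered in this thread.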