tfjgeorge / nngeometry

{KFAC,EKFAC,Diagonal,Implicit} Fisher Matrices and finite width NTKs in PyTorch
https://nngeometry.readthedocs.io
MIT License

`LayerCollection.from_model` does not get all available layers #60

Closed toontran closed 1 year ago

toontran commented 1 year ago
import torch.nn as nn
from nngeometry.layercollection import LayerCollection

layers = [nn.Flatten(), nn.Linear(28 * 28, 100), nn.ReLU()] + \
          [nn.Linear(100, 100), nn.ReLU()] * 10 + \
          [nn.Linear(100, 10)]
model = nn.Sequential(*layers)
lc = LayerCollection.from_model(model)
lc.layers.items()

Output (lc contains only 3 layers instead of the expected 12):

odict_items([('1.Linear(in_features=784, out_features=100, bias=True)', <nngeometry.layercollection.LinearLayer object at 0x7e6357f401f0>), ('3.Linear(in_features=100, out_features=100, bias=True)', <nngeometry.layercollection.LinearLayer object at 0x7e6357f43820>), ('23.Linear(in_features=100, out_features=10, bias=True)', <nngeometry.layercollection.LinearLayer object at 0x7e6357f42c80>)])
tfjgeorge commented 1 year ago

Thanks for pointing this out and the fix! I will merge it in 2 weeks.

Thomas

tfjgeorge commented 1 year ago

Actually, I think this is the correct behavior: [nn.Linear(100, 100)] * 10 stacks the same layer (the same object instance) 10 times, sharing the same parameters, whereas e.g. [nn.Linear(100, 100) for i in range(10)] instantiates 10 different layers, each with its own parameters.
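
To see why the repeated entries collapse to a single layer, here is a minimal stdlib-only sketch of Python's list repetition semantics (using a hypothetical Layer stand-in, no PyTorch needed):

```python
class Layer:
    """Stand-in for nn.Linear: each instance owns its own weight."""
    def __init__(self):
        self.weight = [0.0]

# list * n repeats references to the SAME object,
# so all 10 entries share one weight
shared = [Layer()] * 10
assert all(layer is shared[0] for layer in shared)

# a comprehension builds 10 distinct instances,
# each with a separate weight
distinct = [Layer() for _ in range(10)]
assert len({id(layer) for layer in distinct}) == 10
```

Since the 10 repeated entries are one parameter set, it is consistent for a layer collection keyed on parameters to report them once.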

However, the current implementation cannot handle parameters shared between layers; recurrent networks, for instance, won't work.
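
For reference, a sketch of how the reporter's model could be built so that every hidden block gets its own parameters (assuming PyTorch is available; each comprehension iteration creates a fresh module, so LayerCollection.from_model should then see all 12 linear layers):

```python
import torch.nn as nn

# Build each hidden (Linear, ReLU) pair inside the loop so every
# iteration constructs new modules with independent parameters.
layers = [nn.Flatten(), nn.Linear(28 * 28, 100), nn.ReLU()] + \
         [m for _ in range(10) for m in (nn.Linear(100, 100), nn.ReLU())] + \
         [nn.Linear(100, 10)]
model = nn.Sequential(*layers)

# 12 distinct Linear modules: input layer, 10 hidden layers, output layer
linears = [m for m in model if isinstance(m, nn.Linear)]
assert len({id(m) for m in linears}) == 12
```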