Closed FrancescoSaverioZuppichini closed 2 years ago
Hi @FrancescoSaverioZuppichini,
First of all, thanks a lot for your interest in VISSL!
So the "part-0" and "part-1" in the names are due to the introduction of intermediary `nn.Sequential` blocks. These sequential blocks represent the "units" of activation checkpointing: to save memory, the forward of each block is essentially evaluated twice (once in the forward pass and once more, recomputed, during the backward pass), and its backward is evaluated once.
Unfortunately, these blocks "leak" into the names. We could, however, build something to keep those names from appearing. Is that something you would find useful?
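To make the "leak" concrete, here is a minimal sketch of how wrapping layers in named `nn.Sequential` units for checkpointing changes the state dict keys. This is not VISSL's actual code: the class `CheckpointedStage`, the layer sizes, and the unit names are hypothetical and only mimic the naming seen in this issue. It assumes a PyTorch version recent enough to accept `use_reentrant=False` in `torch.utils.checkpoint.checkpoint`.

```python
import torch
from torch import nn
from torch.utils.checkpoint import checkpoint


class CheckpointedStage(nn.Module):
    """Toy stage split into two checkpointing units (hypothetical, not VISSL code)."""

    def __init__(self):
        super().__init__()
        # Each unit is an nn.Sequential registered under a "-partN" name.
        part0 = nn.Sequential()
        part0.add_module("block3", nn.Linear(4, 4))
        part1 = nn.Sequential()
        part1.add_module("block4", nn.Linear(4, 4))
        self.add_module("block3-part0", part0)
        self.add_module("block4-part1", part1)

    def forward(self, x):
        # Activations inside each unit are not stored; the unit's forward
        # is re-run during the backward pass to recompute them.
        for unit in self.children():
            x = checkpoint(unit, x, use_reentrant=False)
        return x


model = CheckpointedStage()
# The wrapper names show up as an extra component in every key.
for key in model.state_dict():
    print(key)
```

The printed keys contain the wrapper name as an extra path component, which is why a model built without those wrappers will not accept the checkpoint as-is.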
Thank you, Quentin
Thanks a lot for your quick reply @QuentinDuval . I am very sorry but I don't think I get it :)
I've converted the model and loaded it into RegNetY from Classy Vision. To do so, I just removed the `part-0/1` from the keys of the state dict, and it seems to work (correct inference on ImageNet). You can find it here
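For anyone wanting to do the same, a rough sketch of that key clean-up could look like the following. It rests on assumptions: the helper `strip_part_components` is hypothetical, and it assumes the "part" wrapper is a whole path component such as `block3-part0.` that can simply be dropped. Check the result against the key names your target model expects, since the exact rule depends on that model's layout.

```python
import re

# Assumed pattern: a path component like "block3-part0." (or "block3-part-0.")
# that only wraps the real block and can be dropped entirely.
_PART_COMPONENT = re.compile(r"[^.]+-part-?\d+\.")


def strip_part_components(state_dict):
    """Return a copy of state_dict with the "-partN" wrapper components removed."""
    cleaned = {}
    for key, value in state_dict.items():
        new_key = _PART_COMPONENT.sub("", key)
        if new_key in cleaned:
            # Two different source keys collapsed to the same name: refuse to guess.
            raise KeyError(f"duplicate key after renaming: {new_key}")
        cleaned[new_key] = value
    return cleaned


# Toy keys in the style reported in this issue; the values stand in for tensors.
example = {
    "_feature_blocks.res4.block3-part0.block3.weight": 1,
    "_feature_blocks.res4.block4-part1.block4.weight": 2,
    "head.bias": 3,
}
for key in strip_part_components(example):
    print(key)
```

The collision check is there because silently overwriting one weight with another would be much harder to notice than a load error.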
Thanks!
Francesco
Hi @FrancescoSaverioZuppichini,
Yes, reading my previous answer I see it is a bit cryptic... sorry for that. The naming is VISSL-specific, and indeed, if you want to load the weights into a model other than a VISSL model, you need to get rid of the "part-X" part for it to work.
I think we need to get rid of these parts and make VISSL "add" the "part" if needed on the fly.
Thank you, Quentin
@QuentinDuval super! Yeah, I wanted to load it into Classy Vision, glad you confirmed I can remove the `part-X` part. It looks like I know what I am doing sometimes :)
Thanks!
francesco
Hello There,
After loading the 10b model from here I can see the keys have `part-0` and `part-1` inside their names, e.g. `_feature_blocks.res4.block3-part0.block3 ...` and `_feature_blocks.res4.block3-part1.block3 ...`. I am wondering why this is the case.

Thanks
Cheers,
Francesco