What's part-0 and part-1 in RegNet 10B trained with seer

facebookresearch / vissl

VISSL is FAIR's library of extensible, modular and scalable components for SOTA Self-Supervised Learning with images.

https://vissl.ai

MIT License

3.25k stars 332 forks

What's part-0 and part-1 in RegNet 10B trained with seer #538

Closed FrancescoSaverioZuppichini closed 2 years ago

FrancescoSaverioZuppichini commented 2 years ago

Hello There,

After loading the 10B model from here, I can see that the keys have part-0 and part-1 inside their names, e.g. _feature_blocks.res4.block3-part0.block3 ... and _feature_blocks.res4.block3-part1.block3 .... I am wondering why this is the case.

Thanks

Cheers,

Francesco

QuentinDuval commented 2 years ago

Hi @FrancescoSaverioZuppichini,

First of all, thanks a lot for your interest in VISSL!

So the "part-0" and "part-1" in the names are due to the introduction of intermediary nn.Sequential blocks. These sequential blocks represent the "units" of activation checkpointing: to save memory, the activations inside each block are not stored, so each block is essentially evaluated twice in forward (once in the forward pass, once more when it is recomputed during the backward pass) and once in backward.
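To make the naming effect concrete, here is a minimal PyTorch sketch. It is not VISSL's actual code: the toy nn.Linear blocks, the wrapper names "stage-part0" and "stage-part1", and the helper forward_checkpointed are all made up for illustration. It only shows that wrapping blocks in an extra nn.Sequential adds one more segment to every state dict key, and that each wrapper can be used as one checkpointing unit.

```python
from collections import OrderedDict

import torch
from torch import nn
from torch.utils.checkpoint import checkpoint


def make_blocks(width=8, count=4):
    # Toy stand-ins for the real residual blocks (hypothetical).
    return [(f"block{i}", nn.Linear(width, width)) for i in range(count)]


# Plain layout: every block is a direct child of the stage.
flat = nn.Sequential(OrderedDict(make_blocks()))

# Checkpointed layout: the same kind of blocks split into two wrapper
# nn.Sequential groups. The wrapper names are invented for this sketch.
blocks = make_blocks()
grouped = nn.Sequential(OrderedDict([
    ("stage-part0", nn.Sequential(OrderedDict(blocks[:2]))),
    ("stage-part1", nn.Sequential(OrderedDict(blocks[2:]))),
]))


def forward_checkpointed(model, x):
    # Each wrapper group is one checkpointing unit: its intermediate
    # activations are not kept and are recomputed during backward.
    for part in model.children():
        x = checkpoint(part, x, use_reentrant=False)
    return x


flat_keys = list(flat.state_dict().keys())
grouped_keys = list(grouped.state_dict().keys())
print(flat_keys[0])     # block0.weight
print(grouped_keys[0])  # stage-part0.block0.weight

x = torch.randn(2, 8, requires_grad=True)
y = forward_checkpointed(grouped, x)
```

The wrapper name shows up as an extra dotted segment in the keys, which is the "leak" described above.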

Unfortunately, these blocks "leak" in the names. We could however build something to avoid having those names appear. Is that something that you would find useful?

Thank you, Quentin

FrancescoSaverioZuppichini commented 2 years ago

Thanks a lot for your quick reply @QuentinDuval . I am very sorry but I don't think I get it :)

I've converted the model and loaded it into RegNetY from classy vision. To do so, I just removed the part-0/1 from the keys of the state dict, and it seems to work (correct inference on ImageNet). You can find it here

Thanks!

Francesco

QuentinDuval commented 2 years ago

Hi @FrancescoSaverioZuppichini,

Yes, reading my previous answer I see it is a bit cryptic... sorry for that. The naming is VISSL-specific and indeed, if you want to load the weights into a model other than a VISSL model, then you need to get rid of the "part-X" part for it to work.
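For reference, the renaming can be sketched as below. This is only an illustration under assumptions, not the conversion script used in this thread: the helper name strip_part_segments, the regular expression, and the example keys are hypothetical, and it assumes the wrapper shows up as a whole dotted segment ending in "-part" plus a number that can simply be dropped. The target model may expect other differences in key names as well, so checking the result with a strict load_state_dict is still worthwhile.

```python
import re
from collections import OrderedDict

# Assumption: a wrapper appears as one dotted key segment ending in
# "-part<N>", for example "stage-part0." in "trunk.stage-part0.block0.weight".
_PART_SEGMENT = re.compile(r"[^.]+-part\d+\.")


def strip_part_segments(state_dict):
    """Return a copy of state_dict with the "-partN" wrapper segments removed."""
    renamed = OrderedDict()
    for key, value in state_dict.items():
        new_key = _PART_SEGMENT.sub("", key)
        if new_key in renamed:
            # Two source keys collapsing to one name would silently drop weights.
            raise KeyError(f"duplicate key after renaming: {new_key}")
        renamed[new_key] = value
    return renamed


# Illustrative keys only; real checkpoints hold tensors, not integers.
example = OrderedDict([
    ("trunk.stage-part0.block0.weight", 1),
    ("trunk.stage-part1.block2.weight", 2),
    ("head.weight", 3),
])
converted = strip_part_segments(example)
print(list(converted))
# ['trunk.block0.weight', 'trunk.block2.weight', 'head.weight']
```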

I think we need to get rid of these parts and make VISSL "add" the "part" if needed on the fly.

Thank you, Quentin

FrancescoSaverioZuppichini commented 2 years ago

@QuentinDuval super! Yeah, I wanted to load it into classy vision, glad you confirmed I can remove the part-X part. It looks like I know what I am doing sometimes :)

Thanks!

francesco