facebookresearch / dinov2

PyTorch code and models for the DINOv2 self-supervised learning method.
Apache License 2.0

What's the purpose for adding Identity() in chunked blocks? #415

Closed wangh09 closed 1 month ago

wangh09 commented 1 month ago

I saw the following code for using FSDP:

   # this is to keep the block index consistent if we chunk the block list
   chunked_blocks.append([nn.Identity()] * i + blocks_list[i : i + chunksize])

But I can't figure out why the extra Identity()s are necessary here. Does anyone know? Thanks!

qasfb commented 1 month ago

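As the comment says, this makes it easy to recover a block's original index when the blocks are wrapped inside a block chunk. Without the padding, with a chunk size of 8, the blocks would be named blocks.0.[0..8[ and blocks.1.[0..8[, so the index inside every chunk restarts at 0.

With the padding, the blocks are named blocks.0.[0..8[ and blocks.1.[8..16[, where blocks.1.[0..8[ is just Identity() padding. The last index in each name therefore matches the block's position in the original, unchunked list.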
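
To make the naming difference concrete, here is a minimal plain-Python sketch. It is not the DINOv2 code: strings stand in for the transformer blocks, the string "Identity" stands in for nn.Identity(), and the helper names `chunk` and `names` as well as the values depth = 16 and block_chunks = 2 are assumptions chosen to match the example above.

```python
# Plain-Python simulation of the chunking; no PyTorch required.
# Assumed values for illustration: 16 blocks split into 2 chunks of 8.
depth = 16
block_chunks = 2
chunksize = depth // block_chunks

# Strings stand in for the real transformer blocks.
blocks_list = [f"block{i}" for i in range(depth)]


def chunk(blocks, pad):
    """Split blocks into chunks, optionally left-padding each chunk with placeholders."""
    chunks = []
    for i in range(0, len(blocks), chunksize):
        padding = ["Identity"] * i if pad else []
        chunks.append(padding + blocks[i : i + chunksize])
    return chunks


def names(chunks):
    """Map 'blocks.<chunk>.<position>' to the real block stored there, skipping padding."""
    return {
        f"blocks.{c}.{j}": block
        for c, chunk_ in enumerate(chunks)
        for j, block in enumerate(chunk_)
        if block != "Identity"
    }


unpadded = names(chunk(blocks_list, pad=False))
padded = names(chunk(blocks_list, pad=True))

print(unpadded["blocks.1.0"])  # block8: the position inside the chunk restarts at 0
print(padded["blocks.1.8"])    # block8: the position matches the original block index
```

In the padded layout the last component of every name equals the original block index, which presumably makes it simpler to relate parameter names between chunked and unchunked models. Since Identity() has no parameters, the padding should add nothing to the state dict.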