facebookresearch / deit

Official DeiT repository
Apache License 2.0
4.03k stars 554 forks

Questions about PatchConvNet's stem #151

Closed gau-nernst closed 2 years ago

gau-nernst commented 2 years ago

First off, thank you for open-sourcing the amazing work.

I have some questions regarding PatchConvNet's stem:

  1. The code shows that there is no GELU activation after the last conv layer. Is this intentional? If so, Figure 3 in the paper needs an update, since it includes that last GELU activation.
  2. There is no bias in the convolution layers of the stem. Again, is this intentional? I don't think it will matter much, but I just want to confirm.

Thank you!

TouvronHugo commented 2 years ago

Hi @gau-nernst,

Thanks for your questions.

  1. The code of PatchConvNet is correct. There is indeed a typo in Figure 3; we will correct it in the next update of the paper.
  2. Yes, it's intentional. In this paper we use the same ConvStem as in our papers LeViT and XCiT.

Best,
Hugo
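For readers following the thread, here is a minimal PyTorch sketch of a stem with the two properties confirmed above: no bias in the convolutions, and no GELU after the last convolution. It is an illustration, not the repository's code. The names `conv3x3` and `make_stem`, the use of four 3x3 stride-2 convolutions, and the channel widths are all assumptions made for this example.

```python
import torch
from torch import nn


def conv3x3(in_ch: int, out_ch: int, stride: int = 2) -> nn.Conv2d:
    # bias=False: the stem convolutions carry no bias term (point 2 in the thread).
    return nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=stride, padding=1, bias=False)


def make_stem(in_chans: int = 3, embed_dim: int = 768) -> nn.Sequential:
    # Assumed layout: four stride-2 convs, widths embed_dim/8 -> /4 -> /2 -> embed_dim.
    widths = [in_chans, embed_dim // 8, embed_dim // 4, embed_dim // 2, embed_dim]
    layers = []
    for i in range(4):
        layers.append(conv3x3(widths[i], widths[i + 1]))
        if i < 3:
            # GELU follows every conv except the last one (point 1 in the thread).
            layers.append(nn.GELU())
    return nn.Sequential(*layers)


stem = make_stem()
x = torch.randn(1, 3, 224, 224)
out = stem(x)  # four stride-2 convs reduce 224x224 to 14x14
```

Under these assumptions the stem ends on a convolution, contains three GELU layers for its four convolutions, and maps a 224x224 image to a 14x14 grid of `embed_dim`-channel features.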