Question about the norm layer

youngwanLEE / MPViT

[CVPR 2022] MPViT:Multi-Path Vision Transformer for Dense Prediction

https://arxiv.org/abs/2112.11010

Other

364 stars 40 forks

Question about the norm layer #8

Closed zhanggang001 closed 2 years ago

zhanggang001 commented 2 years ago

Hi, @youngwanLEE. Great work, and the performance on the downstream tasks is promising.

I noticed that you use BN in the patch embedding layers and conv_stem, and layer norm in MHCABlock. When transferring the pre-trained model to the downstream tasks, you replace the BN layers with SyncBN.

So I wonder how much improvement replacing BN with SyncBN brings, for COCO detection and ADE20K semantic segmentation respectively? Thanks in advance.

youngwanLEE commented 2 years ago

@zhanggang001 Thanks for your interest!

I'm sorry, but it's been a long time and I don't remember why I used SyncBN.

At that time, I just followed other methods for the downstream tasks.

zhanggang001 commented 2 years ago

But the other methods do not use SyncBN for the downstream tasks; they keep the same normalization layers as the pre-trained models. Anyway, thanks for your reply.

youngwanLEE commented 2 years ago

@zhanggang001 I'll let you know when I find the comparison.
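
For readers following the BN vs. SyncBN question in this thread, here is a minimal sketch of the difference in pure Python. It is not MPViT code. It only illustrates the general idea that plain BN normalizes with statistics from each device's local batch, while SyncBN combines statistics across devices. The activation values, the device count, and the helper names `batch_stats` and `synced_stats` are hypothetical, and real implementations may exchange different quantities to reach the same result.

```python
from statistics import fmean


def batch_stats(values):
    """Mean and biased variance of one device's local batch (plain BN)."""
    mean = fmean(values)
    var = fmean([(v - mean) ** 2 for v in values])
    return mean, var


def synced_stats(per_device_batches):
    """One shared mean/variance over all devices (the idea behind SyncBN).

    Assumption: we combine count, sum, and sum of squares; this is one
    simple way to get global statistics, not a specific library's method.
    """
    n = sum(len(b) for b in per_device_batches)
    total = sum(sum(b) for b in per_device_batches)
    total_sq = sum(sum(v * v for v in b) for b in per_device_batches)
    mean = total / n
    var = total_sq / n - mean ** 2
    return mean, var


# Hypothetical activations for one channel: 2 samples on each of 4 devices.
per_device = [[0.0, 2.0], [4.0, 6.0], [1.0, 3.0], [5.0, 7.0]]

local = [batch_stats(b) for b in per_device]
shared = synced_stats(per_device)

print(local)   # local means are 1.0, 5.0, 2.0, 6.0, each with variance 1.0
print(shared)  # shared mean 3.5, shared variance 5.25
```

With only two samples per device, the local means vary widely (1.0 to 6.0) while the synchronized statistics describe the whole batch. In PyTorch, the usual way to make this swap is `torch.nn.SyncBatchNorm.convert_sync_batchnorm(model)`; whether MPViT uses that call, and how much accuracy the swap contributes, is not stated in this thread.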