Closed — zhanggang001 closed this issue 2 years ago
@zhanggang001 Thanks for your interest!
I'm sorry, but I don't remember the reason why I used syncBN because it's been a long time.
At that time, I just followed other methods for the downstream tasks.
But the other methods do not use SyncBN for the downstream tasks; they use the same normalization layers as the pre-trained models. Anyway, thanks for your reply.
@zhanggang001 I'll let you know when I find the comparison.
Hi, @youngwanLEE. Great work, and the performance on the downstream tasks is promising.
I notice that you use BN in the patch-embedding layers and conv_stem, and layer norm in MHCABlock. When transferring the pre-trained model to the downstream tasks, you replace the BN layers with SyncBN.
So I wonder: how much improvement does replacing BN with SyncBN bring, on COCO detection and ADE20K semantic segmentation respectively? Thanks in advance.
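For context, the replacement described above is usually done with PyTorch's built-in converter rather than by editing the model definition. A minimal sketch, assuming a toy conv stem with `BatchNorm2d` stands in for the pre-trained backbone (the module names here are illustrative, not from the actual repo):

```python
import torch.nn as nn

# Hypothetical stand-in for the pre-trained conv_stem / patch-embedding
# layers, which use plain BatchNorm during pre-training.
stem = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1),
    nn.BatchNorm2d(16),
    nn.ReLU(inplace=True),
)

# For detection/segmentation fine-tuning, each GPU often holds only 1-2
# images, so per-GPU BN statistics get noisy. SyncBatchNorm pools the
# batch statistics across all GPUs in the process group instead.
stem_sync = nn.SyncBatchNorm.convert_sync_batchnorm(stem)

# convert_sync_batchnorm recursively swaps every BatchNorm*d for a
# SyncBatchNorm module, copying weights, biases, and running stats.
print(type(stem_sync[1]).__name__)  # SyncBatchNorm
```

Because the conversion copies the pre-trained affine parameters and running statistics, the fine-tuned model starts from exactly the same normalization state; only how batch statistics are computed during training changes.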