ChengyueGongR / PatchVisionTransformer


Difference of swin #4

Open jinseok-karl opened 3 years ago

jinseok-karl commented 3 years ago

Hi, thanks for sharing the code! I'd like to try it with mmsegmentation, but I can't find which part differs from the original Swin. In short, I don't know where the diversity part is.

Sincerely

ChengyueGongR commented 3 years ago

Hi, the only difference is the ImageNet-pretrained part. For segmentation, we only change the pretrained checkpoint; we do not apply our loss. The reason is that semantic segmentation itself already provides dense local labels. We have uploaded some checkpoints for segmentation. Training Swin on ImageNet uses an implementation similar to DeiT's, and we will upload the code.

Yours, Chengyue
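
For readers trying this with mmsegmentation, a minimal sketch of what "only change the pretrained checkpoint" could look like follows. It assumes an mmsegmentation-style config in which a top-level `pretrained` field names the backbone checkpoint; the field layout differs between mmsegmentation versions, and the checkpoint path is a placeholder, not a file name from this repository.

```python
# Hypothetical mmsegmentation-style config fragment (layout is an assumption).
# Everything except the checkpoint path stays as in the original Swin config.
model = dict(
    # Placeholder path: point this at one of the uploaded segmentation checkpoints.
    pretrained="path/to/imagenet_pretrained_swin.pth",
    # Backbone definition is unchanged from the original Swin setup.
    backbone=dict(type="SwinTransformer"),
)
```

Under this assumption no model or loss code needs to be touched for segmentation; only the weights used to initialize the backbone differ.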

zz7379 commented 3 years ago

Could you explain "The reason is that, semantic segmentation itself has already provided dense local labels."? I am confused.

ChengyueGongR commented 3 years ago

Hi, one of our motivations is to provide local labels for each token. For segmentation, the labels are already very dense (one per pixel), so we do not add our regularization.
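
To make the "dense local labels" point concrete, here is a small illustrative sketch. It is not the authors' code or loss: `patch_labels` is a hypothetical helper that shows how a per-pixel segmentation mask already implies a label for every patch (token), here taken by majority vote, whereas image classification supplies a single label for the whole image.

```python
from collections import Counter


def patch_labels(mask, patch):
    """Return one majority-vote label per patch x patch tile of a 2D label map.

    Illustrative helper only; the name and the majority-vote rule are assumptions.
    """
    h, w = len(mask), len(mask[0])
    out = []
    for top in range(0, h, patch):
        row = []
        for left in range(0, w, patch):
            # Collect all pixel labels that fall inside this tile.
            tile = [
                mask[y][x]
                for y in range(top, min(top + patch, h))
                for x in range(left, min(left + patch, w))
            ]
            row.append(Counter(tile).most_common(1)[0][0])
        out.append(row)
    return out


# Toy 4x4 segmentation mask with two classes (0 = background, 1 = object).
mask = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 1, 1],
]

# With 2x2 patches, every one of the 4 tokens receives its own label.
tokens = patch_labels(mask, patch=2)
```

In this toy case the mask yields four token-level labels, while a classification dataset would give only one label for the same image, which is the gap the regularization is meant to fill.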