microsoft / Swin-Transformer

This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".
https://arxiv.org/abs/2103.14030
MIT License

About ImageNet-21K Pretrain #93

Open SJLeo opened 3 years ago

SJLeo commented 3 years ago

The original ImageNet-21K annotations are single-label. How can I obtain multi-label information for each image in ImageNet-22K?

ancientmooner commented 2 years ago

We use single-label for pre-training.

I have read a paper that converts the original ImageNet-21K labels to multi-label ones, but I cannot remember its title. I would greatly appreciate it if somebody could share the paper in this thread.

fffffgggg54 commented 4 months ago

> I have read a paper that converts the original ImageNet-21K labels to multi-label ones, but I cannot remember its title. I would greatly appreciate it if somebody could share the paper in this thread.

I believe this is the paper: ImageNet-21K Pretraining for the Masses.
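
For illustration only: one common way to derive multi-label targets from single-label ImageNet-21K annotations is to add each class's WordNet ancestors (hypernyms) to its label set, and as far as I recall the paper above builds its semantic labels along these lines. The snippet below is a minimal sketch of that idea, not the paper's or this repository's code. The toy `PARENT` map and the helper names `expand_label` and `multi_hot` are hypothetical; a real pipeline would build the child-to-parent map from the WordNet synset IDs of the dataset.

```python
from typing import Dict, List, Optional

# Hypothetical toy hierarchy (child -> parent). Real ImageNet-21K classes
# are WordNet synsets, and the map would be derived from WordNet itself.
PARENT: Dict[str, Optional[str]] = {
    "animal": None,
    "dog": "animal",
    "husky": "dog",
    "cat": "animal",
}


def expand_label(label: str, parent: Dict[str, Optional[str]]) -> List[str]:
    """Return the label followed by all of its ancestors, leaf first."""
    labels: List[str] = []
    node: Optional[str] = label
    while node is not None:
        labels.append(node)
        node = parent[node]  # walk one step up the hierarchy
    return labels


def multi_hot(label: str, parent: Dict[str, Optional[str]], classes: List[str]) -> List[int]:
    """Encode the expanded label set as a 0/1 vector over `classes`."""
    active = set(expand_label(label, parent))
    return [1 if c in active else 0 for c in classes]


CLASSES = sorted(PARENT)  # ['animal', 'cat', 'dog', 'husky']

print(expand_label("husky", PARENT))        # ['husky', 'dog', 'animal']
print(multi_hot("husky", PARENT, CLASSES))  # [1, 0, 1, 1]
```

A multi-hot target like this would typically be trained with a per-class sigmoid loss rather than a single softmax over all classes, since several classes are now correct for the same image.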