google-research / big_transfer

Official repository for the "Big Transfer (BiT): General Visual Representation Learning" paper.
https://arxiv.org/abs/1912.11370
Apache License 2.0

Request for more detail for pre-training on ImageNet-21k #26

Closed. junsukchoe closed this issue 4 years ago

junsukchoe commented 4 years ago

Hi there!

I'd like to know the details of how you train the network on ImageNet-21k. I carefully checked the paper and this repository and found a lot of detail, but I cannot find which loss function is used for ImageNet-21k. In addition, I wonder how you process the multi-label annotations for training. It would also be great if you could explain how you handle the imbalanced class distribution of ImageNet-21k. Could you tell me a bit more about these?

Thanks,

kracwarlock commented 4 years ago

Also can you provide details on what data pre-processing was used for pre-training? @lucasb-eyer

lucasb-eyer commented 4 years ago

Sure thing.

  1. The loss function is sigmoid cross-entropy, i.e. a binary classifier for each label.
  2. I'm not sure what you mean by "process the labels for training".
  3. No special handling of imbalanced data.
  4. Pre-processing for pre-training was the standard "inception crop" resized to 224x224, a random left/right flip, and rescaling pixel values to [-1, 1].
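
For concreteness, here is a minimal PyTorch sketch of the recipe described above (an independent binary classifier per label via sigmoid cross-entropy, plus the stated pre-processing). The use of torchvision, the exact crop parameters, and the `NUM_CLASSES` value are assumptions for illustration, not necessarily the exact pipeline used for the actual pre-training:

```python
# Minimal sketch (not the repo's actual input pipeline): multi-hot targets with
# a per-label sigmoid cross-entropy loss, and inception-style cropping.
import torch
import torch.nn as nn
import torchvision.transforms as T

NUM_CLASSES = 21843  # approximate ImageNet-21k label count; depends on the release

# Pre-processing for pre-training: inception-style crop resized to 224x224,
# random left/right flip, pixel values rescaled to [-1, 1].
train_transform = T.Compose([
    T.RandomResizedCrop(224),                    # "inception crop"
    T.RandomHorizontalFlip(),
    T.ToTensor(),                                # uint8 [0, 255] -> float [0, 1]
    T.Normalize(mean=[0.5] * 3, std=[0.5] * 3),  # [0, 1] -> [-1, 1]
])

# Sigmoid cross-entropy, i.e. one binary classifier per label.
criterion = nn.BCEWithLogitsLoss()

def multi_hot(label_indices, num_classes=NUM_CLASSES):
    """Turn the list of label indices for one image into a multi-hot target."""
    target = torch.zeros(num_classes)
    target[label_indices] = 1.0
    return target

# Example training step, assuming `model` outputs raw logits of shape [B, NUM_CLASSES]:
#   logits = model(images)                # images: [B, 3, 224, 224]
#   loss = criterion(logits, targets)     # targets: multi-hot, [B, NUM_CLASSES]
```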

hope this helps!

junsukchoe commented 4 years ago

Thanks a lot!

kracwarlock commented 4 years ago

@lucasb-eyer Can you clarify what the standard inception crop is? Is it used for both training and evaluation?

kracwarlock commented 4 years ago

Also, how did you create the train, validation, and test splits?