google-research / vision_transformer

Apache License 2.0
10.22k stars 1.27k forks

Will models pretrained on the JFT300M dataset be released? #9

Open jiahaolu97 opened 3 years ago

jiahaolu97 commented 3 years ago

Hi, thank you for sharing the code and models! I'm wondering if you have any plans to release models pretrained on the JFT-300M dataset. Many thanks!

andsteing commented 3 years ago

Unfortunately, this is an internal dataset and we are not allowed to release these weights. The ImageNet-21k models were released in anticipation of this, and we are actively looking at public datasets larger than ImageNet-21k so that we can release even better pre-trained models.

Happy to hear recommendations for large, public datasets we could try!

chris-ha458 commented 3 years ago

@andsteing what are your thoughts on the Open Images Dataset V6, with or without extensions? The dataset does seem to be geared more towards image segmentation tasks, but then again it is a Google project, so maybe some cooperation could be arranged in-house.

Considering its permissive license and large size, I think it might be a good option.

akolesnikoff commented 3 years ago

@VRandme this is a good suggestion and we may try pretraining on this dataset in the future, though it is not on our TODO list at the moment.

Note that it is smaller than the ImageNet-21k dataset and may not yield significantly better results than the already released models.

chaoyanghe commented 3 years ago

In your paper, JFT300M is described as JFT (Sun et al., 2017) with 18k classes and 303M high-resolution images. Isn't this an open dataset? If so, could you share the weights with us?