valeoai / obow


Which section of the ImageNet data should be used? #3

Closed oym1994 closed 3 years ago

oym1994 commented 3 years ago

Hi, I am a new research student in DL. Could you tell me which section of ImageNet I should download? Thank you!

Download ImageNet Data

March 11, 2021: Face-blurred ILSVRC 2012–2017 classification data is released. We strongly urge researchers to use this new privacy-aware version.

October 10, 2019: The ILSVRC 2012 classification and localization test set has been updated.

You have been granted access to the whole ImageNet database through our site. By doing so you agree to the terms of access.

Winter 2021 release
ImageNet21K MD5: ab313ce03179fd803a401b02c651c0a2
Processed version of ImageNet21K using the script of "ImageNet-21K pretraining for the masses"
ImageNet10K from Deng et al. ECCV2010
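Since the download page lists an MD5 for the ImageNet21K archive, a downloaded file can be checked against it before unpacking. The following is a minimal sketch, not part of the OBoW repository: the helper names `md5_of_file` and `matches_expected` and the archive path are hypothetical, and only the checksum value comes from the page quoted above.

```python
import hashlib

# MD5 shown on the download page next to the Winter 2021 ImageNet21K entry.
EXPECTED_MD5 = "ab313ce03179fd803a401b02c651c0a2"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks
    so that a very large archive never has to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """True if the file's MD5 equals the expected digest (case-insensitive)."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    import sys

    # Hypothetical usage: pass the path of the downloaded archive.
    archive = sys.argv[1] if len(sys.argv) > 1 else "path/to/imagenet21k_archive.tar.gz"
    print("checksum OK" if matches_expected(archive) else "checksum MISMATCH")
```

A mismatch usually means an incomplete or corrupted download, so the archive should be fetched again before extraction.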

People subtree annotations (FAT 2020).
Description and details
Unsafe synsets
Imageability annotations
Due to sensitivity of the data, the demographic annotations are not available here. Please contact us at imagenet.help.desk@gmail.com to request access.

ImageNet Large-scale Visual Recognition Challenge (ILSVRC) 2017 2016 2015 2014 2013 2012 2011 2010 ILSVRC 2012–2017 evaluation server

Face-blurred ILSVRC2012–2017 classification data is now available (below). We strongly encourage researchers to use this new privacy-aware version for all purposes.

Face obfuscation in ILSVRC Description and details Face annotations Blurred training images Blurred validation images

Object bounding boxes (AAAI 2010). Description and details Download all available

Object attributes (ECCV workshop 2010). Description, details, and download

Download image data for Visual Domain Decathlon (PASCAL in Detail Workshop Challenge)
Description and details
Decathlon data, 6.1 GB

Download downsampled image data (32x32, 64x64)
Description and details
Train(8x8), npz format, 227 MB
Val(8x8), npz format, 9 MB
Train(16x16), npz format, 888 MB
Val(16x16), npz format, 34 MB
Train(32x32), npz format, 3 GB
Val(32x32), npz format, 134 MB
Train(64x64) part1, npz format, 6 GB
Train(64x64) part2, npz format, 6 GB
Val(64x64), npz format, 509 MB

Tiny Imagenet (Stanford CS231N)
Description and details
Tiny, 236 MB

abursuc commented 3 years ago

Hi @oym1994!

Thank you for your interest in our work. I think your question is not specifically related to this repo. For OBoW we used the original ImageNet dataset for pre-training (we submitted the paper to CVPR one year ago).

The updated dataset is more privacy-aware, but I would not expect major differences in the final performance of models trained with self-supervision. You can use the face-blurred version for your experiments.

Best, Andrei