facebookresearch / Detic

Code release for "Detecting Twenty-thousand Classes using Image-level Supervision".
Apache License 2.0
1.88k stars, 210 forks

How to use Imagenet full data? #48

Open: JungmoKoo opened this issue 2 years ago

JungmoKoo commented 2 years ago

Thank you for sharing your awesome project. I'm trying to train on custom data with ImageNet. "tools/get_imagenet_21k_full_tar_json.py" uses "metadata-22k" data, but I cannot find this data in ImageNet-21K (I'm using the ImageNet-21K winter release). How can I train with all 21K classes?

anshudaur commented 2 years ago

Hi @JungmoKoo

You need to create the full .npy files with the preprocess_imagenet22k.py script to train with ImageNet. However, the preprocessing script expects tarlog files, which I am also unable to find or generate.

I expected to generate the log files with the script suggested in the preprocess_imagenet22k code, which points to https://github.com/Alibaba-MIIL/ImageNet21K/blob/main/dataset_preprocessing/processing_script.sh

But processing_script.sh just untars all files in parallel and still doesn't generate any tarlogs.
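The thread never establishes what format the tarlog files should have. If they are simply per-archive listings of member names, byte offsets, and sizes (an assumption, not something confirmed by the Detic code), such a listing can be produced without untarring anything, using Python's standard tarfile module. The function name `list_tar_members` is hypothetical.

```python
import tarfile


def list_tar_members(tar_path):
    """Return (name, data_offset, size) for every regular file in a tar archive.

    data_offset is the byte position of the file's contents inside the archive.
    It is only meaningful for reading raw bytes when the tar is uncompressed.
    """
    entries = []
    with tarfile.open(tar_path, "r:*") as tar:
        for member in tar:  # iterates headers in archive order
            if member.isfile():
                entries.append((member.name, member.offset_data, member.size))
    return entries
```

With an uncompressed archive, each image can then be read directly by seeking to `data_offset` and reading `size` bytes. Whether this matches what preprocess_imagenet22k.py expects would still need to be checked against that script.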

@imisra and @xingyizhou How can we access or generate the tarlogs?

Thank you for your help.

Best, Anshu

ma-xu commented 2 years ago

@JungmoKoo @anshudaur Have you solved this problem? I am trying to reproduce this project (training and inference work great), but I encountered the same issues.

anshudaur commented 2 years ago

@ma-xu I still don't know what the tar log files are...

But I was able to train my model on ImageNet+LVIS, which was much easier to prepare if you have the untarred version of ImageNet21K. The scripts provided in 'tools/' just create folders for the common classes/labels and keep the images in their respective folders, and then you are good to go!

Hope this helps.

doem97 commented 1 year ago

@xingyizhou Hi Dr. Xingyi Zhou, could you please point out how to create the "imagenet/metadata-22k/" folder and the "tar_files.npy"/"tarindex_npy" files in it? I ran into the same problem when reproducing the cross-dataset results. Many thanks!

jxhuang0508 commented 1 year ago

@anshudaur I am also trying to train Detic on the full ImageNet-21K using the untarred version. May I ask how you obtained "imagenet-22k_image_info_lvis-22k.json"? This should be a JSON file with category information (e.g., image count) and image information (e.g., file path, image id, category id).
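For illustration, a file of the shape described above could be assembled from an untarred, one-folder-per-synset layout roughly as follows. This is only a sketch: the function name `build_image_info` and the field names (`synset`, `image_count`, `file_name`, `category_id`) are hypothetical and are not confirmed to match what Detic's tools or configs expect, and fields such as image width and height are omitted.

```python
import json
import os


def build_image_info(root, out_path, extensions=(".jpeg", ".jpg", ".png")):
    """Scan root/<synset>/<image> and write a JSON file with category and image records.

    All field names are assumptions chosen for illustration.
    """
    synsets = sorted(
        d for d in os.listdir(root) if os.path.isdir(os.path.join(root, d))
    )
    categories, images = [], []
    for cat_id, synset in enumerate(synsets, start=1):
        files = sorted(
            f for f in os.listdir(os.path.join(root, synset))
            if f.lower().endswith(extensions)
        )
        # One category record per synset folder, with its image count.
        categories.append({"id": cat_id, "synset": synset, "image_count": len(files)})
        for name in files:
            # One image record per file; ids are assigned in scan order.
            images.append({
                "id": len(images) + 1,
                "file_name": f"{synset}/{name}",
                "category_id": cat_id,
            })
    info = {"categories": categories, "images": images}
    with open(out_path, "w") as f:
        json.dump(info, f)
    return info
```

The output would then need to be adapted to whatever schema the dataset registration code actually reads.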

Many thanks in advance!

npzl commented 1 year ago

@anshudaur @ma-xu @JungmoKoo Have you solved this problem? I have run into the same problem. Many thanks!

npzl commented 1 year ago

@xingyizhou Sorry to bother you. Could you share your method for generating the tarlog files? Many thanks.