Alibaba-MIIL / ML_Decoder

Official PyTorch implementation of "ML-Decoder: Scalable and Versatile Classification Head" (2021)
MIT License

Question about reproducing on the Open Images dataset #41

Closed: yankuai closed this 2 years ago

yankuai commented 2 years ago

Thank you for your great work! I appreciate your groundbreaking research on multi-label classification on OpenImages-v6.

I had difficulty reproducing your 86.6 mAP on OpenImages. I used get_datasets_from_csv to build the train/val datasets and passed in a JSON file with the full 9605 classes as the train/test classes. The training parameters are set according to the paper, but the resulting mAP is 33, as reported by the validate_multi function in train.py.

I also validated your pretrained model on OpenImages (https://miil-public-eu.oss-eu-central-1.aliyuncs.com/model-zoo/ML_Decoder/tresnet_m_open_images_200_groups_86_8.pth) with validate.py; the mAP is 28, while I expected it to be 86.8.

I load the state_dict with model.load_state_dict(state['model'], strict=True), so I think the model is correctly loaded from the state_dict. Do you know why the mAP is so low? Thank you!
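
For reference, this is roughly how I load and inspect the checkpoint before running validate.py (a minimal sketch; the model construction itself is omitted because it mirrors validate.py, and the name filter is just a heuristic for diagnostics):

```python
import torch

# Load the downloaded checkpoint on CPU first, just to inspect it.
state = torch.load('tresnet_m_open_images_200_groups_86_8.pth', map_location='cpu')

# Print head/decoder parameter shapes to confirm the checkpoint really has
# 9605 output classes and the expected number of group queries.
for name, tensor in state['model'].items():
    if any(k in name.lower() for k in ('head', 'decoder', 'query', 'embed')):
        print(name, tuple(tensor.shape))

# The model itself is built the same way validate.py builds it; then:
# model.load_state_dict(state['model'], strict=True)
```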

Leterax commented 2 years ago

Could you maybe share your full training code? Thanks!

giladsharir commented 2 years ago

Hi, can you share the full command you used for training (what arguments did you use)?

yankuai commented 2 years ago

Thank you for your reply. My full command is as follows:

python train_openimages.py --data path/to/openimages --lr 1e-4 --model-name tresnet_m --model-path path/to/tresnet_m_miil_21k.pth --json-path path/to/openimages_class_9605.json --num-classes 9605 --workers 8 --image-size 224 --batch-size 56 --use-ml-decoder 1 --num-of-groups 100 --decoder-embedding 768 --zsl 0

giladsharir commented 2 years ago

https://github.com/Alibaba-MIIL/ML_Decoder/blob/main/MODEL_ZOO.md#open-images-inference-code

Please make sure the arguments you use for training match the ones in the README. One thing I noticed is that you used num-of-groups=100, which should be 200.
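
For the pretrained 86.8 checkpoint specifically, num-of-groups must match what the checkpoint was trained with: with strict=True, PyTorch will refuse to load a group-query tensor whose first dimension (the number of groups) differs. A self-contained toy illustration of that failure mode (not the actual ML-Decoder module):

```python
import torch.nn as nn

# Toy stand-in for the ML-Decoder group queries: an embedding with one row per group.
decoder_100_groups = nn.Embedding(num_embeddings=100, embedding_dim=768)
decoder_200_groups = nn.Embedding(num_embeddings=200, embedding_dim=768)

try:
    # Loading 200-group weights into a 100-group module fails with strict=True,
    # which is why --num-of-groups must match the checkpoint being evaluated.
    decoder_100_groups.load_state_dict(decoder_200_groups.state_dict(), strict=True)
except RuntimeError as err:
    print(err)  # size mismatch for weight: shape [200, 768] vs [100, 768]
```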

yankuai commented 2 years ago

When I trained on OpenImages from scratch, I followed the --num-of-groups=100 setting from Appendix H of the paper:

For ML-Decoder, our baseline was group-decoding with K = 100.

So I expected to reproduce the reported 86.8 mAP in Table 12 (I think the num-of-groups for Table 12 is 100).

When I evaluated your pretrained 86.8 model, I did use num-of-groups=200, and the mAP was still not correct.

So I wonder if the evaluation function for COCO and OpenImages is different. If so, could you please share your validation script for OpenImages? Thank you for your help.

giladsharir commented 2 years ago

You can refer to this link for details of the Open Images dataset: https://github.com/Alibaba-MIIL/PartialLabelingCSL/blob/main/OpenImages.md

This issue also provides relevant info: https://github.com/Alibaba-MIIL/ML_Decoder/issues/12

The evaluation/training code for COCO and Open Images is the same except for the data loader.
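
For reference, the multi-label mAP itself is dataset-agnostic: it is simply the mean of per-class average precision over whatever classes the data loader provides. A rough sketch of the metric using sklearn rather than the repo's own helper:

```python
import numpy as np
from sklearn.metrics import average_precision_score

def multilabel_map(targets: np.ndarray, scores: np.ndarray) -> float:
    """Mean of per-class average precision.

    targets: (num_samples, num_classes) binary ground-truth matrix
    scores:  (num_samples, num_classes) predicted scores or logits
    """
    aps = []
    for c in range(targets.shape[1]):
        # Skip classes with no positives in the evaluation set;
        # AP is undefined for such a column.
        if targets[:, c].sum() == 0:
            continue
        aps.append(average_precision_score(targets[:, c], scores[:, c]))
    return float(np.mean(aps))
```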

yankuai commented 2 years ago

I got it. Thanks again!