Jingkang50 / OpenOOD

Benchmarking Generalized Out-of-Distribution Detection
MIT License

CombOOD results for OpenOOD Leaderboard #235

Closed rmagesh148 closed 1 month ago

rmagesh148 commented 6 months ago

Hi!

We would like to submit our CombOOD results to the OpenOOD Leaderboard.

Here's the link to my paper (https://epubs.siam.org/doi/abs/10.1137/1.9781611978032.74),

and the GitHub repository (https://github.com/rmagesh148/combood/tree/main).

Please let me know if you want me to open a PR for the same. Thank you!

Jingkang50 commented 6 months ago

@zjysteven Hi Jingyang, Please help check this request. If you need some csv files (with easy-paste values) from the authors, we can leave the comments here.

zjysteven commented 6 months ago

Hi @rmagesh148 thank you for reaching out.

For the leaderboard, would you mind sharing CSV files containing your results with us directly, for easy integration? You can find examples here: https://github.com/Jingkang50/OpenOOD/issues/232, https://github.com/Jingkang50/OpenOOD/pull/207#issuecomment-1826208735, https://github.com/Jingkang50/OpenOOD/pull/193#issue-1952638480.

Also feel free to open a PR if you want to integrate your method into OpenOOD.

rmagesh148 commented 6 months ago

Hi @zjysteven : Please check this link for the results. https://drive.google.com/drive/folders/1Ec0Iz7-i87RS36uHar6j7otDpKBr113I?usp=drive_link , let me know if you have any questions. Thanks!

zjysteven commented 6 months ago

Hi @rmagesh148, it seems that the results were obtained on the v1.0 benchmark, while the leaderboard assumes that all methods are compared on the updated v1.5 benchmark. For example, for ImageNet we no longer use MNIST as an OOD dataset (see more changes at https://github.com/Jingkang50/OpenOOD/wiki/OpenOOD-v1.5-change-log).

Is it possible to rerun your method on the v1.5 benchmark? I imagine this wouldn't require much extra effort (since CombOOD is post-hoc, right?), and you can find an example of how to do this easily in this colab notebook.

rmagesh148 commented 6 months ago

@zjysteven: Okay, sure! I will rerun for ImageNet and let you know the results quickly, but can you please update the results for the other datasets?

zjysteven commented 6 months ago

There are also some slight changes to the CIFAR benchmarks in v1.5, so it would be best to rerun on CIFAR10/100 as well :)

rmagesh148 commented 6 months ago

okay, will be back with the results soon! Thank you for the prompt response.

MdSaifulIslamSajol commented 5 months ago

Hello, I am Saiful, co-author of the CombOOD paper.

1) I am trying to download the ImageNet200 dataset using the script shown in the colab (https://colab.research.google.com/drive/1tvTpCM1_ju82Yygu40fy7Lc0L1YrlkQF?usp=sharing). However, I couldn't find the key for downloading ImageNet200 in this file (https://github.com/Jingkang50/OpenOOD/blob/main/scripts/download/download.py). Can you please help me download the dataset?

2) Also, we are not sure which torchvision.transforms.Normalize() values are used in OpenOOD v1.5 for each dataset. Can you please provide us with the values used during training?

zjysteven commented 5 months ago

Hi Saiful @MdSaifulIslamSajol,

  1. ImageNet200 is subsampled from ImageNet1K (just 200 of its classes), so there is no separate "ImageNet200" dataset to download. You only need to have ImageNet1K ready.

  2. The preprocessing is defined here: https://github.com/Jingkang50/OpenOOD/blob/main/openood/preprocessors/test_preprocessor.py, and the normalization statistics are here: https://github.com/Jingkang50/OpenOOD/blob/main/openood/preprocessors/transform.py. For ImageNet-1K specifically, you can see from our example eval script that the associated preprocessing from torchvision is used: https://github.com/Jingkang50/OpenOOD/blob/18c6f5174a2f518e2a8e819ffb1cd1914bcf12e0/scripts/eval_ood_imagenet.py#L76-L99
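
To illustrate point 1, here is a minimal sketch of what subsampling an ImageNet1K image list down to a subset of classes could look like. It is not OpenOOD's actual code: the function name `subsample_imglist`, the `path label` line format, the class folder being the second-to-last path component, and the relabeling by sorted WordNet ID are all assumptions made for illustration.

```python
def subsample_imglist(lines, keep_wnids):
    """Keep entries whose class folder is in keep_wnids and relabel them 0..K-1.

    Assumptions (hypothetical, for illustration only):
      * each line looks like "<relative/path/to/image> <label>"
      * the class (WordNet ID) is the second-to-last path component
      * new labels follow the sorted order of the kept WordNet IDs
    """
    label_of = {wnid: i for i, wnid in enumerate(sorted(keep_wnids))}
    kept = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        path = line.rsplit(" ", 1)[0]   # drop the original 1K label
        wnid = path.split("/")[-2]      # class folder name
        if wnid in label_of:
            kept.append(f"{path} {label_of[wnid]}")
    return kept


if __name__ == "__main__":
    demo = [
        "imagenet_1k/train/n01443537/a.JPEG 1",
        "imagenet_1k/train/n01440764/b.JPEG 0",
        "imagenet_1k/train/n01530575/c.JPEG 11",
    ]
    for entry in subsample_imglist(demo, {"n01443537", "n01530575"}):
        print(entry)
```

A released checkpoint expects whatever label order was used when it was trained, so for evaluation the official image lists should take precedence over any self-generated relabeling like the one above.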

MdSaifulIslamSajol commented 4 months ago

Hello, we extracted the 200 classes from the ImageNet-R dataset (https://github.com/hendrycks/imagenet-r), then used only those 200 classes to select the ImageNet-200 images, like you did here.

However, we see a discrepancy when checking the test accuracy with your provided checkpoint: the test accuracy is very low.

We would like to verify that our version of the train_imagenet200.txt file matches yours. Can you please provide us with your train_imagenet200.txt (https://github.com/Jingkang50/OpenOOD/blob/ac5cf1aa51c30a6e89db10c30f4fc2ef94f97f5d/configs/datasets/imagenet200/imagenet200.yml#L19), as well as the val_imagenet200.txt and test_imagenet200.txt files?

zjysteven commented 4 months ago

Hi @MdSaifulIslamSajol, if you download our benchmark_imglist from here https://drive.google.com/file/d/1XKzBdWCqg3vPoj-D32YixJyJJ0hL63gP/view?usp=drive_link and browse the imagenet200 folder, you will see all the txt files.
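
For checking a self-generated list against the reference txt files, a small comparison along these lines may help. This is only a sketch: the helper names `parse_imglist` and `diff_imglists` are hypothetical, and it assumes each line has the form `path label` with an integer label.

```python
def parse_imglist(text):
    """Parse "path label" lines into a {path: label} dict (assumed format)."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        path, label = line.rsplit(" ", 1)
        entries[path] = int(label)
    return entries


def diff_imglists(ours, reference):
    """Return (missing, extra, relabeled) paths of `ours` relative to `reference`."""
    ours_paths, ref_paths = set(ours), set(reference)
    missing = sorted(ref_paths - ours_paths)   # in reference only
    extra = sorted(ours_paths - ref_paths)     # in ours only
    relabeled = sorted(
        p for p in ours_paths & ref_paths if ours[p] != reference[p]
    )                                          # same image, different label
    return missing, extra, relabeled


if __name__ == "__main__":
    ours = parse_imglist("a/n1/x.JPEG 0\na/n2/y.JPEG 5\n")
    reference = parse_imglist("a/n1/x.JPEG 0\na/n2/y.JPEG 1\n")
    print(diff_imglists(ours, reference))
```

A non-empty `relabeled` list would point to a label-index mismatch, which is one possible cause of very low accuracy with a pre-trained checkpoint.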

rmagesh148 commented 1 month ago

Hi @zjysteven @Jingkang50 : Could you please find the results in the drive and let me know if you need anything else from us?

Thank you!

zjysteven commented 1 month ago

@rmagesh148 Thank you! Will integrate the results asap

rmagesh148 commented 1 month ago

Hi @zjysteven ! Could we please have this issue closed once integrated? Thanks!

zjysteven commented 1 month ago

Yes of course. Will close and keep you guys posted. I'm having a very tight schedule recently but will try to do it asap.

rmagesh148 commented 1 month ago

Thank you! @zjysteven

zjysteven commented 1 month ago

Hi @rmagesh148, I'm about to make the integration. A few questions for clarification:

  1. I assume CombOOD is a training-free, post-hoc method, right?
  2. For the CIFAR-10/100 (ResNet-18) and ImageNet-1K (ResNet-50) results, is our released pre-trained model checkpoint used, or is any fine-tuning specific to CombOOD applied? If so, what are some of the training details, so that I can include them in the notes of the entries?
  3. A ResNet-50 is used for ImageNet-200, while currently all other entries use ResNet-18. Please also provide the training details of this ResNet-50 (I guess it would be fine-tuning an ImageNet-1K pretrained checkpoint on ImageNet-200?).

rmagesh148 commented 1 month ago

  • I assume CombOOD is a training-free, post-hoc method, right? - YES
  • For the CIFAR-10/100 (ResNet-18) and ImageNet-1K (ResNet-50) results, is our released pre-trained model checkpoint used, or is any fine-tuning specific to CombOOD applied? If so, what are some of the training details, so that I can include them in the notes of the entries?
  • A ResNet-50 is used for ImageNet-200, while currently all other entries use ResNet-18. Please also provide the training details of this ResNet-50 (I guess it would be fine-tuning an ImageNet-1K pretrained checkpoint on ImageNet-200?).

This applies to both 2 and 3: we haven't fine-tuned any models. We used the models from the OpenOOD paper.

zjysteven commented 1 month ago

@rmagesh148 Hi, thanks for the confirmation. I have included the CombOOD entries in the leaderboard. The near-OOD performance on ImageNet is very impressive. Please take a look and let me know if there are any questions.