Reproducing evaluation results

Michiexb commented 3 years ago

Hi! I'm having trouble reproducing your evaluation results. I ran your eval bash file for all models using the checkpoint files downloaded from the link in the README, but the results differ greatly from the ones reported in your paper.

These are the accuracies I get:

beta_1:   I get 25.30 %, should be 67.30 %
beta_2:   I get 0.56 %, should be 71.73 %
beta_4:   I get 0.21 %, should be 73.69 %
beta_8:   I get 0.12 %, should be 74.59 %
beta_16:  I get 0.76 %, should be 75.54 %
beta_32:  I get 36.81 %, should be 76.18 %
beta_inf: I get 74.40 %, should be 76.27 %
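
To make the size of the discrepancy easier to see, the numbers above can be tabulated with a short script. This is only a sketch: the accuracies are copied from this comment, while the chance level assumes the standard 1000-class ImageNet task (an assumption, not something stated in this thread), and the "near chance" threshold of 10 times that level is an arbitrary heuristic.

```python
# Accuracies in percent, copied from the comment above: (obtained, reported in the paper).
results = {
    "beta_1": (25.30, 67.30),
    "beta_2": (0.56, 71.73),
    "beta_4": (0.21, 73.69),
    "beta_8": (0.12, 74.59),
    "beta_16": (0.76, 75.54),
    "beta_32": (36.81, 76.18),
    "beta_inf": (74.40, 76.27),
}

# Assumption: 1000 classes, so uniform random guessing yields 0.1 % accuracy.
CHANCE_LEVEL = 100.0 / 1000


def gap_table(results):
    """Return (name, obtained, expected, gap, near_chance) for every model."""
    rows = []
    for name, (got, expected) in results.items():
        gap = round(expected - got, 2)
        # Heuristic: anything below 10x chance (1 %) is flagged as near chance.
        near_chance = got < 10 * CHANCE_LEVEL
        rows.append((name, got, expected, gap, near_chance))
    return rows


for name, got, expected, gap, near_chance in gap_table(results):
    flag = "  <- near chance level" if near_chance else ""
    print(f"{name:<9} got {got:6.2f} %  expected {expected:6.2f} %  gap {gap:6.2f}{flag}")
```

Under these assumptions, beta_2 through beta_16 are flagged as close to random guessing, whereas beta_1, beta_32 and beta_inf are degraded but clearly above chance.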

Do you have any idea why this might be? I would love to use your INN, but I would need a better accuracy than what I am getting now.

ZY123-GOOD commented 2 years ago

Hello, did you find a way to solve this problem?

Michiexb commented 2 years ago

I didn't fully reproduce the results yet since my time is currently more important to me than the accuracy of the model, but I did train the model for 10 more epochs starting from the checkpoint file (by setting resume_checkpoint in the config to True), and that did give me 67.55% for the beta_8 model. So probably, the checkpoints that they shared are not the fully trained checkpoints from the paper.

ZY123-GOOD commented 2 years ago

Thank you for your reply. Best wishes.

RayDeeA commented 2 years ago

Hi, how are you evaluating the model?

Michiexb commented 2 years ago

The way it's described in the README.

RayDeeA commented 2 years ago

Ok, that is strange. I’ll clone and check.

ZY123-GOOD commented 2 years ago

Thank you! Please check it. I am looking forward to using your INN in our new work.

ZY123-GOOD commented 2 years ago

Today I also got 25.30 % for beta_1 (it should be 67.30 %). Could you please look into it when you have time?

jenellefeather commented 1 year ago

I know this is an old issue and it's possible the repo is no longer being maintained. However, I am getting the same low performance for the downloaded checkpoints (36.948% for beta_32).

Is there any update on the discrepancy?

craymichael commented 6 months ago

I am also seeing this problem, with exactly the same accuracies as reported at the top of this issue.

Conda (23.1.0) Environment:

torch==1.7.1
FrEIA at tag v0.2
Python 3.9.18

The command I run to evaluate the model:

python -m ibinn_imagenet.eval.ibinn_imagenet_classifier \
    --model_file_path=checkpoints/beta_2.avg.pt \
    --evaluation=accuracy \
    --data_root_folder_val='/mnt/data/imagenet/' \
    --data_root_folder_train='/mnt/data/imagenet/' \
    --model_n_loss_dims_1d 3072 \
    --data_batch_size 32
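
For anyone who wants to repeat this evaluation for every model, the command above can be generated per checkpoint with a small helper. This is a sketch under assumptions: only `checkpoints/beta_2.avg.pt` appears in this thread, so the `beta_<name>.avg.pt` naming for the other checkpoints is hypothetical, and `build_eval_command` is an illustrative helper, not part of the repository. All flags are copied unchanged from the command above and may need different values for other models.

```python
import shlex

# Model names as listed in the first comment of this issue.
BETAS = ["1", "2", "4", "8", "16", "32", "inf"]


def build_eval_command(beta, checkpoint_dir="checkpoints", data_root="/mnt/data/imagenet/"):
    """Build the evaluation command for one checkpoint as an argument list.

    Assumption: checkpoints are named beta_<name>.avg.pt; only beta_2.avg.pt
    is confirmed by the command quoted in this thread.
    """
    return [
        "python", "-m", "ibinn_imagenet.eval.ibinn_imagenet_classifier",
        f"--model_file_path={checkpoint_dir}/beta_{beta}.avg.pt",
        "--evaluation=accuracy",
        f"--data_root_folder_val={data_root}",
        f"--data_root_folder_train={data_root}",
        "--model_n_loss_dims_1d", "3072",
        "--data_batch_size", "32",
    ]


# Print the commands instead of running them, so they can be inspected first.
for beta in BETAS:
    print(shlex.join(build_eval_command(beta)))
```

Each returned list can be passed directly to `subprocess.run(cmd, check=True)` once the paths have been verified.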

RayDeeA / ibinn_imagenet

Reproducing evaluation results #7