sbelharbi / negev

PyTorch implementation of the NEGEV method. Paper: "Negative Evidence Matters in Interpretable Histology Image Classification".
GNU General Public License v3.0

I couldn't reach your results in paper #1

Closed Reza-Safdari closed 11 months ago

Reza-Safdari commented 1 year ago

Thanks for sharing your code and doing valuable research in WSOL for histology images. I ran your code with exactly the same hyper-parameters as given on GitHub, but I couldn't reach the results in the paper. The paper reports a PxAP of 82.0 for the NEGEV approach with ResNet50 on the GlaS dataset. However, I got the following result for BEST_LOCALIZATION:

DLL 2022-12-01 17:46:32.010136 - 1:00:49.452464 REPORT EPOCH/74: split: test/metric classification: 98.75
DLL 2022-12-01 17:46:32.010156 - 1:00:49.452484 REPORT EPOCH/74: split: test/metric classification: 98.75_best
DLL 2022-12-01 17:46:32.010178 - 1:00:49.452506 REPORT EPOCH/74: split: test/metric localization: 78.56850533742725
DLL 2022-12-01 17:46:32.010198 - 1:00:49.452526 REPORT EPOCH/74: split: test/metric localization: 78.56850533742725_best
DLL 2022-12-01 17:46:32.010237 - 1:00:49.452565 REPORT EPOCH/74 split: test: [BEST]
PXAP: 78.56850533742725
True positive: 61.31941981950956
False negative: 38.68058018049044
True negative: 77.9749846417977
False positive: 22.0250153582023
Dice foreground: 67.2735510339999
Dice background: 71.34570169405362
MIOU: 53.07061235795351
Best tau: [0.5650000000000001]

Let me know if I missed something.

sbelharbi commented 1 year ago

hi, the configs in the README on GitHub are not the best ones; they are only an example of how to run the code. both the classifier and NEGEV have to be properly trained. i attached the pretrained weights here (available for 7 days from now): https://www.transfernow.net/dl/20221201HqOt2GYe/1aA2Onej

you can use the classifier weights to initialize the classifier of NEGEV, or you can use all the weights to re-evaluate the model. i looked into the localization performance log of NEGEV, and this is what i got:

PERF - CHECKPOINT best_localization  - EPOCH 219   
REPORT EPOCH/219: split: test/metric classification: 97.5 
REPORT EPOCH/219: split: test/metric classification: 97.5_best 
REPORT EPOCH/219: split: test/metric localization: 82.01381406941411 
REPORT EPOCH/219: split: test/metric localization: 82.01381406941411_best 
REPORT EPOCH/219 split: test: 
PXAP: 82.01381406941411
True positive: 78.57248826880263
False negative: 21.427511731197374
True negative: 72.87876799386387
False positive: 27.121232006136132
Dice foreground: 76.87881877119689
Dice background: 74.60424915287729
MIOU: 60.96831202209647
Best tau: [0.528] 
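
To compare the two test reports in this thread metric by metric, a small script along these lines can help. It is only a sketch: the helper names `parse_report` and `compare` are made up for illustration and are not part of the repository, and the values are copied from the summary lines of the two logs above.

```python
import re

# Summary lines copied from the report posted in the issue (epoch 74).
ISSUE_REPORT = """
PXAP: 78.56850533742725
True positive: 61.31941981950956
False negative: 38.68058018049044
True negative: 77.9749846417977
False positive: 22.0250153582023
Dice foreground: 67.2735510339999
Dice background: 71.34570169405362
MIOU: 53.07061235795351
"""

# Summary lines copied from the maintainer's report (epoch 219).
AUTHOR_REPORT = """
PXAP: 82.01381406941411
True positive: 78.57248826880263
False negative: 21.427511731197374
True negative: 72.87876799386387
False positive: 27.121232006136132
Dice foreground: 76.87881877119689
Dice background: 74.60424915287729
MIOU: 60.96831202209647
"""

# Matches lines of the form "Name: 12.34" (hypothetical helper pattern).
LINE_RE = re.compile(r"^\s*([A-Za-z ]+):\s*([-+]?\d+(?:\.\d+)?)\s*$")


def parse_report(text):
    """Turn 'Name: value' lines into a {name: float} dictionary."""
    metrics = {}
    for line in text.strip().splitlines():
        match = LINE_RE.match(line)
        if match:
            metrics[match.group(1).strip()] = float(match.group(2))
    return metrics


def compare(ours, reference):
    """Return reference minus ours for every metric present in both."""
    return {name: reference[name] - ours[name]
            for name in ours if name in reference}


issue = parse_report(ISSUE_REPORT)
author = parse_report(AUTHOR_REPORT)
gaps = compare(issue, author)

for name, gap in gaps.items():
    print(f"{name:16s} {issue[name]:8.2f} {author[name]:8.2f} {gap:+8.2f}")
```

The PXAP gap between the two runs is about 3.45 points, and most of it goes along with a much higher true positive rate in the second run.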

this is part of the config of the classifier CAM:

time python main_wsol.py --task STD_CL \
                         --encoder_name resnet50 \
                         --arch STDClassifier \
                         --runmode final-mode \
                         --opt__name_optimizer sgd \
                         --batch_size 32 \
                         --eval_checkpoint_type best_localization \
                         --opt__step_size 250 \
                         --opt__gamma 0.1 \
                         --max_epochs 1000 \
                         --freeze_cl False \
                         --support_background True \
                         --method CAM \
                         --spatial_pooling WGAP \
                         --dataset GLAS \
                         --fold 0 \
                         --cudaid 0 \
                         --debug_subfolder None \
                         --amp True \
                         --opt__lr 0.003 \
                         --exp_id 11_19_2021_09_32_36_109051__423849

this is part of the NEGEV config:

time python main_wsol.py --task NEGEV \
                         --encoder_name resnet50 \
                         --arch UnetNEGEV \
                         --runmode search-mode \
                         --opt__name_optimizer sgd \
                         --dist_backend mpi \
                         --batch_size 32 \
                         --eval_checkpoint_type best_localization \
                         --opt__step_size 250 \
                         --opt__gamma 0.1 \
                         --max_epochs 1000 \
                         --freeze_cl True \
                         --support_background True \
                         --method CAM \
                         --spatial_pooling WGAP \
                         --dataset GLAS \
                         --fold 0 \
                         --cudaid 0 \
                         --debug_subfolder None \
                         --amp True \
                         --opt__lr 0.1 \
                         --negev_ptretrained_cl_cp best_localization \
                         --elb_init_t 1.0 \
                         --elb_max_t 10.0 \
                         --elb_mulcoef 1.01 \
                         --sl_ng True \
                         --sl_ng_seeder probability_seeder \
                         --sl_ng_lambda 1.0 \
                         --sl_ng_start_ep 0 \
                         --sl_ng_end_ep -1 \
                         --sl_ng_min 1 \
                         --sl_ng_max 1 \
                         --sl_ng_ksz 3 \
                         --crf_ng False \
                         --jcrf_ng False \
                         --neg_samples_ng False \
                         --max_sizepos_ng False \
                         --exp_id 12_13_2021_00_49_48_796469__3314599

you can pick the hyper-parameters from the commands above.
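
One way to see which hyper-parameters differ between two such commands is to parse the flags into dictionaries and diff them. The sketch below uses only a subset of the flags from the two commands above, to keep it short. `parse_flags` and `diff_flags` are hypothetical helpers, not functions from the repository.

```python
import shlex

# Abbreviated copies of the two commands above (subset of flags only).
CLASSIFIER_CMD = """
python main_wsol.py --task STD_CL --arch STDClassifier --runmode final-mode
    --freeze_cl False --opt__lr 0.003 --opt__step_size 250 --max_epochs 1000
"""

NEGEV_CMD = """
python main_wsol.py --task NEGEV --arch UnetNEGEV --runmode search-mode
    --freeze_cl True --opt__lr 0.1 --opt__step_size 250 --max_epochs 1000
    --sl_ng True --sl_ng_lambda 1.0
"""


def parse_flags(command):
    """Map each '--name value' pair in a command line to {name: value}."""
    tokens = shlex.split(command)
    flags = {}
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if token.startswith("--"):
            name = token[2:]
            has_value = (i + 1 < len(tokens)
                         and not tokens[i + 1].startswith("--"))
            # A flag without a following value is stored as None.
            flags[name] = tokens[i + 1] if has_value else None
            i += 2 if has_value else 1
        else:
            i += 1  # skip 'python', the script name, and similar tokens
    return flags


def diff_flags(first, second):
    """Return (flags whose values changed, flags only in the second command)."""
    changed = {name: (first[name], second[name])
               for name in first.keys() & second.keys()
               if first[name] != second[name]}
    only_second = {name: second[name]
                   for name in second.keys() - first.keys()}
    return changed, only_second


changed, only_negev = diff_flags(parse_flags(CLASSIFIER_CMD),
                                 parse_flags(NEGEV_CMD))
for name in sorted(changed):
    print(f"{name}: {changed[name][0]} -> {changed[name][1]}")
for name in sorted(only_negev):
    print(f"{name}: (NEGEV only) {only_negev[name]}")
```

The same two helpers could be pointed at the README command and at one of the commands above to list exactly which settings need to change.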

even on the same gpu, PyTorch is not 100% reproducible. but with this code and config, you will be able to get close to the reported results. thanks
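
For readers who want to reduce run-to-run variation, a generic seeding helper is sketched below. This is an assumption about common PyTorch practice, not the repository's own seeding code (which may already do something similar), and `seed_everything` is a hypothetical name. It narrows the variation but does not guarantee bit-identical results, since some GPU kernels remain nondeterministic.

```python
import random


def seed_everything(seed):
    """Seed the common random number generators.

    Returns True if PyTorch was found and seeded, False otherwise.
    """
    random.seed(seed)

    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy is optional for this sketch

    try:
        import torch
    except ImportError:
        return False

    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)  # no-op when CUDA is unavailable
    # Ask cuDNN for deterministic kernels and disable autotuning, which
    # can select different algorithms from one run to the next.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
    return True


if __name__ == "__main__":
    seeded_torch = seed_everything(0)
    print("PyTorch seeded:", seeded_torch)
```

Even with these settings, mixed precision (`--amp True`) and different GPU models or library versions can still shift the final PxAP slightly.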

sbelharbi commented 11 months ago

closing. reopen if necessary.