clovaai / wsolevaluation

Evaluating Weakly Supervised Object Localization Methods Right (CVPR 2020)
MIT License
332 stars 55 forks

Re-implementation confusion #35

Closed TyroneLi closed 4 years ago

TyroneLi commented 4 years ago

Hi, I used your config params to train ResNet50 vanilla CAM, but I cannot reach your reported accuracy. Here is my configuration:

CUDA_VISIBLE_DEVICES=6 python train.py \
    --dataset_name CUB \
    --architecture resnet50 \
    --wsol_method cam \
    --experiment_name CUB_CAM_resnet50_box_v2_metric \
    --pretrained TRUE \
    --large_feature_map FALSE \
    --batch_size 32 \
    --epochs 50 \
    --lr 0.0002 \
    --lr_decay_frequency 15 \
    --weight_decay 0.0001 \
    --override_cache TRUE \
    --workers 4 \
    --box_v2_metric True \
    --iou_threshold_list 30 50 70 \
    --eval_checkpoint_type last \
    --data_root /data/lijinlong/datasets/CUB-200-2011/

Result:

    Final epoch evaluation on test set ...
    Check train_log/CUB_CAM_resnet50_box_v2_metric/last_checkpoint.pth.tar loaded.
    rank 0, Evaluate epoch 50, split test
    Computing and evaluating cams.
    Split train, metric loss, current value: 0.07756523653730615
    Split train, metric loss, best value: 0.07533820327974217
    Split train, metric loss, best epoch: 48
    Split train, metric classification, current value: 99.84984984984985
    Split train, metric classification, best value: 99.88321654988322
    Split train, metric classification, best epoch: 43
    Split val, metric classification, current value: 72.89999999999999
    Split val, metric classification, best value: 74.2
    Split val, metric classification, best epoch: 30
    Split val, metric localization, current value: 46.36666666666667
    Split val, metric localization, best value: 50.900000000000006
    Split val, metric localization, best epoch: 1
    Split val, metric localization_IOU_30, current value: 89.1
    Split val, metric localization_IOU_30, best value: 92.6
    Split val, metric localization_IOU_30, best epoch: 2
    Split val, metric localization_IOU_50, current value: 43.9
    Split val, metric localization_IOU_50, best value: 51.6
    Split val, metric localization_IOU_50, best epoch: 1
    Split val, metric localization_IOU_70, current value: 6.1
    Split val, metric localization_IOU_70, best value: 8.9
    Split val, metric localization_IOU_70, best epoch: 1
    Split test, metric classification, current value: 77.06247842595789
    Split test, metric localization, current value: 51.26567713726845
    Split test, metric localization_IOU_30, current value: 95.11563686572316
    Split test, metric localization_IOU_50, current value: 50.465999309630654
    Split test, metric localization_IOU_70, current value: 8.215395236451501

The same run with box_v2_metric set to False:

CUDA_VISIBLE_DEVICES=5 python train.py \
    --dataset_name CUB \
    --architecture resnet50 \
    --wsol_method cam \
    --experiment_name CUB_CAM_resnet50 \
    --pretrained TRUE \
    --large_feature_map FALSE \
    --batch_size 32 \
    --epochs 50 \
    --lr 0.0002 \
    --lr_decay_frequency 15 \
    --weight_decay 0.0001 \
    --override_cache TRUE \
    --workers 4 \
    --box_v2_metric False \
    --iou_threshold_list 30 50 70 \
    --eval_checkpoint_type last \
    --data_root /data/lijinlong/datasets/CUB-200-2011/

Results:

    Final epoch evaluation on test set ...
    Check train_log/CUB_CAM_resnet50/last_checkpoint.pth.tar loaded.
    rank 0, Evaluate epoch 50, split test
    Computing and evaluating cams.
    Split train, metric loss, current value: 0.078823547021007
    Split train, metric loss, best value: 0.07638261178592304
    Split train, metric loss, best epoch: 45
    Split train, metric classification, current value: 99.76643309976645
    Split train, metric classification, best value: 99.83316649983317
    Split train, metric classification, best epoch: 43
    Split val, metric classification, current value: 73.2
    Split val, metric classification, best value: 74.0
    Split val, metric classification, best epoch: 18
    Split val, metric localization, current value: 43.5
    Split val, metric localization, best value: 52.6
    Split val, metric localization, best epoch: 1
    Split val, metric localization_IOU_30, current value: 88.6
    Split val, metric localization_IOU_30, best value: 93.2
    Split val, metric localization_IOU_30, best epoch: 1
    Split val, metric localization_IOU_50, current value: 43.5
    Split val, metric localization_IOU_50, best value: 52.6
    Split val, metric localization_IOU_50, best epoch: 1
    Split val, metric localization_IOU_70, current value: 6.0
    Split val, metric localization_IOU_70, best value: 9.4
    Split val, metric localization_IOU_70, best epoch: 1
    Split test, metric classification, current value: 76.61373835001726
    Split test, metric localization, current value: 50.84570245081118
    Split test, metric localization_IOU_30, current value: 95.11563686572316
    Split test, metric localization_IOU_50, current value: 50.84570245081118
    Split test, metric localization_IOU_70, current value: 8.439765274421816

Here is my model architecture and config params: Namespace(acol_threshold=0.7, adl_drop_rate=0.75, adl_threshold=0.9, architecture='resnet50', architecture_type='cam', batch_size=32, box_v2_metric=True, cam_curve_interval=0.001, crop_size=224, cutmix_beta=1.0, cutmix_prob=1.0, data_paths=Munch({'train': '/data/lijinlong/datasets/CUB-200-2011/CUB', 'val': '/data/lijinlong/datasets/CUB-200-2011/CUB', 'test': '/data/lijinlong/datasets/CUB-200-2011/CUB'}), data_root='/data/lijinlong/datasets/CUB-200-2011/', dataset_name='CUB', dist_backend='nccl', dist_url='tcp://127.0.0.1', epochs=50, eval_checkpoint_type='last', experiment_name='CUB_CAM_resnet50_box_v2_metric', gpu=None, has_drop_rate=0.5, has_grid_size=4, iou_threshold_list=[30, 50, 70], large_feature_map=False, launcher='pytorch', local_rank=0, log_folder='train_log/CUB_CAM_resnet50_box_v2_metric', lr=0.0002, lr_classifier_ratio=10, lr_decay_frequency=15, mask_root='dataset/OpenImages', master_port='47562', metadata_root='metadata/CUB', momentum=0.9, multi_contour_eval=True, multi_iou_eval=True, multiprocessing_distributed=False, num_val_sample_per_class=0, override_cache=True, pretrained=True, pretrained_path=None, proxy_training_set=False, rank=-1, reporter=<class 'util.Reporter'>, reporter_log_root='train_log/CUB_CAM_resnet50_box_v2_metric/reports', resize_size=256, scoremap_paths=Munch({'train': 'train_log/CUB_CAM_resnet50_box_v2_metric/scoremaps/train', 'val': 'train_log/CUB_CAM_resnet50_box_v2_metric/scoremaps/val', 'test': 'train_log/CUB_CAM_resnet50_box_v2_metric/scoremaps/test'}), seed=None, spg_threshold_1h=0.7, spg_threshold_1l=0.01, spg_threshold_2h=0.5, spg_threshold_2l=0.05, spg_threshold_3h=0.7, spg_threshold_3l=0.1, spg_thresholds=((0.7, 0.01), (0.5, 0.05), (0.7, 0.1)), weight_decay=0.0001, workers=4, world_size=-1, wsol_method='cam') Loading model resnet50 ` ResNetCam( (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False) (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace=True) (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False) (layer1): Sequential( (0): Bottleneck( (conv1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace=True) (downsample): Sequential( (0): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ) ) (1): Bottleneck( (conv1): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace=True) ) (2): Bottleneck( (conv1): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): BatchNorm2d(64, 
eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace=True) ) ) (layer2): Sequential( (0): Bottleneck( (conv1): Conv2d(256, 128, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False) (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace=True) (downsample): Sequential( (0): Conv2d(256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False) (1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ) ) (1): Bottleneck( (conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace=True) ) (2): Bottleneck( (conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace=True) ) (3): Bottleneck( (conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace=True) ) ) (layer3): Sequential( (0): Bottleneck( (conv1): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False) (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace=True) (downsample): Sequential( (0): Conv2d(512, 1024, kernel_size=(1, 1), stride=(2, 2), bias=False) (1): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, 
track_running_stats=True) ) ) (1): Bottleneck( (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace=True) ) (2): Bottleneck( (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace=True) ) (3): Bottleneck( (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace=True) ) (4): Bottleneck( (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace=True) ) (5): Bottleneck( (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace=True) ) ) (layer4): Sequential( (0): Bottleneck( (conv1): Conv2d(1024, 512, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False) (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace=True) (downsample): Sequential( (0): Conv2d(1024, 2048, kernel_size=(1, 1), stride=(2, 2), bias=False) (1): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) ) ) (1): Bottleneck( (conv1): 
Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace=True) ) (2): Bottleneck( (conv1): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) (relu): ReLU(inplace=True) ) ) (avgpool): AdaptiveAvgPool2d(output_size=(1, 1)) (fc): Linear(in_features=2048, out_features=200, bias=True) )

IoU 50 and 70 are much lower than your reported numbers; did I miss something? For VGG16, though, the results are fine. Thanks.

coallaoh commented 4 years ago

Could you summarise (1) what you did, (2) what you expected, and (3) what you got in shorter prose? Your report is too lengthy to read and understand ;)

junsukchoe commented 4 years ago

Thanks for your interest in our work!

I just noticed that you passed the wrong value for large_feature_map. According to our sheet, it should be True. I hope this helps you :)

TyroneLi commented 4 years ago

> Thanks for your interest in our work!
>
> I just noticed that you passed the wrong value for large_feature_map. According to our sheet, it should be True. I hope this helps you :)

Oh, many thanks for your careful observation. I will try using the larger feature map (28x28). By the way, I notice that you didn't implement the localization accuracy where both the classification and the location must be correct (only localization with IoU > threshold counts in your code), which is commonly used in weakly supervised localization and detection. Right?
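
For clarity, here is a minimal sketch of the counting rule meant here. It is illustrative only, not code from this repository; the (x0, y0, x1, y1) box format and the 0.5 threshold are just assumptions for the example.

    def box_iou(box_a, box_b):
        """IoU of two boxes given as (x0, y0, x1, y1)."""
        x0, y0 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
        x1, y1 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
        inter = max(0.0, x1 - x0) * max(0.0, y1 - y0)
        area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
        area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
        return inter / (area_a + area_b - inter)

    def top1_loc(pred_classes, gt_classes, pred_boxes, gt_boxes, iou_threshold=0.5):
        """Top-1 Loc: a sample counts only if the predicted class is correct
        AND the predicted box matches the GT box with IoU >= iou_threshold."""
        hits = [pc == gc and box_iou(pb, gb) >= iou_threshold
                for pc, gc, pb, gb in zip(pred_classes, gt_classes,
                                           pred_boxes, gt_boxes)]
        return sum(hits) / len(hits)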

Re-training the ResNet50 CAM method:

CUDA_VISIBLE_DEVICES=6 python train.py \
    --dataset_name CUB \
    --architecture resnet50 \
    --wsol_method cam \
    --experiment_name CUB_CAM_resnet50_28x28_box_v2_metric \
    --pretrained TRUE \
    --large_feature_map True \
    --batch_size 32 \
    --epochs 50 \
    --lr 0.0002 \
    --lr_decay_frequency 15 \
    --weight_decay 0.0001 \
    --override_cache TRUE \
    --workers 4 \
    --box_v2_metric True \
    --iou_threshold_list 30 50 70 \
    --eval_checkpoint_type last \
    --data_root /data/lijinlong/datasets/CUB-200-2011/

    Computing and evaluating cams.
    Split train, metric loss, current value: 0.2727359641700098
    Split train, metric loss, best value: 0.2681002541406973
    Split train, metric loss, best epoch: 48
    Split train, metric classification, current value: 97.81448114781448
    Split train, metric classification, best value: 97.9813146479813
    Split train, metric classification, best epoch: 49
    Split val, metric classification, current value: 73.9
    Split val, metric classification, best value: 74.2
    Split val, metric classification, best epoch: 25
    Split val, metric localization, current value: 50.43333333333334
    Split val, metric localization, best value: 58.0
    Split val, metric localization, best epoch: 3
    Split val, metric localization_IOU_30, current value: 91.3
    Split val, metric localization_IOU_30, best value: 95.5
    Split val, metric localization_IOU_30, best epoch: 4
    Split val, metric localization_IOU_50, current value: 50.5
    Split val, metric localization_IOU_50, best value: 63.9
    Split val, metric localization_IOU_50, best epoch: 3
    Split val, metric localization_IOU_70, current value: 9.5
    Split val, metric localization_IOU_70, best value: 16.9
    Split val, metric localization_IOU_70, best epoch: 2
    Split test, metric classification, current value: 78.25336555056955
    Split test, metric localization, current value: 57.65159360257738
    Split test, metric localization_IOU_30, current value: 96.78978253365551
    Split test, metric localization_IOU_50, current value: 63.77286848463928
    Split test, metric localization_IOU_70, current value: 12.392129789437348

It seems there is still a gap between my re-implementation results and your sheet's reported numbers at IoU 50 and 70 for ResNet50 vanilla CAM.

junsukchoe commented 4 years ago

Could you please try again using this script?

python main.py \
    --dataset_name CUB \
    --architecture resnet50 \
    --wsol_method cam \
    --experiment_name CUB_CAM_resnet50_28x28_box_v2_metric \
    --pretrained TRUE \
    --large_feature_map True \
    --batch_size 32 \
    --epochs 50 \
    --lr 0.00023222617 \
    --lr_decay_frequency 15 \
    --weight_decay 0.0001 \
    --override_cache TRUE \
    --workers 4 \
    --box_v2_metric True \
    --iou_threshold_list 30 50 70 \
    --eval_checkpoint_type last

TyroneLi commented 4 years ago

> Could you please try again using this script?
>
> python main.py \
>     --dataset_name CUB \
>     --architecture resnet50 \
>     --wsol_method cam \
>     --experiment_name CUB_CAM_resnet50_28x28_box_v2_metric \
>     --pretrained TRUE \
>     --large_feature_map True \
>     --batch_size 32 \
>     --epochs 50 \
>     --lr 0.00023222617 \
>     --lr_decay_frequency 15 \
>     --weight_decay 0.0001 \
>     --override_cache TRUE \
>     --workers 4 \
>     --box_v2_metric True \
>     --iou_threshold_list 30 50 70 \
>     --eval_checkpoint_type last


Oh, I found that you set the stride of the resnet50 layer4 stage to 1, whereas I used the official resnet50 with a layer4 stride of 2. After changing this, everything is fine. Many thanks. By the way, I cannot find the usual localization accuracy metric in your code, the one that counts a sample as correct only when both the classification and the localization (IoU > threshold) are correct; your code only measures IoU > threshold localization accuracy. Right?
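
For anyone comparing the two backbones, here is a minimal sketch of the effect of that stride change on a plain torchvision ResNet-50. It is illustrative only; the repository's ResNetCam presumably applies the equivalent change when --large_feature_map TRUE is set, and the exact stages it modifies (and the resulting 28x28 size mentioned above) should be checked against the repo code.

    import torch
    import torchvision.models as models

    # Official torchvision ResNet-50: layer4 downsamples by 2, so a 224x224 input
    # yields a 7x7 feature map before global average pooling.
    model = models.resnet50(pretrained=False)
    backbone = torch.nn.Sequential(*list(model.children())[:-2])
    x = torch.randn(1, 3, 224, 224)
    print(backbone(x).shape)  # torch.Size([1, 2048, 7, 7])

    # Setting the stride of layer4's first block to 1 keeps layer3's resolution,
    # so the map the CAM is computed on becomes twice as large per side.
    model.layer4[0].conv2.stride = (1, 1)
    model.layer4[0].downsample[0].stride = (1, 1)
    print(backbone(x).shape)  # torch.Size([1, 2048, 14, 14])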

junsukchoe commented 4 years ago

Regarding the evaluation metric, we are aware that most existing works primarily evaluate their methods with the Top-1 Loc metric. However, we believe that Top-1 Loc is insufficient to make claims about improved localization. A technique can potentially see an improvement in Top-1 Loc just by having significantly better classification (even with worse localization!). We believe that any localization claims would have to be validated by more explicit localization-centric metrics. In this regard, PxAP with mask annotations is the ideal metric for evaluating WSOL. However, in many cases, only the box annotations are available. In this case, we suggest using MaxBoxAcc. Please see Sec. 3 and 4 for more detail.
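
For contrast with the Top-1 Loc snippet above, here is a rough sketch of MaxBoxAcc-style counting. It is simplified and illustrative only, not the repository's evaluation code; the actual evaluation extracts boxes from the CAM over a grid of score-map thresholds controlled by cam_curve_interval.

    import numpy as np

    def max_box_acc(ious_per_cam_threshold, iou_threshold=0.5):
        """MaxBoxAcc-style counting: localization only (classification is ignored,
        unlike Top-1 Loc), evaluated at the best-performing CAM threshold.
        ious_per_cam_threshold: array of shape (num_cam_thresholds, num_samples),
        the IoU between the box estimated at each CAM threshold and the GT box."""
        ious = np.asarray(ious_per_cam_threshold, dtype=float)
        acc_per_cam_threshold = (ious >= iou_threshold).mean(axis=1)
        return float(acc_per_cam_threshold.max())

    # With --box_v2_metric True, the reported "localization" number averages this
    # accuracy over the IoU thresholds 30, 50 and 70 (matching the
    # localization_IOU_30/50/70 entries in the logs above).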

Please also see #4 and #14 for more information.

coallaoh commented 4 years ago

Closing the issue, assuming the question was answered :) Please re-open the issue as necessary.