YudeWang / SEAM

Self-supervised Equivariant Attention Mechanism for Weakly Supervised Semantic Segmentation, CVPR 2020 (Oral)
MIT License

Large performance gap between trained model using default setting and the provided trained model. #13

Closed bityangke closed 4 years ago

bityangke commented 4 years ago

With the provided trained 'resnet38_SEAM.pth', the results of SEAM step evaluation:

0/60 background score: 0.000 mIoU: 28.861%
1/60 background score: 0.010 mIoU: 32.021%
2/60 background score: 0.020 mIoU: 35.937%
3/60 background score: 0.030 mIoU: 39.372%
4/60 background score: 0.040 mIoU: 42.470%
5/60 background score: 0.050 mIoU: 45.309%
6/60 background score: 0.060 mIoU: 47.967%
7/60 background score: 0.070 mIoU: 50.436%
8/60 background score: 0.080 mIoU: 52.721%
9/60 background score: 0.090 mIoU: 54.865%
10/60 background score: 0.100 mIoU: 56.885%
11/60 background score: 0.110 mIoU: 58.777%
12/60 background score: 0.120 mIoU: 60.595%
13/60 background score: 0.130 mIoU: 62.310%
14/60 background score: 0.140 mIoU: 63.905%
15/60 background score: 0.150 mIoU: 65.372%
16/60 background score: 0.160 mIoU: 66.710%
17/60 background score: 0.170 mIoU: 67.907%
18/60 background score: 0.180 mIoU: 68.925%
19/60 background score: 0.190 mIoU: 69.758%
20/60 background score: 0.200 mIoU: 70.414%
21/60 background score: 0.210 mIoU: 71.014%
22/60 background score: 0.220 mIoU: 71.291%
23/60 background score: 0.230 mIoU: 71.324%
24/60 background score: 0.240 mIoU: 71.143%
25/60 background score: 0.250 mIoU: 70.799%
26/60 background score: 0.260 mIoU: 70.287%
27/60 background score: 0.270 mIoU: 69.664%
28/60 background score: 0.280 mIoU: 68.952%
29/60 background score: 0.290 mIoU: 68.148%
30/60 background score: 0.300 mIoU: 67.274%
31/60 background score: 0.310 mIoU: 66.322%
32/60 background score: 0.320 mIoU: 65.305%
33/60 background score: 0.330 mIoU: 64.232%
34/60 background score: 0.340 mIoU: 63.105%
35/60 background score: 0.350 mIoU: 61.939%
36/60 background score: 0.360 mIoU: 60.727%
37/60 background score: 0.370 mIoU: 59.485%
38/60 background score: 0.380 mIoU: 58.215%
39/60 background score: 0.390 mIoU: 56.921%
40/60 background score: 0.400 mIoU: 55.609%
41/60 background score: 0.410 mIoU: 54.281%
42/60 background score: 0.420 mIoU: 52.940%
43/60 background score: 0.430 mIoU: 51.605%
44/60 background score: 0.440 mIoU: 50.279%
45/60 background score: 0.450 mIoU: 48.955%
46/60 background score: 0.460 mIoU: 47.630%
47/60 background score: 0.470 mIoU: 46.303%
48/60 background score: 0.480 mIoU: 44.982%
49/60 background score: 0.490 mIoU: 43.653%
50/60 background score: 0.500 mIoU: 42.330%
51/60 background score: 0.510 mIoU: 41.015%
52/60 background score: 0.520 mIoU: 39.709%
53/60 background score: 0.530 mIoU: 38.409%
54/60 background score: 0.540 mIoU: 37.119%
55/60 background score: 0.550 mIoU: 35.848%
56/60 background score: 0.560 mIoU: 34.601%
57/60 background score: 0.570 mIoU: 33.372%
58/60 background score: 0.580 mIoU: 32.158%
59/60 background score: 0.590 mIoU: 30.959%
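For readers unfamiliar with the "background score" column: as far as I understand it, each step of the sweep fills the background channel with one constant value and takes a per-pixel argmax over background plus the class CAMs. The sketch below illustrates that idea only. The function name `cam_to_label` and the dictionary layout (0-based foreground class index mapped to an H x W array) are assumptions for illustration, not a quote of the repo's evaluation script.

```python
import numpy as np

def cam_to_label(cam_dict, bg_score, num_classes=21):
    """Convert per-class CAMs into a label map using a constant background score.

    cam_dict: hypothetical layout, maps a 0-based foreground class index
              to an (H, W) float array of activations.
    bg_score: the constant written into the background channel (the value
              swept from 0.000 to 0.590 in the log above).
    """
    h, w = next(iter(cam_dict.values())).shape
    scores = np.zeros((num_classes, h, w), dtype=np.float32)
    scores[0] = bg_score              # channel 0 = background
    for cls, cam in cam_dict.items():
        scores[cls + 1] = cam         # shift foreground classes by one
    # A pixel becomes background wherever no CAM exceeds bg_score.
    return scores.argmax(axis=0)

# Toy 2x2 CAM for a single foreground class (class index 0 -> label 1).
toy = {0: np.array([[0.10, 0.50], [0.30, 0.05]], dtype=np.float32)}
print(cam_to_label(toy, bg_score=0.2))
```

A low background score labels almost every pixel as foreground and a high one labels almost every pixel as background, which would explain why the mIoU rises, peaks, and then falls across the sweep.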

When using the 'resnet38_SEAM.pth' that I trained myself with the default settings (except that I used two GPU cards; the batch size was still set to 8), the results of the SEAM step evaluation are:

0/60 background score: 0.000 mIoU: 22.938%
1/60 background score: 0.010 mIoU: 26.294%
2/60 background score: 0.020 mIoU: 30.367%
3/60 background score: 0.030 mIoU: 33.779%
4/60 background score: 0.040 mIoU: 36.815%
5/60 background score: 0.050 mIoU: 39.461%
6/60 background score: 0.060 mIoU: 41.722%
7/60 background score: 0.070 mIoU: 43.691%
8/60 background score: 0.080 mIoU: 45.386%
9/60 background score: 0.090 mIoU: 46.875%
10/60 background score: 0.100 mIoU: 48.230%
11/60 background score: 0.110 mIoU: 49.466%
12/60 background score: 0.120 mIoU: 50.592%
13/60 background score: 0.130 mIoU: 51.575%
14/60 background score: 0.140 mIoU: 52.443%
15/60 background score: 0.150 mIoU: 53.182%
16/60 background score: 0.160 mIoU: 53.806%
17/60 background score: 0.170 mIoU: 54.334%
18/60 background score: 0.180 mIoU: 54.759%
19/60 background score: 0.190 mIoU: 55.087%
20/60 background score: 0.200 mIoU: 55.339%
21/60 background score: 0.210 mIoU: 55.510%
22/60 background score: 0.220 mIoU: 55.590%
23/60 background score: 0.230 mIoU: 55.594%
24/60 background score: 0.240 mIoU: 55.525%
25/60 background score: 0.250 mIoU: 55.382%
26/60 background score: 0.260 mIoU: 55.169%
27/60 background score: 0.270 mIoU: 54.892%
28/60 background score: 0.280 mIoU: 54.556%
29/60 background score: 0.290 mIoU: 54.155%
30/60 background score: 0.300 mIoU: 53.685%
31/60 background score: 0.310 mIoU: 53.182%
32/60 background score: 0.320 mIoU: 52.640%
33/60 background score: 0.330 mIoU: 52.064%
34/60 background score: 0.340 mIoU: 51.445%
35/60 background score: 0.350 mIoU: 50.793%
36/60 background score: 0.360 mIoU: 50.107%
37/60 background score: 0.370 mIoU: 49.380%
38/60 background score: 0.380 mIoU: 48.624%
39/60 background score: 0.390 mIoU: 47.837%
40/60 background score: 0.400 mIoU: 47.029%
41/60 background score: 0.410 mIoU: 46.199%
42/60 background score: 0.420 mIoU: 45.353%
43/60 background score: 0.430 mIoU: 44.483%
44/60 background score: 0.440 mIoU: 43.593%
45/60 background score: 0.450 mIoU: 42.681%
46/60 background score: 0.460 mIoU: 41.749%
47/60 background score: 0.470 mIoU: 40.809%
48/60 background score: 0.480 mIoU: 39.855%
49/60 background score: 0.490 mIoU: 38.890%
50/60 background score: 0.500 mIoU: 37.914%
51/60 background score: 0.510 mIoU: 36.934%
52/60 background score: 0.520 mIoU: 35.954%
53/60 background score: 0.530 mIoU: 34.974%
54/60 background score: 0.540 mIoU: 33.988%
55/60 background score: 0.550 mIoU: 32.998%
56/60 background score: 0.560 mIoU: 32.011%
57/60 background score: 0.570 mIoU: 31.033%
58/60 background score: 0.580 mIoU: 30.064%
59/60 background score: 0.590 mIoU: 29.102%

bityangke commented 4 years ago

Has anyone encountered the same situation, or does anyone have an idea what causes it?

halbielee commented 4 years ago

Which dataset (among train/train_aug/val) do you use? Is it train? And do you evaluate npy or png files?

I think the performance of your custom-trained version is similar to what I get when I run it myself (with both the pretrained and the custom model).

The performance of your pretrained run is too high.

bityangke commented 4 years ago

@halbielee Hi, I use the default training setting of this repo, that is, using voc12/train_aug.txt to train the model and evaluating on VOC2012/ImageSets/Segmentation/train.txt. The pretrained implementation I used is the trained model provided by the author @YudeWang.

halbielee commented 4 years ago

@bityangke I follow the same setting, but I get a different result from yours. Hmm, which pretrained model do you use, the one on Google Drive or the one on Baidu? I used the pretrained model on Google Drive and got a result similar to that of your custom training.

Can you share the exact pretrained weight file [resnet38_SEAM.pth]? I will execute with it and let you know the result.

bityangke commented 4 years ago

@halbielee Hi, I use the pre-trained model on Google Drive.

halbielee commented 4 years ago

@bityangke I got the result from the pretrained model!

0/60 background score: 0.000 mIoU: 23.150%
1/60 background score: 0.010 mIoU: 25.909%
2/60 background score: 0.020 mIoU: 29.300%
3/60 background score: 0.030 mIoU: 32.255%
4/60 background score: 0.040 mIoU: 34.881%
5/60 background score: 0.050 mIoU: 37.225%
6/60 background score: 0.060 mIoU: 39.358%
7/60 background score: 0.070 mIoU: 41.260%
8/60 background score: 0.080 mIoU: 42.931%
9/60 background score: 0.090 mIoU: 44.451%
10/60 background score: 0.100 mIoU: 45.833%
11/60 background score: 0.110 mIoU: 47.077%
12/60 background score: 0.120 mIoU: 48.244%
13/60 background score: 0.130 mIoU: 49.304%
14/60 background score: 0.140 mIoU: 50.253%
15/60 background score: 0.150 mIoU: 51.123%
16/60 background score: 0.160 mIoU: 51.901%
17/60 background score: 0.170 mIoU: 52.578%
18/60 background score: 0.180 mIoU: 53.139%
19/60 background score: 0.190 mIoU: 53.637%
20/60 background score: 0.200 mIoU: 54.066%
21/60 background score: 0.210 mIoU: 54.565%
22/60 background score: 0.220 mIoU: 54.890%
23/60 background score: 0.230 mIoU: 55.141%
24/60 background score: 0.240 mIoU: 55.310%
25/60 background score: 0.250 mIoU: 55.391%
26/60 background score: 0.260 mIoU: 55.406%
27/60 background score: 0.270 mIoU: 55.346%
28/60 background score: 0.280 mIoU: 55.218%
29/60 background score: 0.290 mIoU: 55.034%
30/60 background score: 0.300 mIoU: 54.785%
31/60 background score: 0.310 mIoU: 54.476%
32/60 background score: 0.320 mIoU: 54.109%
33/60 background score: 0.330 mIoU: 53.684%
34/60 background score: 0.340 mIoU: 53.216%
35/60 background score: 0.350 mIoU: 52.717%
36/60 background score: 0.360 mIoU: 52.175%
37/60 background score: 0.370 mIoU: 51.596%
38/60 background score: 0.380 mIoU: 50.993%
39/60 background score: 0.390 mIoU: 50.355%
40/60 background score: 0.400 mIoU: 49.673%
41/60 background score: 0.410 mIoU: 48.949%
42/60 background score: 0.420 mIoU: 48.186%
43/60 background score: 0.430 mIoU: 47.392%
44/60 background score: 0.440 mIoU: 46.578%
45/60 background score: 0.450 mIoU: 45.736%
46/60 background score: 0.460 mIoU: 44.870%
47/60 background score: 0.470 mIoU: 43.983%
48/60 background score: 0.480 mIoU: 43.076%
49/60 background score: 0.490 mIoU: 42.141%
50/60 background score: 0.500 mIoU: 41.182%
51/60 background score: 0.510 mIoU: 40.207%
52/60 background score: 0.520 mIoU: 39.217%
53/60 background score: 0.530 mIoU: 38.211%
54/60 background score: 0.540 mIoU: 37.193%
55/60 background score: 0.550 mIoU: 36.172%
56/60 background score: 0.560 mIoU: 35.153%
57/60 background score: 0.570 mIoU: 34.131%
58/60 background score: 0.580 mIoU: 33.108%
59/60 background score: 0.590 mIoU: 32.082%

I got the same result as before, and the peak value, 55.406%, is the number reported in Table 1 of the paper.
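Picking the peak out of a 60-entry sweep by eye is error-prone, so here is a small sketch that parses the pasted log text and returns the best background score with its mIoU. The helper name `best_threshold` and the regular expression are my own, written against the log format shown in this thread, so treat them as assumptions rather than part of the repo.

```python
import re

# Matches entries such as "26/60 background score: 0.260 mIoU: 55.406%".
# \s+ also matches newlines, so entries wrapped across lines still parse.
SWEEP_RE = re.compile(
    r"(\d+)/(\d+)\s+background score:\s*([\d.]+)\s+mIoU:\s*([\d.]+)%"
)

def best_threshold(log_text):
    """Return (background_score, mIoU) for the entry with the highest mIoU."""
    entries = [
        (float(bg), float(miou))
        for _step, _total, bg, miou in SWEEP_RE.findall(log_text)
    ]
    if not entries:
        raise ValueError("no sweep entries found in log text")
    return max(entries, key=lambda entry: entry[1])

# A four-entry excerpt of the sweep posted above.
sample = (
    "24/60 background score: 0.240 mIoU: 55.310% "
    "25/60 background score: 0.250 mIoU: 55.391% "
    "26/60 background score: 0.260 mIoU: 55.406% "
    "27/60 background score: 0.270 mIoU: 55.346%"
)
print(best_threshold(sample))  # (0.26, 55.406)
```

Running the same helper on two full logs gives a quick side-by-side of their peaks, which is usually the number to compare against the paper.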

bityangke commented 4 years ago

@halbielee My bad. I need the CAMs generated by PSA for a downstream task, so I had put the PSA CAMs in VOC2012/SegmentationClass/, replacing the original GT images. I evaluated the custom model on another machine with the original GT data, so that result was not affected. After I switched back to the correct GT, the result of the pretrained model was exactly the same as yours.
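One way to catch this kind of mix-up earlier is to fingerprint the ground-truth directory on each machine and compare the digests before trusting an evaluation. The sketch below uses only the Python standard library. The function name `dir_digest` and the `*.png` pattern are assumptions for illustration, and the path in the usage comment should be adjusted to your own layout.

```python
import hashlib
from pathlib import Path

def dir_digest(root, pattern="*.png"):
    """Return a SHA-256 digest over the names and contents of matching files.

    Files are visited in sorted order so the digest is stable across runs.
    Any overwritten, added, or removed file changes the result.
    """
    digest = hashlib.sha256()
    for path in sorted(Path(root).glob(pattern)):
        digest.update(path.name.encode("utf-8"))
        digest.update(path.read_bytes())
    return digest.hexdigest()

# Usage (run on both machines and compare the printed strings):
#   print(dir_digest("VOC2012/SegmentationClass"))
```

If the two digests differ, the evaluations are not scoring against the same ground truth, and the mIoU numbers are not comparable.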

hchoi71 commented 3 years ago

is the number on the paper Ta

Does the pretrained model you used refer to 'ilsvrc-cls_rna-a1_cls1000_ep-0001.params' or to 'resnet38_SEAM.pth'?