Large performance gap between trained model using default setting and the provided trained model.

bityangke commented 4 years ago

With the provided trained model 'resnet38_SEAM.pth', the SEAM step evaluation gives:

0/60 background score: 0.000 mIoU: 28.861%
1/60 background score: 0.010 mIoU: 32.021%
2/60 background score: 0.020 mIoU: 35.937%
3/60 background score: 0.030 mIoU: 39.372%
4/60 background score: 0.040 mIoU: 42.470%
5/60 background score: 0.050 mIoU: 45.309%
6/60 background score: 0.060 mIoU: 47.967%
7/60 background score: 0.070 mIoU: 50.436%
8/60 background score: 0.080 mIoU: 52.721%
9/60 background score: 0.090 mIoU: 54.865%
10/60 background score: 0.100 mIoU: 56.885%
11/60 background score: 0.110 mIoU: 58.777%
12/60 background score: 0.120 mIoU: 60.595%
13/60 background score: 0.130 mIoU: 62.310%
14/60 background score: 0.140 mIoU: 63.905%
15/60 background score: 0.150 mIoU: 65.372%
16/60 background score: 0.160 mIoU: 66.710%
17/60 background score: 0.170 mIoU: 67.907%
18/60 background score: 0.180 mIoU: 68.925%
19/60 background score: 0.190 mIoU: 69.758%
20/60 background score: 0.200 mIoU: 70.414%
21/60 background score: 0.210 mIoU: 71.014%
22/60 background score: 0.220 mIoU: 71.291%
23/60 background score: 0.230 mIoU: 71.324%
24/60 background score: 0.240 mIoU: 71.143%
25/60 background score: 0.250 mIoU: 70.799%
26/60 background score: 0.260 mIoU: 70.287%
27/60 background score: 0.270 mIoU: 69.664%
28/60 background score: 0.280 mIoU: 68.952%
29/60 background score: 0.290 mIoU: 68.148%
30/60 background score: 0.300 mIoU: 67.274%
31/60 background score: 0.310 mIoU: 66.322%
32/60 background score: 0.320 mIoU: 65.305%
33/60 background score: 0.330 mIoU: 64.232%
34/60 background score: 0.340 mIoU: 63.105%
35/60 background score: 0.350 mIoU: 61.939%
36/60 background score: 0.360 mIoU: 60.727%
37/60 background score: 0.370 mIoU: 59.485%
38/60 background score: 0.380 mIoU: 58.215%
39/60 background score: 0.390 mIoU: 56.921%
40/60 background score: 0.400 mIoU: 55.609%
41/60 background score: 0.410 mIoU: 54.281%
42/60 background score: 0.420 mIoU: 52.940%
43/60 background score: 0.430 mIoU: 51.605%
44/60 background score: 0.440 mIoU: 50.279%
45/60 background score: 0.450 mIoU: 48.955%
46/60 background score: 0.460 mIoU: 47.630%
47/60 background score: 0.470 mIoU: 46.303%
48/60 background score: 0.480 mIoU: 44.982%
49/60 background score: 0.490 mIoU: 43.653%
50/60 background score: 0.500 mIoU: 42.330%
51/60 background score: 0.510 mIoU: 41.015%
52/60 background score: 0.520 mIoU: 39.709%
53/60 background score: 0.530 mIoU: 38.409%
54/60 background score: 0.540 mIoU: 37.119%
55/60 background score: 0.550 mIoU: 35.848%
56/60 background score: 0.560 mIoU: 34.601%
57/60 background score: 0.570 mIoU: 33.372%
58/60 background score: 0.580 mIoU: 32.158%
59/60 background score: 0.590 mIoU: 30.959%

With the 'resnet38_SEAM.pth' that I trained myself using the default settings (except that I used two GPU cards; the batch size was still set to 8), the SEAM step evaluation gives:

0/60 background score: 0.000 mIoU: 22.938%
1/60 background score: 0.010 mIoU: 26.294%
2/60 background score: 0.020 mIoU: 30.367%
3/60 background score: 0.030 mIoU: 33.779%
4/60 background score: 0.040 mIoU: 36.815%
5/60 background score: 0.050 mIoU: 39.461%
6/60 background score: 0.060 mIoU: 41.722%
7/60 background score: 0.070 mIoU: 43.691%
8/60 background score: 0.080 mIoU: 45.386%
9/60 background score: 0.090 mIoU: 46.875%
10/60 background score: 0.100 mIoU: 48.230%
11/60 background score: 0.110 mIoU: 49.466%
12/60 background score: 0.120 mIoU: 50.592%
13/60 background score: 0.130 mIoU: 51.575%
14/60 background score: 0.140 mIoU: 52.443%
15/60 background score: 0.150 mIoU: 53.182%
16/60 background score: 0.160 mIoU: 53.806%
17/60 background score: 0.170 mIoU: 54.334%
18/60 background score: 0.180 mIoU: 54.759%
19/60 background score: 0.190 mIoU: 55.087%
20/60 background score: 0.200 mIoU: 55.339%
21/60 background score: 0.210 mIoU: 55.510%
22/60 background score: 0.220 mIoU: 55.590%
23/60 background score: 0.230 mIoU: 55.594%
24/60 background score: 0.240 mIoU: 55.525%
25/60 background score: 0.250 mIoU: 55.382%
26/60 background score: 0.260 mIoU: 55.169%
27/60 background score: 0.270 mIoU: 54.892%
28/60 background score: 0.280 mIoU: 54.556%
29/60 background score: 0.290 mIoU: 54.155%
30/60 background score: 0.300 mIoU: 53.685%
31/60 background score: 0.310 mIoU: 53.182%
32/60 background score: 0.320 mIoU: 52.640%
33/60 background score: 0.330 mIoU: 52.064%
34/60 background score: 0.340 mIoU: 51.445%
35/60 background score: 0.350 mIoU: 50.793%
36/60 background score: 0.360 mIoU: 50.107%
37/60 background score: 0.370 mIoU: 49.380%
38/60 background score: 0.380 mIoU: 48.624%
39/60 background score: 0.390 mIoU: 47.837%
40/60 background score: 0.400 mIoU: 47.029%
41/60 background score: 0.410 mIoU: 46.199%
42/60 background score: 0.420 mIoU: 45.353%
43/60 background score: 0.430 mIoU: 44.483%
44/60 background score: 0.440 mIoU: 43.593%
45/60 background score: 0.450 mIoU: 42.681%
46/60 background score: 0.460 mIoU: 41.749%
47/60 background score: 0.470 mIoU: 40.809%
48/60 background score: 0.480 mIoU: 39.855%
49/60 background score: 0.490 mIoU: 38.890%
50/60 background score: 0.500 mIoU: 37.914%
51/60 background score: 0.510 mIoU: 36.934%
52/60 background score: 0.520 mIoU: 35.954%
53/60 background score: 0.530 mIoU: 34.974%
54/60 background score: 0.540 mIoU: 33.988%
55/60 background score: 0.550 mIoU: 32.998%
56/60 background score: 0.560 mIoU: 32.011%
57/60 background score: 0.570 mIoU: 31.033%
58/60 background score: 0.580 mIoU: 30.064%
59/60 background score: 0.590 mIoU: 29.102%
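
For readers unfamiliar with what these logs sweep over: each entry reports the mIoU of pseudo labels obtained when a constant background score competes with the per-class CAM scores at every pixel. The following is a minimal sketch of that idea, not the repository's evaluation script. It assumes CAMs are stored as a float array of shape (num_foreground_classes, H, W), that label 0 is background, and that 255 marks ignored pixels; the function names are illustrative.

```python
import numpy as np

def cam_to_label(cams, bg_score):
    """Turn CAMs of shape (C, H, W) into a label map; 0 is background."""
    h, w = cams.shape[1:]
    # Assumption: background is a constant score plane stacked before the classes.
    bg = np.full((1, h, w), bg_score, dtype=cams.dtype)
    return np.argmax(np.concatenate([bg, cams], axis=0), axis=0)

def mean_iou(preds, gts, num_classes, ignore_index=255):
    """Accumulate intersection and union per class over the whole dataset."""
    inter = np.zeros(num_classes)
    union = np.zeros(num_classes)
    for pred, gt in zip(preds, gts):
        valid = gt != ignore_index
        for c in range(num_classes):
            p = (pred == c) & valid
            g = (gt == c) & valid
            inter[c] += np.sum(p & g)
            union[c] += np.sum(p | g)
    present = union > 0  # average only over classes that occur at all
    return float(np.mean(inter[present] / union[present]))

def sweep(cams_list, gts, num_classes, thresholds):
    """Return (background score, mIoU) for each candidate background score."""
    results = []
    for t in thresholds:
        preds = [cam_to_label(cams, t) for cams in cams_list]
        results.append((t, mean_iou(preds, gts, num_classes)))
    return results
```

Because the ground-truth masks enter every term of the metric, evaluating against a different set of masks shifts the whole curve, not just its peak.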

bityangke commented 4 years ago

Has anyone encountered the same situation, or does anyone have an idea what causes it?

halbielee commented 4 years ago

Which dataset (train, train_aug, or val) do you use? Is it train? And npy or png?

I think the performance of your custom-trained version is similar to what I get when I run both the pretrained model and my own custom-trained one.

The performance you report for the pretrained model is too high.

bityangke commented 4 years ago

@halbielee Hi, I use the default training setting of this repo, that is, using voc12/train_aug.txt to train the model and evaluating on VOC2012/ImageSets/Segmentation/train.txt. The pretrained implementation I used is the trained model provided by the author @YudeWang.

halbielee commented 4 years ago

@bityangke I follow the same setting, but I get a different result from yours. Hmm, which pretrained model do you use, the one on Google Drive or the one on Baidu? I used the pretrained model on Google Drive and got a result similar to that of your custom-trained model.

Can you share the exact pretrained weight file [resnet38_SEAM.pth]? I will run it and let you know the result.

bityangke commented 4 years ago

@halbielee Hi, I use the pre-trained model on Google Drive.

halbielee commented 4 years ago

@bityangke I got the result from the pretrained model!

0/60 background score: 0.000 mIoU: 23.150%
1/60 background score: 0.010 mIoU: 25.909%
2/60 background score: 0.020 mIoU: 29.300%
3/60 background score: 0.030 mIoU: 32.255%
4/60 background score: 0.040 mIoU: 34.881%
5/60 background score: 0.050 mIoU: 37.225%
6/60 background score: 0.060 mIoU: 39.358%
7/60 background score: 0.070 mIoU: 41.260%
8/60 background score: 0.080 mIoU: 42.931%
9/60 background score: 0.090 mIoU: 44.451%
10/60 background score: 0.100 mIoU: 45.833%
11/60 background score: 0.110 mIoU: 47.077%
12/60 background score: 0.120 mIoU: 48.244%
13/60 background score: 0.130 mIoU: 49.304%
14/60 background score: 0.140 mIoU: 50.253%
15/60 background score: 0.150 mIoU: 51.123%
16/60 background score: 0.160 mIoU: 51.901%
17/60 background score: 0.170 mIoU: 52.578%
18/60 background score: 0.180 mIoU: 53.139%
19/60 background score: 0.190 mIoU: 53.637%
20/60 background score: 0.200 mIoU: 54.066%
21/60 background score: 0.210 mIoU: 54.565%
22/60 background score: 0.220 mIoU: 54.890%
23/60 background score: 0.230 mIoU: 55.141%
24/60 background score: 0.240 mIoU: 55.310%
25/60 background score: 0.250 mIoU: 55.391%
26/60 background score: 0.260 mIoU: 55.406%
27/60 background score: 0.270 mIoU: 55.346%
28/60 background score: 0.280 mIoU: 55.218%
29/60 background score: 0.290 mIoU: 55.034%
30/60 background score: 0.300 mIoU: 54.785%
31/60 background score: 0.310 mIoU: 54.476%
32/60 background score: 0.320 mIoU: 54.109%
33/60 background score: 0.330 mIoU: 53.684%
34/60 background score: 0.340 mIoU: 53.216%
35/60 background score: 0.350 mIoU: 52.717%
36/60 background score: 0.360 mIoU: 52.175%
37/60 background score: 0.370 mIoU: 51.596%
38/60 background score: 0.380 mIoU: 50.993%
39/60 background score: 0.390 mIoU: 50.355%
40/60 background score: 0.400 mIoU: 49.673%
41/60 background score: 0.410 mIoU: 48.949%
42/60 background score: 0.420 mIoU: 48.186%
43/60 background score: 0.430 mIoU: 47.392%
44/60 background score: 0.440 mIoU: 46.578%
45/60 background score: 0.450 mIoU: 45.736%
46/60 background score: 0.460 mIoU: 44.870%
47/60 background score: 0.470 mIoU: 43.983%
48/60 background score: 0.480 mIoU: 43.076%
49/60 background score: 0.490 mIoU: 42.141%
50/60 background score: 0.500 mIoU: 41.182%
51/60 background score: 0.510 mIoU: 40.207%
52/60 background score: 0.520 mIoU: 39.217%
53/60 background score: 0.530 mIoU: 38.211%
54/60 background score: 0.540 mIoU: 37.193%
55/60 background score: 0.550 mIoU: 36.172%
56/60 background score: 0.560 mIoU: 35.153%
57/60 background score: 0.570 mIoU: 34.131%
58/60 background score: 0.580 mIoU: 33.108%
59/60 background score: 0.590 mIoU: 32.082%

I got the same result, and 55.406% is the number reported in Table 1 of the paper.
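
The peak of a sweep like this can be read off programmatically rather than by eye. Here is a minimal sketch that parses log text in the format posted in this thread and returns the entry with the highest mIoU. The function name `best_threshold` is illustrative and not part of the repository.

```python
import re

# Matches one "background score: X mIoU: Y%" entry; \s+ also tolerates a
# line break between the score and the mIoU value.
ENTRY_RE = re.compile(r"background score:\s*([\d.]+)\s+mIoU:\s*([\d.]+)%")

def best_threshold(log_text):
    """Return (background_score, mIoU) for the highest mIoU in the log."""
    pairs = [(float(score), float(miou)) for score, miou in ENTRY_RE.findall(log_text)]
    if not pairs:
        raise ValueError("no 'background score ... mIoU ...' entries found")
    return max(pairs, key=lambda pair: pair[1])

# Three entries excerpted from the pretrained-model log in this thread.
log = """
25/60 background score: 0.250 mIoU: 55.391%
26/60 background score: 0.260 mIoU: 55.406%
27/60 background score: 0.270 mIoU: 55.346%
"""
print(best_threshold(log))
```

On the excerpt, the maximum is the entry at background score 0.260 with mIoU 55.406, the figure cited from Table 1.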

bityangke commented 4 years ago

@halbielee My bad. I need the CAMs generated by PSA for a downstream task, so I put the PSA CAMs in VOC2012/SegmentationClass/, replacing the original GT images. I evaluated the custom model on another machine with the original GT data, so that result was not affected. When I switched back to the correct GT, the result of the pretrained model was exactly the same as yours.
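
A mix-up like this, where ground-truth masks are silently replaced by other images, can be caught by fingerprinting the ground-truth directory and comparing the value across machines. Below is a minimal sketch using only the standard library. The function name `dir_fingerprint` is illustrative, and the path in the usage comment is taken from this thread rather than checked against any particular setup.

```python
import hashlib
from pathlib import Path

def dir_fingerprint(root, pattern="*.png"):
    """Hash file names and contents under root, in sorted order, into one digest."""
    digest = hashlib.sha256()
    for path in sorted(Path(root).glob(pattern)):
        digest.update(path.name.encode("utf-8"))
        digest.update(path.read_bytes())
    return digest.hexdigest()

# Usage (path is an assumption): run on each machine and compare the output.
# print(dir_fingerprint("VOC2012/SegmentationClass"))
```

If the two digests differ, the evaluations are being run against different ground truth and their mIoU numbers are not comparable.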

hchoi71 commented 3 years ago

is the number on the paper Ta

Does the pretrained model you used refer to 'ilsvrc-cls_rna-a1_cls1000_ep-0001.params' or 'resnet38_SEAM.pth'?

YudeWang / SEAM

Large performance gap between trained model using default setting and the provided trained model. #13