Try to reproduce voc 10-1 results

Ze-Yang commented 2 years ago

I am trying to reproduce 10-1 results as shown in the table below. I notice a large gap of the old class mIoU between my reproduce result (38.82) and your reported one (44.03), roughly 5 percent points. I am wondering what will cause this problem. I run the experiments with 2 x RTX 3090 GPU. I follow your original implementation except for the cuda version. I am using cuda 11.3 because cuda 10.2 does not support RTX 3090. Does it matter?

Btw, may I know what GPU model do you use? I think it requires to have at least 16G to hold a batch of 12 on each device and needs to support cuda 10.2 as well. V100? I guess.

Meanwhile, I notice a weird phenomenon that background performance drops drastically starting from the 8-th step and becomes 0 at 9-th step. I think this harms the old class performance a lot. Do you have a similar issue?

Thanks.

step	background	aeroplane	bicycle	bird	boat	bottle	bus	car	cat	chair	cow	diningtable	dog	horse	motorbike	person	pottedplant	sheep	sofa	train	tvmonitor	0-10	11-20	all
0	95.19%	89.84%	41.01%	89.74%	72.24%	85.02%	95.29%	88.26%	93.30%	43.15%	92.25%	x	x	x	x	x	x	x	x	x	x	80.48%	-	80.48%
1	92.01%	87.25%	35.76%	82.88%	67.51%	79.44%	94.49%	82.55%	88.44%	46.25%	90.74%	37.38%	x	x	x	x	x	x	x	x	x	77.03%	37.38%	73.73%
2	89.16%	84.00%	34.32%	83.28%	63.44%	68.45%	92.90%	82.54%	82.37%	31.68%	84.48%	24.68%	62.59%	x	x	x	x	x	x	x	x	72.42%	43.64%	67.99%
3	83.97%	79.07%	31.92%	73.50%	53.19%	51.06%	91.18%	80.90%	76.82%	17.40%	65.26%	18.69%	3.93%	22.11%	x	x	x	x	x	x	x	64.02%	14.91%	53.50%
4	79.89%	77.61%	35.83%	74.77%	47.67%	49.70%	90.73%	78.02%	79.41%	13.21%	65.81%	17.77%	9.43%	20.52%	33.81%	x	x	x	x	x	x	62.97%	20.38%	51.61%
5	84.56%	84.34%	37.88%	81.72%	53.81%	56.75%	90.72%	82.48%	82.98%	13.77%	69.20%	0.57%	0.00%	26.10%	58.37%	69.71%	x	x	x	x	x	67.11%	30.95%	55.81%
6	82.18%	80.07%	37.93%	69.65%	49.38%	55.32%	89.74%	81.63%	77.15%	14.10%	61.90%	0.48%	0.00%	23.42%	51.28%	64.77%	5.04%	x	x	x	x	63.55%	24.16%	49.65%
7	80.42%	77.61%	36.88%	50.15%	42.00%	53.55%	75.10%	80.31%	74.55%	10.30%	20.94%	0.05%	0.00%	24.42%	50.90%	57.63%	0.00%	19.33%	x	x	x	54.71%	21.76%	41.90%
8	44.45%	71.53%	35.75%	51.99%	44.17%	50.20%	80.98%	76.45%	69.19%	29.32%	40.46%	0.18%	0.00%	22.80%	51.03%	71.25%	0.00%	15.88%	3.26%	x	x	54.04%	20.55%	39.94%
9	0.55%	61.80%	35.53%	51.15%	44.62%	50.52%	69.98%	77.75%	57.82%	9.92%	16.76%	0.05%	0.00%	23.56%	46.89%	62.95%	0.00%	7.37%	1.49%	2.27%	x	43.31%	16.06%	31.05%
10	0.00%	50.48%	32.40%	38.14%	40.64%	51.90%	62.86%	69.76%	56.53%	17.47%	6.83%	0.02%	0.00%	23.25%	53.95%	66.92%	0.00%	1.73%	1.67%	0.04%	2.60%	38.82%	15.02%	27.49%

arthurdouillard commented 2 years ago

I was using 2 V100 GPUs.

Do you also use the same batch size as me?

I know that the mixed precision could something give different results depending on the GPU/CUDA. Have you tried without it?

Do you have a similar issue?

Unfortunately I have left my lab as I have finished my lab, and I don't have anymore those intermediary results.

Ze-Yang commented 2 years ago

I have tried it on 2x RTX 2080Ti with cuda 10.2 (see environment.txt for the full environment). However, I got almost similar results as running with 2x RTX 3090. I think the cuda version and pytorch version does not affect the results that much. It still exists some gap (4.5 percent point) in the old class performance. Looking forward to your advice. Thanks.

Do you also use the same batch size as me?

Yes, I run with the default batch size 24, each GPU with 12.

step	background	aeroplane	bicycle	bird	boat	bottle	bus	car	cat	chair	cow	diningtable	dog	horse	motorbike	person	pottedplant	sheep	sofa	train	tvmonitor	0-10	11-20	all
0	95.20%	89.52%	40.69%	89.28%	71.95%	84.31%	95.54%	88.19%	92.76%	46.13%	91.65%	x	x	x	x	x	x	x	x	x	x	80.48%		80.48%
1	92.01%	85.67%	36.34%	84.94%	69.18%	78.48%	94.33%	82.29%	87.31%	48.07%	90.68%	37.13%	x	x	x	x	x	x	x	x	x	77.21%	37.13%	73.87%
2	88.68%	81.91%	35.53%	83.66%	63.85%	66.64%	93.23%	79.39%	81.74%	30.81%	85.40%	28.75%	61.16%	x	x	x	x	x	x	x	x	71.90%	44.96%	67.75%
3	83.42%	81.38%	32.47%	77.93%	55.47%	54.16%	91.72%	79.14%	73.56%	15.42%	67.04%	20.92%	13.42%	23.46%	x	x	x	x	x	x	x	64.70%	19.26%	54.96%
4	77.20%	73.92%	35.52%	72.78%	46.47%	52.34%	89.25%	77.28%	75.36%	10.86%	71.35%	17.17%	15.28%	21.92%	32.24%	x	x	x	x	x	x	62.03%	21.65%	51.26%
5	82.93%	85.78%	38.60%	78.68%	54.56%	58.62%	89.83%	81.99%	81.21%	11.34%	72.56%	0.38%	0.00%	29.06%	56.57%	69.71%	x	x	x	x	x	66.92%	31.14%	55.74%
6	80.73%	79.88%	37.65%	67.48%	50.95%	55.54%	90.07%	81.10%	78.17%	11.73%	65.35%	0.70%	0.00%	25.87%	47.94%	64.26%	5.08%	x	x	x	x	63.51%	23.98%	49.56%
7	76.70%	78.55%	37.95%	60.99%	40.11%	54.74%	71.47%	79.51%	73.48%	8.12%	36.72%	0.06%	0.00%	26.92%	41.19%	56.38%	0.00%	20.49%	x	x	x	56.21%	20.72%	42.41%
8	27.93%	71.21%	35.10%	49.39%	41.93%	52.80%	76.69%	76.92%	71.46%	23.04%	38.82%	0.08%	0.00%	25.62%	47.53%	70.07%	0.00%	0.00%	2.51%	x	x	51.39%	18.22%	37.43%
9	0.00%	65.28%	34.59%	52.03%	41.87%	54.80%	68.75%	76.32%	62.90%	6.32%	30.98%	0.01%	0.00%	25.35%	51.62%	59.17%	0.00%	0.00%	1.46%	6.82%	x	44.89%	16.05%	31.91%
10	0.00%	59.74%	23.64%	41.97%	37.19%	56.25%	58.29%	74.52%	59.88%	11.13%	11.89%	0.02%	0.00%	24.02%	55.30%	64.20%	0.00%	0.00%	1.70%	0.00%	3.38%	39.50%	14.86%	27.77%

Ze-Yang commented 2 years ago

@arthurdouillard May I know do you use different hyperparameter settings for different tasks, e.g., 10-1, 15-5, 15-1, etc.? Because I can reproduce the results for 15-1.

arthurdouillard / CVPR2021_PLOP

Try to reproduce voc 10-1 results #37