OpenGVLab / DCNv4

[CVPR 2024] Deformable Convolution v4
https://arxiv.org/pdf/2401.06197.pdf
MIT License

Why is training speed slower than DCNv3? #54

Open zdk258 opened 3 months ago

zdk258 commented 3 months ago

DCNv3.txt DCNv4.txt

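Thanks for your great work! DCNv4 uses less GPU memory than DCNv3 (memory: 7199 vs. 8663 in the logs below), but it trains more slowly: about 1.10-1.32 s per iteration versus about 0.88-1.12 s for DCNv3. data_time is similar in both runs, so data loading does not seem to explain the gap.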

Training logs:

DCNv3:
> 2024-06-03 16:49:36,426 - mmseg - INFO - Iter(val) [363]  aAcc: 0.9086, mIoU: 0.5908, mAcc: 0.7245, IoU.background: 0.9170, IoU.aeroplane: 0.8108, IoU.bicycle: 0.0003, IoU.bird: 0.8187, IoU.boat: 0.6584, IoU.bottle: 0.6627, IoU.bus: 0.7321, IoU.car: 0.7177, IoU.cat: 0.7774, IoU.chair: 0.1220, IoU.cow: 0.3787, IoU.diningtable: 0.4631, IoU.dog: 0.6879, IoU.horse: 0.5629, IoU.motorbike: 0.6305, IoU.person: 0.8057, IoU.pottedplant: 0.3744, IoU.sheep: 0.6976, IoU.sofa: 0.3830, IoU.train: 0.6973, IoU.tvmonitor: 0.5082, Acc.background: 0.9549, Acc.aeroplane: 0.9251, Acc.bicycle: 0.0003, Acc.bird: 0.9474, Acc.boat: 0.8029, Acc.bottle: 0.7956, Acc.bus: 0.7599, Acc.car: 0.8726, Acc.cat: 0.9606, Acc.chair: 0.1382, Acc.cow: 0.4239, Acc.diningtable: 0.7624, Acc.dog: 0.7852, Acc.horse: 0.8660, Acc.motorbike: 0.9197, Acc.person: 0.8853, Acc.pottedplant: 0.4635, Acc.sheep: 0.8879, Acc.sofa: 0.6531, Acc.train: 0.8360, Acc.tvmonitor: 0.5736
> 2024-06-03 16:50:32,046 - mmseg - INFO - Iter [1050/10000]    lr: 3.756e-05, eta: 3:05:25, time: 4.757, data_time: 3.722, memory: 8663, decode.loss_ce: 0.4578, decode.acc_seg: 88.3050, loss: 0.4578
> 2024-06-03 16:51:27,163 - mmseg - INFO - Iter [1100/10000]    lr: 3.913e-05, eta: 3:03:25, time: 1.101, data_time: 0.076, memory: 8663, decode.loss_ce: 0.4152, decode.acc_seg: 89.2161, loss: 0.4152
> 2024-06-03 16:52:16,599 - mmseg - INFO - Iter [1150/10000]    lr: 4.068e-05, eta: 3:00:48, time: 0.989, data_time: 0.023, memory: 8663, decode.loss_ce: 0.3748, decode.acc_seg: 90.1542, loss: 0.3748
> 2024-06-03 16:53:09,808 - mmseg - INFO - Iter [1200/10000]    lr: 4.221e-05, eta: 2:58:48, time: 1.064, data_time: 0.070, memory: 8663, decode.loss_ce: 0.3845, decode.acc_seg: 88.6825, loss: 0.3845
> 2024-06-03 16:54:02,259 - mmseg - INFO - Iter [1250/10000]    lr: 4.372e-05, eta: 2:56:48, time: 1.050, data_time: 0.025, memory: 8663, decode.loss_ce: 0.3242, decode.acc_seg: 90.5948, loss: 0.3242
> 2024-06-03 16:54:51,308 - mmseg - INFO - Iter [1300/10000]    lr: 4.521e-05, eta: 2:54:29, time: 0.980, data_time: 0.061, memory: 8663, decode.loss_ce: 0.3314, decode.acc_seg: 90.5931, loss: 0.3314
> 2024-06-03 16:55:44,803 - mmseg - INFO - Iter [1350/10000]    lr: 4.668e-05, eta: 2:52:45, time: 1.066, data_time: 0.022, memory: 8663, decode.loss_ce: 0.2965, decode.acc_seg: 91.7068, loss: 0.2965
> 2024-06-03 16:56:34,855 - mmseg - INFO - Iter [1400/10000]    lr: 4.813e-05, eta: 2:50:45, time: 1.003, data_time: 0.072, memory: 8663, decode.loss_ce: 0.2950, decode.acc_seg: 91.4018, loss: 0.2950
> 2024-06-03 16:57:18,536 - mmseg - INFO - Iter [1450/10000]    lr: 4.956e-05, eta: 2:48:12, time: 0.876, data_time: 0.019, memory: 8663, decode.loss_ce: 0.2735, decode.acc_seg: 91.9677, loss: 0.2735
> 2024-06-03 16:58:14,582 - mmseg - INFO - Iter [1500/10000]    lr: 5.097e-05, eta: 2:46:56, time: 1.120, data_time: 0.071, memory: 8663, decode.loss_ce: 0.2773, decode.acc_seg: 91.6729, loss: 0.2773
> 2024-06-03 16:59:09,946 - mmseg - INFO - Iter [1550/10000]    lr: 5.071e-05, eta: 2:45:38, time: 1.108, data_time: 0.072, memory: 8663, decode.loss_ce: 0.2390, decode.acc_seg: 92.6893, loss: 0.2390
> 2024-06-03 17:00:03,287 - mmseg - INFO - Iter [1600/10000]    lr: 5.041e-05, eta: 2:44:11, time: 1.066, data_time: 0.026, memory: 8663, decode.loss_ce: 0.2282, decode.acc_seg: 92.9937, loss: 0.2282
> 2024-06-03 17:00:57,435 - mmseg - INFO - Iter [1650/10000]    lr: 5.011e-05, eta: 2:42:49, time: 1.083, data_time: 0.076, memory: 8663, decode.loss_ce: 0.2353, decode.acc_seg: 92.7644, loss: 0.2353
> 2024-06-03 17:01:47,666 - mmseg - INFO - Iter [1700/10000]    lr: 4.981e-05, eta: 2:41:10, time: 1.005, data_time: 0.023, memory: 8663, decode.loss_ce: 0.2397, decode.acc_seg: 92.2654, loss: 0.2397
> 2024-06-03 17:02:42,326 - mmseg - INFO - Iter [1750/10000]    lr: 4.951e-05, eta: 2:39:55, time: 1.094, data_time: 0.072, memory: 8663, decode.loss_ce: 0.2177, decode.acc_seg: 93.1369, loss: 0.2177
> 2024-06-03 17:03:35,397 - mmseg - INFO - Iter [1800/10000]    lr: 4.921e-05, eta: 2:38:34, time: 1.061, data_time: 0.026, memory: 8663, decode.loss_ce: 0.2136, decode.acc_seg: 92.9683, loss: 0.2136
> 2024-06-03 17:04:29,093 - mmseg - INFO - Iter [1850/10000]    lr: 4.891e-05, eta: 2:37:17, time: 1.074, data_time: 0.069, memory: 8663, decode.loss_ce: 0.1908, decode.acc_seg: 94.1496, loss: 0.1908
> 2024-06-03 17:05:15,609 - mmseg - INFO - Iter [1900/10000]    lr: 4.861e-05, eta: 2:35:30, time: 0.930, data_time: 0.018, memory: 8663, decode.loss_ce: 0.1911, decode.acc_seg: 93.8499, loss: 0.1911
> 2024-06-03 17:06:06,187 - mmseg - INFO - Iter [1950/10000]    lr: 4.831e-05, eta: 2:34:04, time: 1.011, data_time: 0.065, memory: 8663, decode.loss_ce: 0.1897, decode.acc_seg: 93.8007, loss: 0.1897

DCNv4:
> 2024-06-03 20:42:41,716 - mmseg - INFO - Iter(val) [363]    aAcc: 0.9354, mIoU: 0.7368, mAcc: 0.8772, IoU.background: 0.9283, IoU.aeroplane: 0.8169, IoU.bicycle: 0.4708, IoU.bird: 0.8664, IoU.boat: 0.7158, IoU.bottle: 0.6702, IoU.bus: 0.8876, IoU.car: 0.8332, IoU.cat: 0.9000, IoU.chair: 0.3353, IoU.cow: 0.8320, IoU.diningtable: 0.5789, IoU.dog: 0.8764, IoU.horse: 0.8228, IoU.motorbike: 0.8233, IoU.person: 0.8135, IoU.pottedplant: 0.4812, IoU.sheep: 0.8459, IoU.sofa: 0.5462, IoU.train: 0.8361, IoU.tvmonitor: 0.5913, Acc.background: 0.9469, Acc.aeroplane: 0.9469, Acc.bicycle: 0.7323, Acc.bird: 0.9528, Acc.boat: 0.9069, Acc.bottle: 0.9136, Acc.bus: 0.9823, Acc.car: 0.9325, Acc.cat: 0.9644, Acc.chair: 0.4444, Acc.cow: 0.9650, Acc.diningtable: 0.7784, Acc.dog: 0.9486, Acc.horse: 0.9096, Acc.motorbike: 0.9541, Acc.person: 0.9428, Acc.pottedplant: 0.6242, Acc.sheep: 0.9183, Acc.sofa: 0.8038, Acc.train: 0.9066, Acc.tvmonitor: 0.9477
> 2024-06-03 20:43:39,370 - mmseg - INFO - Iter [1050/10000]  lr: 3.756e-05, eta: 3:42:05, time: 3.998, data_time: 2.911, memory: 7199, decode.loss_ce: 0.3133, decode.acc_seg: 93.0568, loss: 0.3133
> 2024-06-03 20:44:38,501 - mmseg - INFO - Iter [1100/10000]  lr: 3.913e-05, eta: 3:38:46, time: 1.181, data_time: 0.064, memory: 7199, decode.loss_ce: 0.3005, decode.acc_seg: 93.1474, loss: 0.3005
> 2024-06-03 20:45:37,650 - mmseg - INFO - Iter [1150/10000]  lr: 4.068e-05, eta: 3:35:41, time: 1.185, data_time: 0.025, memory: 7199, decode.loss_ce: 0.2508, decode.acc_seg: 94.3911, loss: 0.2508
> 2024-06-03 20:46:42,845 - mmseg - INFO - Iter [1200/10000]  lr: 4.221e-05, eta: 3:33:29, time: 1.303, data_time: 0.073, memory: 7199, decode.loss_ce: 0.2463, decode.acc_seg: 94.0225, loss: 0.2463
> 2024-06-03 20:47:44,980 - mmseg - INFO - Iter [1250/10000]  lr: 4.372e-05, eta: 3:31:02, time: 1.243, data_time: 0.025, memory: 7199, decode.loss_ce: 0.2126, decode.acc_seg: 94.9353, loss: 0.2126
> 2024-06-03 20:48:48,852 - mmseg - INFO - Iter [1300/10000]  lr: 4.521e-05, eta: 3:28:52, time: 1.275, data_time: 0.074, memory: 7199, decode.loss_ce: 0.2152, decode.acc_seg: 94.4182, loss: 0.2152
> 2024-06-03 20:49:47,437 - mmseg - INFO - Iter [1350/10000]  lr: 4.668e-05, eta: 3:26:14, time: 1.173, data_time: 0.023, memory: 7199, decode.loss_ce: 0.1946, decode.acc_seg: 95.0412, loss: 0.1946
> 2024-06-03 20:50:48,074 - mmseg - INFO - Iter [1400/10000]  lr: 4.813e-05, eta: 3:23:56, time: 1.211, data_time: 0.067, memory: 7199, decode.loss_ce: 0.1913, decode.acc_seg: 94.9563, loss: 0.1913
> 2024-06-03 20:51:50,418 - mmseg - INFO - Iter [1450/10000]  lr: 4.956e-05, eta: 3:21:53, time: 1.248, data_time: 0.030, memory: 7199, decode.loss_ce: 0.1796, decode.acc_seg: 94.8814, loss: 0.1796
> 2024-06-03 20:52:52,312 - mmseg - INFO - Iter [1500/10000]  lr: 5.097e-05, eta: 3:19:52, time: 1.238, data_time: 0.079, memory: 7199, decode.loss_ce: 0.1682, decode.acc_seg: 95.3622, loss: 0.1682
> 2024-06-03 20:53:47,519 - mmseg - INFO - Iter [1550/10000]  lr: 5.071e-05, eta: 3:17:17, time: 1.104, data_time: 0.063, memory: 7199, decode.loss_ce: 0.1678, decode.acc_seg: 95.3572, loss: 0.1678
> 2024-06-03 20:54:46,438 - mmseg - INFO - Iter [1600/10000]  lr: 5.041e-05, eta: 3:15:09, time: 1.178, data_time: 0.021, memory: 7199, decode.loss_ce: 0.1428, decode.acc_seg: 95.9335, loss: 0.1428
> 2024-06-03 20:55:51,345 - mmseg - INFO - Iter [1650/10000]  lr: 5.011e-05, eta: 3:13:34, time: 1.296, data_time: 0.080, memory: 7199, decode.loss_ce: 0.1509, decode.acc_seg: 95.5363, loss: 0.1509
> 2024-06-03 20:56:54,545 - mmseg - INFO - Iter [1700/10000]  lr: 4.981e-05, eta: 3:11:54, time: 1.265, data_time: 0.026, memory: 7199, decode.loss_ce: 0.1341, decode.acc_seg: 96.3573, loss: 0.1341
> 2024-06-03 20:57:59,156 - mmseg - INFO - Iter [1750/10000]  lr: 4.951e-05, eta: 3:10:22, time: 1.291, data_time: 0.074, memory: 7199, decode.loss_ce: 0.1381, decode.acc_seg: 95.9925, loss: 0.1381
> 2024-06-03 20:59:01,535 - mmseg - INFO - Iter [1800/10000]  lr: 4.921e-05, eta: 3:08:42, time: 1.248, data_time: 0.029, memory: 7199, decode.loss_ce: 0.1430, decode.acc_seg: 95.7626, loss: 0.1430
> 2024-06-03 20:59:01,535 - mmseg - INFO - Iter [1850/10000]  lr: 4.891e-05, eta: 3:07:20, time: 1.323, data_time: 0.076, memory: 7199, decode.loss_ce: 0.1400, decode.acc_seg: 95.8467, loss: 0.1400
> 2024-06-03 21:01:05,170 - mmseg - INFO - Iter [1900/10000]  lr: 4.861e-05, eta: 3:05:22, time: 1.151, data_time: 0.022, memory: 7199, decode.loss_ce: 0.1266, decode.acc_seg: 96.1088, loss: 0.1266
> 2024-06-03 21:02:07,085 - mmseg - INFO - Iter [1950/10000]  lr: 4.831e-05, eta: 3:03:46, time: 1.239, data_time: 0.068, memory: 7199, decode.loss_ce: 0.1277, decode.acc_seg: 95.9606, loss: 0.1277
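
To put a number on the gap between the two runs, the logs can be summarized with a small script. The sketch below is only an illustration: `summarize`, `compare`, and the `max_data_time` cutoff are hypothetical names and choices, not part of mmseg or DCNv4. It assumes the `time`, `data_time`, and `memory` fields follow the layout shown in the logs above, and it skips the first interval after each validation pass, where `data_time` spikes and the reading says little about the operator itself.

```python
import re
from statistics import mean

# Matches training lines such as "Iter [1100/10000] ... time: 1.101, data_time: 0.076, memory: 8663".
# Validation lines ("Iter(val) [363] ...") do not match because of the "Iter [" prefix.
TRAIN_LINE = re.compile(
    r"Iter \[(?P<it>\d+)/\d+\].*?\btime: (?P<time>[\d.]+), "
    r"data_time: (?P<data>[\d.]+), memory: (?P<mem>\d+)"
)


def summarize(log_text, max_data_time=1.0):
    """Mean logged time per iteration and peak logged memory for one run.

    Intervals with data_time above max_data_time (an arbitrary cutoff chosen
    for this sketch) are skipped, since they mostly reflect data loading
    right after validation.
    """
    times, mems = [], []
    for line in log_text.splitlines():
        m = TRAIN_LINE.search(line)
        if m is None or float(m["data"]) > max_data_time:
            continue
        times.append(float(m["time"]))
        mems.append(int(m["mem"]))
    if not times:
        raise ValueError("no usable training lines found")
    return {
        "intervals": len(times),
        "sec_per_iter": mean(times),
        "peak_memory": max(mems),
    }


def compare(v3_text, v4_text):
    """Return (DCNv4 time / DCNv3 time, DCNv3 summary, DCNv4 summary)."""
    v3, v4 = summarize(v3_text), summarize(v4_text)
    return v4["sec_per_iter"] / v3["sec_per_iter"], v3, v4


# A few lines copied from the logs above, used as a quick check.
sample_v3 = """\
2024-06-03 16:50:32,046 - mmseg - INFO - Iter [1050/10000]    lr: 3.756e-05, eta: 3:05:25, time: 4.757, data_time: 3.722, memory: 8663, decode.loss_ce: 0.4578, decode.acc_seg: 88.3050, loss: 0.4578
2024-06-03 16:51:27,163 - mmseg - INFO - Iter [1100/10000]    lr: 3.913e-05, eta: 3:03:25, time: 1.101, data_time: 0.076, memory: 8663, decode.loss_ce: 0.4152, decode.acc_seg: 89.2161, loss: 0.4152
2024-06-03 16:52:16,599 - mmseg - INFO - Iter [1150/10000]    lr: 4.068e-05, eta: 3:00:48, time: 0.989, data_time: 0.023, memory: 8663, decode.loss_ce: 0.3748, decode.acc_seg: 90.1542, loss: 0.3748
"""
sample_v4 = """\
2024-06-03 20:43:39,370 - mmseg - INFO - Iter [1050/10000]  lr: 3.756e-05, eta: 3:42:05, time: 3.998, data_time: 2.911, memory: 7199, decode.loss_ce: 0.3133, decode.acc_seg: 93.0568, loss: 0.3133
2024-06-03 20:44:38,501 - mmseg - INFO - Iter [1100/10000]  lr: 3.913e-05, eta: 3:38:46, time: 1.181, data_time: 0.064, memory: 7199, decode.loss_ce: 0.3005, decode.acc_seg: 93.1474, loss: 0.3005
2024-06-03 20:45:37,650 - mmseg - INFO - Iter [1150/10000]  lr: 4.068e-05, eta: 3:35:41, time: 1.185, data_time: 0.025, memory: 7199, decode.loss_ce: 0.2508, decode.acc_seg: 94.3911, loss: 0.2508
"""

ratio, v3, v4 = compare(sample_v3, sample_v4)
print(f"DCNv3: {v3['sec_per_iter']:.3f} s/iter, memory {v3['peak_memory']}")
print(f"DCNv4: {v4['sec_per_iter']:.3f} s/iter, memory {v4['peak_memory']}")
print(f"DCNv4 / DCNv3 time ratio: {ratio:.2f}")
```

For the full comparison, the contents of the attached DCNv3.txt and DCNv4.txt could be passed to `compare` in place of the sample strings. A ratio above 1 means DCNv4 is slower per logged interval; this only describes the end-to-end iteration time and does not isolate the DCN operator from the rest of the model.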