dvlab-research / UVTR

Unifying Voxel-based Representation with Transformer for 3D Object Detection (NeurIPS 2022)

AttributeError: 'list' object has no attribute 'new_zeros' #22

Open shb9793 opened 1 year ago

shb9793 commented 1 year ago

Sorry to bother you again. I have tried to reimplement your LiDAR-based model on a KITTI-like dataset, but after the first epoch the following error occurs:

2022-11-16 17:11:58,335 - mmdet - INFO - workflow: [('train', 1)], max: 40 epochs
/share/home/scz6240/.conda/envs/openmmlab0171/lib/python3.8/site-packages/torch/nn/functional.py:3981: UserWarning: Default grid_sample and affine_grid behavior has changed to align_corners=False since 1.3.0. Please specify align_corners=True if the old behavior is desired. See the documentation of grid_sample for details.
  warnings.warn(
2022-11-16 17:13:23,481 - mmdet - INFO - Epoch [1][50/2521] lr: 2.000e-05, eta: 1 day, 23:24:53, time: 1.694, data_time: 0.245, memory: 21946, loss_cls: 0.9841, loss_bbox: 2.1871, d0.loss_cls: 1.4273, d0.loss_bbox: 2.6728, d1.loss_cls: 1.1218, d1.loss_bbox: 2.3758, loss: 10.7690, grad_norm: 51.1293
2022-11-16 17:14:37,135 - mmdet - INFO - Epoch [1][100/2521]    lr: 2.000e-05, eta: 1 day, 20:18:22, time: 1.473, data_time: 0.014, memory: 21984, loss_cls: 0.4211, loss_bbox: 1.6809, d0.loss_cls: 0.4380, d0.loss_bbox: 2.1764, d1.loss_cls: 0.4186, d1.loss_bbox: 1.8243, loss: 6.9592, grad_norm: 42.8634
2022-11-16 17:15:51,275 - mmdet - INFO - Epoch [1][150/2521]    lr: 2.001e-05, eta: 1 day, 19:20:50, time: 1.483, data_time: 0.013, memory: 21984, loss_cls: 0.4054, loss_bbox: 1.5339, d0.loss_cls: 0.4055, d0.loss_bbox: 1.8876, d1.loss_cls: 0.4038, d1.loss_bbox: 1.6010, loss: 6.2371, grad_norm: 95.1544
2022-11-16 17:17:05,245 - mmdet - INFO - Epoch [1][200/2521]    lr: 2.001e-05, eta: 1 day, 18:50:00, time: 1.479, data_time: 0.011, memory: 21984, loss_cls: 0.3977, loss_bbox: 1.4369, d0.loss_cls: 0.4028, d0.loss_bbox: 1.7172, d1.loss_cls: 0.4005, d1.loss_bbox: 1.4918, loss: 5.8469, grad_norm: 119.0783
2022-11-16 17:18:19,204 - mmdet - INFO - Epoch [1][250/2521]    lr: 2.002e-05, eta: 1 day, 18:30:57, time: 1.479, data_time: 0.012, memory: 21984, loss_cls: 0.3829, loss_bbox: 1.3625, d0.loss_cls: 0.4055, d0.loss_bbox: 1.6472, d1.loss_cls: 0.3987, d1.loss_bbox: 1.4512, loss: 5.6480, grad_norm: 135.1175
2022-11-16 17:19:33,492 - mmdet - INFO - Epoch [1][300/2521]    lr: 2.002e-05, eta: 1 day, 18:19:40, time: 1.486, data_time: 0.013, memory: 21984, loss_cls: 0.3688, loss_bbox: 1.3312, d0.loss_cls: 0.4055, d0.loss_bbox: 1.6500, d1.loss_cls: 0.3922, d1.loss_bbox: 1.4440, loss: 5.5917, grad_norm: 139.5463
2022-11-16 17:20:47,706 - mmdet - INFO - Epoch [1][350/2521]    lr: 2.003e-05, eta: 1 day, 18:10:54, time: 1.484, data_time: 0.012, memory: 22019, loss_cls: 0.3540, loss_bbox: 1.2724, d0.loss_cls: 0.3996, d0.loss_bbox: 1.5833, d1.loss_cls: 0.3783, d1.loss_bbox: 1.3937, loss: 5.3813, grad_norm: 163.9016
2022-11-16 17:22:01,765 - mmdet - INFO - Epoch [1][400/2521]    lr: 2.004e-05, eta: 1 day, 18:03:22, time: 1.481, data_time: 0.012, memory: 22019, loss_cls: 0.3373, loss_bbox: 1.2172, d0.loss_cls: 0.3936, d0.loss_bbox: 1.5230, d1.loss_cls: 0.3601, d1.loss_bbox: 1.3354, loss: 5.1668, grad_norm: 174.4764
2022-11-16 17:23:15,944 - mmdet - INFO - Epoch [1][450/2521]    lr: 2.006e-05, eta: 1 day, 17:57:41, time: 1.484, data_time: 0.012, memory: 22019, loss_cls: 0.3214, loss_bbox: 1.2013, d0.loss_cls: 0.4001, d0.loss_bbox: 1.5164, d1.loss_cls: 0.3453, d1.loss_bbox: 1.3183, loss: 5.1028, grad_norm: 188.8877
2022-11-16 17:24:29,580 - mmdet - INFO - Epoch [1][500/2521]    lr: 2.007e-05, eta: 1 day, 17:51:05, time: 1.473, data_time: 0.012, memory: 22019, loss_cls: 0.3037, loss_bbox: 1.1773, d0.loss_cls: 0.3995, d0.loss_bbox: 1.5013, d1.loss_cls: 0.3298, d1.loss_bbox: 1.2953, loss: 5.0068, grad_norm: 197.3203
2022-11-16 17:25:43,725 - mmdet - INFO - Epoch [1][550/2521]    lr: 2.008e-05, eta: 1 day, 17:46:59, time: 1.483, data_time: 0.011, memory: 22019, loss_cls: 0.2847, loss_bbox: 1.1176, d0.loss_cls: 0.3936, d0.loss_bbox: 1.4716, d1.loss_cls: 0.3031, d1.loss_bbox: 1.2501, loss: 4.8207, grad_norm: 190.2340
2022-11-16 17:26:57,780 - mmdet - INFO - Epoch [1][600/2521]    lr: 2.010e-05, eta: 1 day, 17:43:08, time: 1.481, data_time: 0.012, memory: 22019, loss_cls: 0.2828, loss_bbox: 1.0936, d0.loss_cls: 0.3934, d0.loss_bbox: 1.4779, d1.loss_cls: 0.2948, d1.loss_bbox: 1.2363, loss: 4.7790, grad_norm: 198.6171
2022-11-16 17:28:11,650 - mmdet - INFO - Epoch [1][650/2521]    lr: 2.011e-05, eta: 1 day, 17:39:12, time: 1.477, data_time: 0.013, memory: 22019, loss_cls: 0.2745, loss_bbox: 1.0336, d0.loss_cls: 0.3889, d0.loss_bbox: 1.4437, d1.loss_cls: 0.2831, d1.loss_bbox: 1.1817, loss: 4.6055, grad_norm: 174.9819
2022-11-16 17:29:26,145 - mmdet - INFO - Epoch [1][700/2521]    lr: 2.013e-05, eta: 1 day, 17:37:08, time: 1.490, data_time: 0.013, memory: 22019, loss_cls: 0.2646, loss_bbox: 1.0278, d0.loss_cls: 0.3832, d0.loss_bbox: 1.4581, d1.loss_cls: 0.2719, d1.loss_bbox: 1.1750, loss: 4.5805, grad_norm: 196.6794
2022-11-16 17:30:40,141 - mmdet - INFO - Epoch [1][750/2521]    lr: 2.015e-05, eta: 1 day, 17:34:05, time: 1.480, data_time: 0.013, memory: 22019, loss_cls: 0.2662, loss_bbox: 0.9865, d0.loss_cls: 0.3821, d0.loss_bbox: 1.4307, d1.loss_cls: 0.2684, d1.loss_bbox: 1.1346, loss: 4.4686, grad_norm: 228.7645
2022-11-16 17:31:54,235 - mmdet - INFO - Epoch [1][800/2521]    lr: 2.017e-05, eta: 1 day, 17:31:27, time: 1.482, data_time: 0.012, memory: 22019, loss_cls: 0.2662, loss_bbox: 0.9459, d0.loss_cls: 0.3800, d0.loss_bbox: 1.4285, d1.loss_cls: 0.2684, d1.loss_bbox: 1.0973, loss: 4.3862, grad_norm: 208.7490
2022-11-16 17:33:08,368 - mmdet - INFO - Epoch [1][850/2521]    lr: 2.020e-05, eta: 1 day, 17:29:04, time: 1.483, data_time: 0.012, memory: 22019, loss_cls: 0.2593, loss_bbox: 0.9391, d0.loss_cls: 0.3730, d0.loss_bbox: 1.4255, d1.loss_cls: 0.2590, d1.loss_bbox: 1.0849, loss: 4.3407, grad_norm: 221.2812
2022-11-16 17:34:22,173 - mmdet - INFO - Epoch [1][900/2521]    lr: 2.022e-05, eta: 1 day, 17:26:12, time: 1.476, data_time: 0.014, memory: 22019, loss_cls: 0.2561, loss_bbox: 0.9201, d0.loss_cls: 0.3678, d0.loss_bbox: 1.4339, d1.loss_cls: 0.2527, d1.loss_bbox: 1.0674, loss: 4.2980, grad_norm: 232.0119
2022-11-16 17:35:36,218 - mmdet - INFO - Epoch [1][950/2521]    lr: 2.025e-05, eta: 1 day, 17:23:56, time: 1.481, data_time: 0.012, memory: 22019, loss_cls: 0.2566, loss_bbox: 0.8935, d0.loss_cls: 0.3706, d0.loss_bbox: 1.4161, d1.loss_cls: 0.2546, d1.loss_bbox: 1.0344, loss: 4.2258, grad_norm: 209.8056
2022-11-16 17:36:50,186 - mmdet - INFO - Exp name: uvtr_lidar_v005_h5_dair_base.py
2022-11-16 17:36:50,186 - mmdet - INFO - Epoch [1][1000/2521]   lr: 2.027e-05, eta: 1 day, 17:21:38, time: 1.479, data_time: 0.012, memory: 22019, loss_cls: 0.2621, loss_bbox: 0.8907, d0.loss_cls: 0.3714, d0.loss_bbox: 1.4253, d1.loss_cls: 0.2594, d1.loss_bbox: 1.0270, loss: 4.2359, grad_norm: 195.6460
2022-11-16 17:38:04,256 - mmdet - INFO - Epoch [1][1050/2521]   lr: 2.030e-05, eta: 1 day, 17:19:36, time: 1.481, data_time: 0.012, memory: 22019, loss_cls: 0.2440, loss_bbox: 0.8606, d0.loss_cls: 0.3578, d0.loss_bbox: 1.3989, d1.loss_cls: 0.2406, d1.loss_bbox: 0.9983, loss: 4.1002, grad_norm: 228.3762
2022-11-16 17:39:18,318 - mmdet - INFO - Epoch [1][1100/2521]   lr: 2.033e-05, eta: 1 day, 17:17:38, time: 1.481, data_time: 0.011, memory: 22019, loss_cls: 0.2528, loss_bbox: 0.8443, d0.loss_cls: 0.3680, d0.loss_bbox: 1.3890, d1.loss_cls: 0.2517, d1.loss_bbox: 0.9703, loss: 4.0760, grad_norm: 228.4887
2022-11-16 17:40:32,249 - mmdet - INFO - Epoch [1][1150/2521]   lr: 2.036e-05, eta: 1 day, 17:15:32, time: 1.479, data_time: 0.012, memory: 22019, loss_cls: 0.2454, loss_bbox: 0.8432, d0.loss_cls: 0.3524, d0.loss_bbox: 1.3916, d1.loss_cls: 0.2417, d1.loss_bbox: 0.9657, loss: 4.0400, grad_norm: 193.1862
2022-11-16 17:41:46,378 - mmdet - INFO - Epoch [1][1200/2521]   lr: 2.039e-05, eta: 1 day, 17:13:47, time: 1.483, data_time: 0.011, memory: 22019, loss_cls: 0.2477, loss_bbox: 0.8358, d0.loss_cls: 0.3528, d0.loss_bbox: 1.3768, d1.loss_cls: 0.2444, d1.loss_bbox: 0.9497, loss: 4.0072, grad_norm: 183.7507
2022-11-16 17:43:00,171 - mmdet - INFO - Epoch [1][1250/2521]   lr: 2.043e-05, eta: 1 day, 17:11:38, time: 1.476, data_time: 0.013, memory: 22019, loss_cls: 0.2449, loss_bbox: 0.8308, d0.loss_cls: 0.3465, d0.loss_bbox: 1.3713, d1.loss_cls: 0.2445, d1.loss_bbox: 0.9499, loss: 3.9879, grad_norm: 205.1622
2022-11-16 17:44:14,310 - mmdet - INFO - Epoch [1][1300/2521]   lr: 2.046e-05, eta: 1 day, 17:09:59, time: 1.483, data_time: 0.012, memory: 22019, loss_cls: 0.2431, loss_bbox: 0.8084, d0.loss_cls: 0.3487, d0.loss_bbox: 1.3483, d1.loss_cls: 0.2385, d1.loss_bbox: 0.9131, loss: 3.9001, grad_norm: 194.9144
2022-11-16 17:45:28,180 - mmdet - INFO - Epoch [1][1350/2521]   lr: 2.050e-05, eta: 1 day, 17:08:02, time: 1.477, data_time: 0.011, memory: 22019, loss_cls: 0.2390, loss_bbox: 0.8170, d0.loss_cls: 0.3428, d0.loss_bbox: 1.3517, d1.loss_cls: 0.2329, d1.loss_bbox: 0.9203, loss: 3.9038, grad_norm: 218.2914
2022-11-16 17:46:42,190 - mmdet - INFO - Epoch [1][1400/2521]   lr: 2.053e-05, eta: 1 day, 17:06:19, time: 1.480, data_time: 0.013, memory: 22019, loss_cls: 0.2407, loss_bbox: 0.8077, d0.loss_cls: 0.3378, d0.loss_bbox: 1.3599, d1.loss_cls: 0.2365, d1.loss_bbox: 0.9170, loss: 3.8997, grad_norm: 183.4504
2022-11-16 17:47:56,515 - mmdet - INFO - Epoch [1][1450/2521]   lr: 2.057e-05, eta: 1 day, 17:04:59, time: 1.486, data_time: 0.014, memory: 22019, loss_cls: 0.2392, loss_bbox: 0.8242, d0.loss_cls: 0.3317, d0.loss_bbox: 1.3597, d1.loss_cls: 0.2396, d1.loss_bbox: 0.9319, loss: 3.9264, grad_norm: 193.3301
2022-11-16 17:49:10,456 - mmdet - INFO - Epoch [1][1500/2521]   lr: 2.061e-05, eta: 1 day, 17:03:14, time: 1.479, data_time: 0.012, memory: 22019, loss_cls: 0.2360, loss_bbox: 0.8014, d0.loss_cls: 0.3198, d0.loss_bbox: 1.3446, d1.loss_cls: 0.2322, d1.loss_bbox: 0.9105, loss: 3.8444, grad_norm: 199.8085
2022-11-16 17:50:24,102 - mmdet - INFO - Epoch [1][1550/2521]   lr: 2.065e-05, eta: 1 day, 17:01:12, time: 1.473, data_time: 0.012, memory: 22019, loss_cls: 0.2332, loss_bbox: 0.7892, d0.loss_cls: 0.3172, d0.loss_bbox: 1.3103, d1.loss_cls: 0.2254, d1.loss_bbox: 0.8903, loss: 3.7657, grad_norm: 173.7445
2022-11-16 17:51:37,885 - mmdet - INFO - Epoch [1][1600/2521]   lr: 2.070e-05, eta: 1 day, 16:59:21, time: 1.476, data_time: 0.012, memory: 22019, loss_cls: 0.2333, loss_bbox: 0.8114, d0.loss_cls: 0.3123, d0.loss_bbox: 1.3273, d1.loss_cls: 0.2297, d1.loss_bbox: 0.9138, loss: 3.8278, grad_norm: 199.8943
2022-11-16 17:52:51,448 - mmdet - INFO - Epoch [1][1650/2521]   lr: 2.074e-05, eta: 1 day, 16:57:20, time: 1.471, data_time: 0.013, memory: 22019, loss_cls: 0.2292, loss_bbox: 0.7962, d0.loss_cls: 0.3172, d0.loss_bbox: 1.3213, d1.loss_cls: 0.2271, d1.loss_bbox: 0.8975, loss: 3.7886, grad_norm: 201.4746
2022-11-16 17:54:05,006 - mmdet - INFO - Epoch [1][1700/2521]   lr: 2.079e-05, eta: 1 day, 16:55:21, time: 1.471, data_time: 0.012, memory: 22019, loss_cls: 0.2382, loss_bbox: 0.8029, d0.loss_cls: 0.3163, d0.loss_bbox: 1.3239, d1.loss_cls: 0.2349, d1.loss_bbox: 0.9031, loss: 3.8194, grad_norm: 161.0498
2022-11-16 17:55:18,464 - mmdet - INFO - Epoch [1][1750/2521]   lr: 2.083e-05, eta: 1 day, 16:53:19, time: 1.469, data_time: 0.012, memory: 22019, loss_cls: 0.2330, loss_bbox: 0.8077, d0.loss_cls: 0.3111, d0.loss_bbox: 1.3382, d1.loss_cls: 0.2298, d1.loss_bbox: 0.9107, loss: 3.8306, grad_norm: 160.1327
2022-11-16 17:56:31,881 - mmdet - INFO - Epoch [1][1800/2521]   lr: 2.088e-05, eta: 1 day, 16:51:17, time: 1.468, data_time: 0.012, memory: 22019, loss_cls: 0.2326, loss_bbox: 0.7828, d0.loss_cls: 0.3100, d0.loss_bbox: 1.2945, d1.loss_cls: 0.2313, d1.loss_bbox: 0.8835, loss: 3.7346, grad_norm: 167.5468
2022-11-16 17:57:45,458 - mmdet - INFO - Epoch [1][1850/2521]   lr: 2.093e-05, eta: 1 day, 16:49:27, time: 1.472, data_time: 0.013, memory: 22019, loss_cls: 0.2278, loss_bbox: 0.7971, d0.loss_cls: 0.3011, d0.loss_bbox: 1.3097, d1.loss_cls: 0.2277, d1.loss_bbox: 0.8909, loss: 3.7543, grad_norm: 150.6342
2022-11-16 17:58:59,027 - mmdet - INFO - Epoch [1][1900/2521]   lr: 2.098e-05, eta: 1 day, 16:47:38, time: 1.471, data_time: 0.012, memory: 22019, loss_cls: 0.2273, loss_bbox: 0.7713, d0.loss_cls: 0.3019, d0.loss_bbox: 1.2775, d1.loss_cls: 0.2264, d1.loss_bbox: 0.8682, loss: 3.6727, grad_norm: 165.1924
2022-11-16 18:00:12,236 - mmdet - INFO - Epoch [1][1950/2521]   lr: 2.103e-05, eta: 1 day, 16:45:33, time: 1.464, data_time: 0.012, memory: 22019, loss_cls: 0.2252, loss_bbox: 0.7895, d0.loss_cls: 0.2985, d0.loss_bbox: 1.3028, d1.loss_cls: 0.2237, d1.loss_bbox: 0.8888, loss: 3.7285, grad_norm: 182.9384
2022-11-16 18:01:25,631 - mmdet - INFO - Exp name: uvtr_lidar_v005_h5_dair_base.py
2022-11-16 18:01:25,636 - mmdet - INFO - Epoch [1][2000/2521]   lr: 2.109e-05, eta: 1 day, 16:43:39, time: 1.468, data_time: 0.013, memory: 22019, loss_cls: 0.2275, loss_bbox: 0.7946, d0.loss_cls: 0.2998, d0.loss_bbox: 1.2840, d1.loss_cls: 0.2266, d1.loss_bbox: 0.8917, loss: 3.7242, grad_norm: 168.2174
2022-11-16 18:02:39,211 - mmdet - INFO - Epoch [1][2050/2521]   lr: 2.114e-05, eta: 1 day, 16:41:57, time: 1.472, data_time: 0.013, memory: 22019, loss_cls: 0.2159, loss_bbox: 0.7594, d0.loss_cls: 0.2909, d0.loss_bbox: 1.2688, d1.loss_cls: 0.2173, d1.loss_bbox: 0.8575, loss: 3.6099, grad_norm: 165.3231
2022-11-16 18:03:52,862 - mmdet - INFO - Epoch [1][2100/2521]   lr: 2.120e-05, eta: 1 day, 16:40:19, time: 1.473, data_time: 0.011, memory: 22019, loss_cls: 0.2234, loss_bbox: 0.7869, d0.loss_cls: 0.2921, d0.loss_bbox: 1.2852, d1.loss_cls: 0.2176, d1.loss_bbox: 0.8808, loss: 3.6860, grad_norm: 165.1741
2022-11-16 18:05:06,507 - mmdet - INFO - Epoch [1][2150/2521]   lr: 2.126e-05, eta: 1 day, 16:38:42, time: 1.473, data_time: 0.013, memory: 22019, loss_cls: 0.2214, loss_bbox: 0.7613, d0.loss_cls: 0.2893, d0.loss_bbox: 1.2568, d1.loss_cls: 0.2207, d1.loss_bbox: 0.8534, loss: 3.6028, grad_norm: 155.9333
2022-11-16 18:06:20,227 - mmdet - INFO - Epoch [1][2200/2521]   lr: 2.132e-05, eta: 1 day, 16:37:09, time: 1.474, data_time: 0.012, memory: 22034, loss_cls: 0.2240, loss_bbox: 0.7575, d0.loss_cls: 0.3003, d0.loss_bbox: 1.2418, d1.loss_cls: 0.2227, d1.loss_bbox: 0.8438, loss: 3.5903, grad_norm: 175.5790
2022-11-16 18:07:33,918 - mmdet - INFO - Epoch [1][2250/2521]   lr: 2.138e-05, eta: 1 day, 16:35:36, time: 1.474, data_time: 0.012, memory: 22034, loss_cls: 0.2296, loss_bbox: 0.7708, d0.loss_cls: 0.2934, d0.loss_bbox: 1.2698, d1.loss_cls: 0.2288, d1.loss_bbox: 0.8680, loss: 3.6603, grad_norm: 168.9581
2022-11-16 18:08:47,585 - mmdet - INFO - Epoch [1][2300/2521]   lr: 2.144e-05, eta: 1 day, 16:34:03, time: 1.473, data_time: 0.013, memory: 22037, loss_cls: 0.2158, loss_bbox: 0.7582, d0.loss_cls: 0.2880, d0.loss_bbox: 1.2530, d1.loss_cls: 0.2133, d1.loss_bbox: 0.8483, loss: 3.5765, grad_norm: 196.4898
2022-11-16 18:10:01,111 - mmdet - INFO - Epoch [1][2350/2521]   lr: 2.150e-05, eta: 1 day, 16:32:24, time: 1.471, data_time: 0.012, memory: 22037, loss_cls: 0.2122, loss_bbox: 0.7493, d0.loss_cls: 0.2837, d0.loss_bbox: 1.2204, d1.loss_cls: 0.2123, d1.loss_bbox: 0.8336, loss: 3.5114, grad_norm: 197.4392
2022-11-16 18:11:14,752 - mmdet - INFO - Epoch [1][2400/2521]   lr: 2.157e-05, eta: 1 day, 16:30:52, time: 1.473, data_time: 0.013, memory: 22037, loss_cls: 0.2184, loss_bbox: 0.7428, d0.loss_cls: 0.2821, d0.loss_bbox: 1.2388, d1.loss_cls: 0.2155, d1.loss_bbox: 0.8361, loss: 3.5336, grad_norm: 191.5447
2022-11-16 18:12:28,253 - mmdet - INFO - Epoch [1][2450/2521]   lr: 2.163e-05, eta: 1 day, 16:29:15, time: 1.470, data_time: 0.011, memory: 22037, loss_cls: 0.2189, loss_bbox: 0.7567, d0.loss_cls: 0.2813, d0.loss_bbox: 1.2494, d1.loss_cls: 0.2155, d1.loss_bbox: 0.8518, loss: 3.5737, grad_norm: 172.0291
2022-11-16 18:13:41,919 - mmdet - INFO - Epoch [1][2500/2521]   lr: 2.170e-05, eta: 1 day, 16:27:44, time: 1.473, data_time: 0.012, memory: 22037, loss_cls: 0.2170, loss_bbox: 0.7464, d0.loss_cls: 0.2834, d0.loss_bbox: 1.2192, d1.loss_cls: 0.2165, d1.loss_bbox: 0.8354, loss: 3.5180, grad_norm: 179.5014
2022-11-16 18:14:13,086 - mmdet - INFO - Saving checkpoint at 1 epochs
[                                                  ] 0/2016, elapsed: 0s, ETA:
Traceback (most recent call last):
  File "tools/train.py", line 248, in <module>
    main()
  File "tools/train.py", line 237, in main
    train_model(
  File "/share/home/scz6240/openmmlab0171/mmdetection3d/mmdet3d/apis/train.py", line 28, in train_model
    train_detector(
  File "/share/home/scz6240/.conda/envs/openmmlab0171/lib/python3.8/site-packages/mmdet/apis/train.py", line 170, in train_detector
    runner.run(data_loaders, cfg.workflow)
  File "/share/home/scz6240/.conda/envs/openmmlab0171/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 127, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/share/home/scz6240/.conda/envs/openmmlab0171/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 54, in train
    self.call_hook('after_train_epoch')
  File "/share/home/scz6240/.conda/envs/openmmlab0171/lib/python3.8/site-packages/mmcv/runner/base_runner.py", line 307, in call_hook
    getattr(hook, fn_name)(self)
  File "/share/home/scz6240/.conda/envs/openmmlab0171/lib/python3.8/site-packages/mmcv/runner/hooks/evaluation.py", line 237, in after_train_epoch
    self._do_evaluate(runner)
  File "/share/home/scz6240/.conda/envs/openmmlab0171/lib/python3.8/site-packages/mmdet/core/evaluation/eval_hooks.py", line 17, in _do_evaluate
    results = single_gpu_test(runner.model, self.dataloader, show=False)
  File "/share/home/scz6240/.conda/envs/openmmlab0171/lib/python3.8/site-packages/mmdet/apis/test.py", line 27, in single_gpu_test
    result = model(return_loss=False, rescale=True, **data)
  File "/share/home/scz6240/.conda/envs/openmmlab0171/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/share/home/scz6240/.conda/envs/openmmlab0171/lib/python3.8/site-packages/mmcv/parallel/data_parallel.py", line 42, in forward
    return super().forward(*inputs, **kwargs)
  File "/share/home/scz6240/.conda/envs/openmmlab0171/lib/python3.8/site-packages/torch/nn/parallel/data_parallel.py", line 166, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/share/home/scz6240/.conda/envs/openmmlab0171/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/share/home/scz6240/.conda/envs/openmmlab0171/lib/python3.8/site-packages/mmcv/runner/fp16_utils.py", line 186, in new_func
    return old_func(*args, **kwargs)
  File "/share/home/scz6240/openmmlab0171/mmdetection3d/projects_uvtr/mmdet3d_plugin/models/detectors/uvtr.py", line 257, in forward
    return self.forward_test(**kwargs)
  File "/share/home/scz6240/openmmlab0171/mmdetection3d/projects_uvtr/mmdet3d_plugin/models/detectors/uvtr.py", line 326, in forward_test
    results = self.simple_test(img_metas[0], points, img[0], **kwargs)
  File "/share/home/scz6240/openmmlab0171/mmdetection3d/projects_uvtr/mmdet3d_plugin/models/detectors/uvtr.py", line 345, in simple_test
    pts_feat, img_feats, img_depth = self.extract_feat(points=points, img=img, img_metas=img_metas)
  File "/share/home/scz6240/.conda/envs/openmmlab0171/lib/python3.8/site-packages/mmcv/runner/fp16_utils.py", line 98, in new_func
    return old_func(*args, **kwargs)
  File "/share/home/scz6240/openmmlab0171/mmdetection3d/projects_uvtr/mmdet3d_plugin/models/detectors/uvtr.py", line 187, in extract_feat
    pts_feats = self.extract_pts_feat(points, img_feats, img_metas)
  File "/share/home/scz6240/.conda/envs/openmmlab0171/lib/python3.8/site-packages/mmcv/runner/fp16_utils.py", line 186, in new_func
    return old_func(*args, **kwargs)
  File "/share/home/scz6240/openmmlab0171/mmdetection3d/projects_uvtr/mmdet3d_plugin/models/detectors/uvtr.py", line 132, in extract_pts_feat
    voxels, num_points, coors = self.voxelize(pts)
  File "/share/home/scz6240/.conda/envs/openmmlab0171/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "/share/home/scz6240/.conda/envs/openmmlab0171/lib/python3.8/site-packages/mmcv/runner/fp16_utils.py", line 186, in new_func
    return old_func(*args, **kwargs)
  File "/share/home/scz6240/openmmlab0171/mmdetection3d/mmdet3d/models/detectors/mvx_two_stage.py", line 225, in voxelize
    res_voxels, res_coors, res_num_points = self.pts_voxel_layer(res)
  File "/share/home/scz6240/.conda/envs/openmmlab0171/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/share/home/scz6240/openmmlab0171/mmdetection3d/mmdet3d/ops/voxel/voxelize.py", line 136, in forward
    return voxelization(input, self.voxel_size, self.point_cloud_range,
  File "/share/home/scz6240/openmmlab0171/mmdetection3d/mmdet3d/ops/voxel/voxelize.py", line 57, in forward
    voxels = points.new_zeros(
AttributeError: 'list' object has no attribute 'new_zeros'
shb9793 commented 1 year ago

And this is my config:

_base_ = [
    '../../../../configs/_base_/datasets/kitti-3d-3class.py',
    '../../../../configs/_base_/schedules/cyclic_40e.py',
    '../../../../configs/_base_/default_runtime.py'
]

plugin=True
plugin_dir='projects_uvtr/mmdet3d_plugin/'

# If point cloud range is changed, the models should also change their point
# cloud range accordingly
point_cloud_range = [0, -40, -3, 70.4, 40, 1]
voxel_size = [0.05, 0.05, 0.1]
fp16_enabled = True
bev_stride = 4
sample_num = 5
# KITTI-style 3-class detection
class_names = ['Pedestrian', 'Cyclist', 'Car']

input_modality = dict(
    use_lidar=True,
    use_camera=False,
    use_radar=False,
    use_map=False,
    use_external=False)

model = dict(
    type='UVTR',
    pts_voxel_layer=dict(
        max_num_points=5, voxel_size=voxel_size, max_voxels=(16000, 40000),
        point_cloud_range=point_cloud_range),
    pts_voxel_encoder=dict(type='HardSimpleVFE', num_features=4),
    pts_middle_encoder=dict(
        type='SparseEncoderHD',
        in_channels=4,
        sparse_shape=[41, 1600, 1408],
        output_channels=256,
        order=('conv', 'norm', 'act'),
        encoder_channels=((16, 16, 32), (32, 32, 64), (64, 64, 128), (128, 128)),
        encoder_paddings=((0, 0, 1), (0, 0, 1), (0, 0, [0, 1, 1]), (0, 0)),
        block_type='basicblock',
        fp16_enabled=False), # not enable FP16 here
    pts_backbone=dict(
        type='SECOND3D',
        in_channels=[256, 256, 256],
        out_channels=[128, 256, 512],
        layer_nums=[5, 5, 5],
        layer_strides=[1, 2, 4],
        is_cascade=False,
        norm_cfg=dict(type='BN3d', eps=1e-3, momentum=0.01),
        conv_cfg=dict(type='Conv3d', kernel=(1,3,3), bias=False)),
    pts_neck=dict(
        type='SECOND3DFPN',
        in_channels=[128, 256, 512],
        out_channels=[256, 256, 256],
        upsample_strides=[1, 2, 4],
        norm_cfg=dict(type='BN3d', eps=1e-3, momentum=0.01),
        upsample_cfg=dict(type='deconv3d', bias=False),
        extra_conv=dict(type='Conv3d', num_conv=3, bias=False),
        use_conv_for_no_stride=True),
    pts_bbox_head=dict(
        type='UVTRHead',
        # transformer_cfg
        num_query=300,
        num_classes=3,
        in_channels=256,
        sync_cls_avg_factor=True,
        with_box_refine=True,
        as_two_stage=False,
        transformer=dict(
            type='Uni3DDETR',
            fp16_enabled=fp16_enabled,
            decoder=dict(
                type='UniTransformerDecoder',
                num_layers=3,
                return_intermediate=True,
                transformerlayers=dict(
                    type='BaseTransformerLayer',
                    attn_cfgs=[
                        dict(
                            type='MultiheadAttention',
                            embed_dims=256,
                            num_heads=8,
                            dropout=0.1),
                        dict(
                            type='UniCrossAtten',
                            num_points=1,
                            embed_dims=256,
                            num_sweeps=1,
                            fp16_enabled=fp16_enabled)
                    ],
                    ffn_cfgs=dict(
                        type='FFN',
                        embed_dims=256,
                        feedforward_channels=512,
                        num_fcs=2,
                        ffn_drop=0.1,
                        act_cfg=dict(type='ReLU', inplace=True),
                    ),
                    norm_cfg=dict(type='LN'),
                    operation_order=('self_attn', 'norm', 'cross_attn', 'norm',
                                     'ffn', 'norm'))
            )
        ),
        bbox_coder=dict(
            type='NMSFreeCoder',
            post_center_range=[0, -40, -3, 70.4, 40, 1],
            pc_range=point_cloud_range,
            max_num=100,
            voxel_size=voxel_size,
            num_classes=3), 
        positional_encoding=dict(
            type='SinePositionalEncoding',
            num_feats=128,
            normalize=True,
            offset=-0.5),
        loss_cls=dict(
            type='FocalLoss',
            use_sigmoid=True,
            gamma=2.0,
            alpha=0.25,
            loss_weight=2.0),
        loss_bbox=dict(type='L1Loss', loss_weight=0.25),
        loss_iou=dict(type='GIoULoss', loss_weight=0.0),
        code_weights=[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
    ),
    # model training and testing settings
    train_cfg=dict(pts=dict(
        grid_size=[1408, 1600, 40],
        voxel_size=voxel_size,
        point_cloud_range=point_cloud_range,
        out_size_factor=bev_stride,
        assigner=dict(
            type='HungarianAssigner3D',
            cls_cost=dict(type='FocalLossCost', weight=2.0),
            reg_cost=dict(type='BBox3DL1Cost', weight=0.25),
            iou_cost=dict(type='IoUCost', weight=0.0), # Fake cost. This is just to make it compatible with DETR head. 
            pc_range=point_cloud_range))))

dataset_type = 'KittiDataset'
data_root = '/share/home/scz6240/openmmlab0171/DAIR-V2X-Dataset/single-infrastructure-side/'

file_client_args = dict(backend='disk')

db_sampler = dict(
    type='UnifiedDataBaseSampler',
    data_root=data_root,
    info_path=data_root + 'kitti_dbinfos_train.pkl', # please change to your own database file
    rate=1.0,
    prepare=dict(
        filter_by_difficulty=[-1],
        filter_by_min_points=dict(Car=5, Pedestrian=10, Cyclist=10)),
    classes=class_names,
    sample_groups=dict(Car=12, Pedestrian=10, Cyclist=10),
    points_loader=dict(
        type='LoadPointsFromFile',
        coord_type='LIDAR',
        load_dim=4,
        use_dim=[0, 1, 2, 3],
        file_client_args=file_client_args))

train_pipeline = [
    dict(
        type='LoadPointsFromFile',
        coord_type='LIDAR',
        load_dim=4,
        use_dim=4,
        file_client_args=dict(backend='disk')),
    dict(
        type='LoadAnnotations3D',
        with_bbox_3d=True,
        with_label_3d=True,
        file_client_args=dict(backend='disk')),
    dict(
        type='ObjectSample',
        db_sampler=dict(
            data_root=data_root,
            info_path=data_root + 'kitti_dbinfos_train.pkl',
            rate=1.0,
            prepare=dict(
                filter_by_difficulty=[-1],
                filter_by_min_points=dict(Car=5, Pedestrian=10, Cyclist=10)),
            classes=['Pedestrian', 'Cyclist', 'Car'],
            sample_groups=dict(Car=12, Pedestrian=10, Cyclist=10))),
    dict(
        type='ObjectNoise',
        num_try=100,
        translation_std=[1.0, 1.0, 0.5],
        global_rot_range=[0.0, 0.0],
        rot_range=[-0.78539816, 0.78539816]),
    dict(type='RandomFlip3D', flip_ratio_bev_horizontal=0.5),
    dict(
        type='GlobalRotScaleTrans',
        rot_range=[-0.78539816, 0.78539816],
        scale_ratio_range=[0.95, 1.05]),
    dict(
        type='PointsRangeFilter', point_cloud_range=[0, -40, -3, 70.4, 40, 1]),
    dict(
        type='ObjectRangeFilter', point_cloud_range=[0, -40, -3, 70.4, 40, 1]),
    dict(type='PointShuffle'),
    dict(
        type='DefaultFormatBundle3D',
        class_names=['Pedestrian', 'Cyclist', 'Car']),
    dict(type='Collect3D', keys=['points', 'gt_bboxes_3d', 'gt_labels_3d'])
]

test_pipeline = [
    dict(
        type='LoadPointsFromFile',
        coord_type='LIDAR',
        load_dim=4,
        use_dim=4,
        file_client_args=dict(backend='disk')),
    dict(
        type='MultiScaleFlipAug3D',
        img_scale=(1333, 800),
        pts_scale_ratio=1,
        flip=False,
        transforms=[
            dict(
                type='GlobalRotScaleTrans',
                rot_range=[0, 0],
                scale_ratio_range=[1.0, 1.0],
                translation_std=[0, 0, 0]),
            dict(type='RandomFlip3D'),
            dict(
                type='PointsRangeFilter',
                point_cloud_range=[0, -40, -3, 70.4, 40, 1]),
            dict(
                type='DefaultFormatBundle3D',
                class_names=['Pedestrian', 'Cyclist', 'Car'],
                with_label=False),
            dict(type='Collect3D', keys=['points'])
        ])
]

# construct a pipeline for data and gt loading in show function
# please keep its loading function consistent with test_pipeline (e.g. client)
eval_pipeline = [
    dict(
        type='LoadPointsFromFile',
        coord_type='LIDAR',
        load_dim=4,
        use_dim=4,
        file_client_args=dict(backend='disk')),
    dict(
        type='DefaultFormatBundle3D',
        class_names=['Pedestrian', 'Cyclist', 'Car'],
        with_label=False),
    dict(type='Collect3D', keys=['points'])
]

data = dict(
    samples_per_gpu=4,
    workers_per_gpu=8,
    train=dict(
        type='RepeatDataset',
        times=2,
        dataset=dict(
            type='KittiDataset',
            data_root='/share/home/scz6240/openmmlab0171/DAIR-V2X-Dataset/single-infrastructure-side/',
            ann_file=data_root + 'kitti_infos_train.pkl',
            split='training',
            pts_prefix='velodyne_reduced',
            pipeline=[
                dict(
                    type='LoadPointsFromFile',
                    coord_type='LIDAR',
                    load_dim=4,
                    use_dim=4,
                    file_client_args=dict(backend='disk')),
                dict(
                    type='LoadAnnotations3D',
                    with_bbox_3d=True,
                    with_label_3d=True,
                    file_client_args=dict(backend='disk')),
                dict(
                    type='ObjectSample',
                    db_sampler=dict(
                        data_root='/share/home/scz6240/openmmlab0171/DAIR-V2X-Dataset/single-infrastructure-side/',
                        info_path=data_root + 'kitti_dbinfos_train.pkl',
                        rate=1.0,
                        prepare=dict(
                            filter_by_difficulty=[-1],
                            filter_by_min_points=dict(
                                Car=5, Pedestrian=10, Cyclist=10)),
                        classes=['Pedestrian', 'Cyclist', 'Car'],
                        sample_groups=dict(Car=12, Pedestrian=10,
                                           Cyclist=10))),
                dict(
                    type='ObjectNoise',
                    num_try=100,
                    translation_std=[1.0, 1.0, 0.5],
                    global_rot_range=[0.0, 0.0],
                    rot_range=[-0.78539816, 0.78539816]),
                dict(type='RandomFlip3D', flip_ratio_bev_horizontal=0.5),
                dict(
                    type='GlobalRotScaleTrans',
                    rot_range=[-0.78539816, 0.78539816],
                    scale_ratio_range=[0.95, 1.05]),
                dict(
                    type='PointsRangeFilter',
                    point_cloud_range=[0, -40, -3, 70.4, 40, 1]),
                dict(
                    type='ObjectRangeFilter',
                    point_cloud_range=[0, -40, -3, 70.4, 40, 1]),
                dict(type='PointShuffle'),
                dict(
                    type='DefaultFormatBundle3D',
                    class_names=['Pedestrian', 'Cyclist', 'Car']),
                dict(
                    type='Collect3D',
                    keys=['points', 'gt_bboxes_3d', 'gt_labels_3d'])
            ],
            modality=dict(use_lidar=True, use_camera=False),
            classes=['Pedestrian', 'Cyclist', 'Car'],
            test_mode=False,
            box_type_3d='LiDAR')),
    val=dict(
        type='KittiDataset',
        data_root='/share/home/scz6240/openmmlab0171/DAIR-V2X-Dataset/single-infrastructure-side/',
        ann_file=data_root + 'kitti_infos_val.pkl',
        split='training',
        pts_prefix='velodyne_reduced',
        pipeline=[
            dict(
                type='LoadPointsFromFile',
                coord_type='LIDAR',
                load_dim=4,
                use_dim=4,
                file_client_args=dict(backend='disk')),
            dict(
                type='MultiScaleFlipAug3D',
                img_scale=(1333, 800),
                pts_scale_ratio=1,
                flip=False,
                transforms=[
                    dict(
                        type='GlobalRotScaleTrans',
                        rot_range=[0, 0],
                        scale_ratio_range=[1.0, 1.0],
                        translation_std=[0, 0, 0]),
                    dict(type='RandomFlip3D'),
                    dict(
                        type='PointsRangeFilter',
                        point_cloud_range=[0, -40, -3, 70.4, 40, 1]),
                    dict(
                        type='DefaultFormatBundle3D',
                        class_names=['Pedestrian', 'Cyclist', 'Car'],
                        with_label=False),
                    dict(type='Collect3D', keys=['points'])
                ])
        ],
        modality=dict(use_lidar=True, use_camera=False),
        classes=['Pedestrian', 'Cyclist', 'Car'],
        test_mode=True,
        box_type_3d='LiDAR'),
    test=dict(
        type='KittiDataset',
        data_root='/share/home/scz6240/openmmlab0171/DAIR-V2X-Dataset/single-infrastructure-side/',
        ann_file=data_root + 'kitti_infos_val.pkl',
        split='training',
        pts_prefix='velodyne_reduced',
        pipeline=[
            dict(
                type='LoadPointsFromFile',
                coord_type='LIDAR',
                load_dim=4,
                use_dim=4,
                file_client_args=dict(backend='disk')),
            dict(
                type='MultiScaleFlipAug3D',
                img_scale=(1333, 800),
                pts_scale_ratio=1,
                flip=False,
                transforms=[
                    dict(
                        type='GlobalRotScaleTrans',
                        rot_range=[0, 0],
                        scale_ratio_range=[1.0, 1.0],
                        translation_std=[0, 0, 0]),
                    dict(type='RandomFlip3D'),
                    dict(
                        type='PointsRangeFilter',
                        point_cloud_range=[0, -40, -3, 70.4, 40, 1]),
                    dict(
                        type='DefaultFormatBundle3D',
                        class_names=['Pedestrian', 'Cyclist', 'Car'],
                        with_label=False),
                    dict(type='Collect3D', keys=['points'])
                ])
        ],
        modality=dict(use_lidar=True, use_camera=False),
        classes=['Pedestrian', 'Cyclist', 'Car'],
        test_mode=True,
        box_type_3d='LiDAR'))

evaluation = dict(
    interval=1,
    pipeline=[
        dict(
            type='LoadPointsFromFile',
            coord_type='LIDAR',
            load_dim=4,
            use_dim=4,
            file_client_args=dict(backend='disk')),
        dict(
            type='DefaultFormatBundle3D',
            class_names=['Pedestrian', 'Cyclist', 'Car'],
            with_label=False),
        dict(type='Collect3D', keys=['points'])
    ])

checkpoint_config = dict(interval=1)
runner = dict(type='EpochBasedRunner', max_epochs=40)

optimizer = dict(type='AdamW', lr=2e-5, weight_decay=0.01)
optimizer_config = dict(grad_clip=dict(max_norm=10, norm_type=2))
work_dir = '/share/home/scz6240/openmmlab0171/mmdetection3d/work_dir/uvtr_dair'
find_unused_parameters = True
workflow = [('train', 1)]
gpu_ids = range(0, 1)
dist_params = dict(backend='nccl')
log_level = 'INFO'

# fp16 setting
fp16 = dict(loss_scale=32.)
shb9793 commented 1 year ago

It seems like an fp16 error, but I don't know how to resolve this issue. Looking forward to your suggestions. Many thanks!
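
One quick sanity check (my own assumption, not confirmed yet) would be to retrain briefly with the mixed-precision hook disabled in the config:

```python
# Comment out (or delete) the fp16 hook at the bottom of the config; if the
# AttributeError still appears after the first epoch, mixed precision is not
# the cause and the problem is in the data path instead.
# fp16 = dict(loss_scale=32.)
```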

yanwei-li commented 1 year ago

Hi, I guess you should check the data format of `points` in this call: `results = self.simple_test(img_metas[0], points, img[0], **kwargs)`. It seems `points` is a list here, while a tensor is expected (only tensors have `new_zeros`). It's better to modify it according to your data loader.
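
As a minimal sketch of that check (the `unwrap_points` helper and `DummyTensor` class are hypothetical names for illustration; the nesting pattern is an assumption about how a `MultiScaleFlipAug3D`-style test pipeline, like the one in the config above, wraps its outputs):

```python
def unwrap_points(points):
    """Strip outer list/tuple layers until a tensor-like object is reached.

    Test-time aug wrappers such as MultiScaleFlipAug3D can deliver the
    points as [[tensor]] rather than a bare tensor, so calling
    new_zeros() on the outer container raises
    "'list' object has no attribute 'new_zeros'".
    """
    while isinstance(points, (list, tuple)) and len(points) > 0:
        points = points[0]
    return points


class DummyTensor:
    """Hypothetical stand-in for torch.Tensor, just so this sketch runs."""

    def new_zeros(self, shape):
        # Mimic Tensor.new_zeros: a zero-filled array of the given shape.
        return [[0.0] * shape[1] for _ in range(shape[0])]


pts = unwrap_points([[DummyTensor()]])  # mimics the nested dataloader output
zeros = pts.new_zeros((1, 4))           # works once the list layers are gone
print(zeros)                            # [[0.0, 0.0, 0.0, 0.0]]
```

With a real dataloader the inner object would be a `torch.Tensor`, and the same loop stops at it because a tensor is neither a list nor a tuple.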

shb9793 commented 1 year ago

Excuse me, I don't know how to modify the data format of the input points. Could you give me some suggestions, please?