dvlab-research / LargeKernel3D

LargeKernel3D: Scaling up Kernels in 3D Sparse CNNs (CVPR 2023)
https://arxiv.org/abs/2206.10555
Apache License 2.0

'trunc_normal_' import error #7

Closed (fjzpcmj closed this issue 1 year ago)

fjzpcmj commented 1 year ago

https://github.com/dvlab-research/LargeKernel3D/blob/ca786a7a9fa6531db39da9b4eb0dc5149bbdc312/object-detection/det3d/models/backbones/scn_largekernel.py#L132

When running detection with spatialgroupconvv2, this line raises the error "NameError: name 'trunc_normal_' is not defined".

yukang2017 commented 1 year ago

Hi,

Thanks. I fixed it with this line.

https://github.com/dvlab-research/LargeKernel3D/blob/518ddbe64aa79285fbe920f15ad7ca0207416ae4/object-detection/det3d/models/backbones/scn_largekernel.py#L10

Regards, Yukang Chen
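
For readers hitting the same NameError on an older checkout: the error means `trunc_normal_` is called without being imported. The thread does not quote the fixed line, so the sketch below is only an assumption about where such a function can come from. It probes two commonly used locations (`timm.models.layers` and `torch.nn.init`) and reports which one is importable. The helper name `find_trunc_normal` is hypothetical and not part of the repository.

```python
import importlib

# Candidate modules that are known to expose trunc_normal_.
# Which one the repository's fix actually uses is NOT stated in this thread;
# both entries are assumptions for illustration.
CANDIDATES = ("timm.models.layers", "torch.nn.init")


def find_trunc_normal():
    """Return (module_name, function) for the first importable trunc_normal_,
    or (None, None) if neither package is installed."""
    for module_name in CANDIDATES:
        try:
            module = importlib.import_module(module_name)
        except Exception:
            # Package missing or broken in this environment; try the next one.
            continue
        fn = getattr(module, "trunc_normal_", None)
        if callable(fn):
            return module_name, fn
    return None, None


source, trunc_normal_ = find_trunc_normal()
if trunc_normal_ is None:
    print("trunc_normal_ not found; install timm or PyTorch first")
else:
    print(f"trunc_normal_ is available from {source}")
```

If one of the two is reported as available, a plain `from <module> import trunc_normal_` at the top of scn_largekernel.py should resolve the name. Checking the linked commit remains the authoritative way to see what the author changed.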

fjzpcmj commented 1 year ago

@yukang2017 Dear Author, thanks for your reply. I am training the detection model with the config "nusc_centerpoint_voxelnet_0075voxel_fix_bn_z_largekernel3d_large.py" on two V100 GPUs. The conv type is 'spatialgroupconvv2'. It looks like training for 20 epochs will take about 12 days. With four V100 GPUs, the estimated time is still 12 days or more. Is this normal? It would be very helpful if you could share your training logs.

Here is my training log with 2 GPUs:
2023-05-08 19:00:44,415 - INFO - Epoch [1/20][510/30895] lr: 0.00010, eta: 11 days, 17:20:45, time: 1.361, data_time: 0.121, transfer_time: 0.019, forward_time: 0.327, loss_parse_time: 0.001 memory: 7336,

Here is my training log with 4 GPUs:
2023-05-09 10:22:58,283 - INFO - Epoch [1/20][100/15448] lr: 0.00010, eta: 14 days, 20:59:34, time: 3.149, data_time: 0.254, transfer_time: 0.015, forward_time: 0.333, loss_parse_time: 0.000 memory: 6913,
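
A rough way to compare the two runs is to multiply the per-iteration time by the total number of iterations and to check how much of each iteration the logged components explain. The sketch below does this for the two log lines above. It assumes that `time` is the wall-clock seconds per iteration and that the four `*_time` fields are parts of it; the helper name `summarize` is hypothetical. The projection is a single-sample estimate, so it will not match the logged `eta` exactly.

```python
import re

# The two log lines quoted above (timestamps omitted).
LOG_2GPU = ("Epoch [1/20][510/30895] lr: 0.00010, eta: 11 days, 17:20:45, "
            "time: 1.361, data_time: 0.121, transfer_time: 0.019, "
            "forward_time: 0.327, loss_parse_time: 0.001 memory: 7336,")
LOG_4GPU = ("Epoch [1/20][100/15448] lr: 0.00010, eta: 14 days, 20:59:34, "
            "time: 3.149, data_time: 0.254, transfer_time: 0.015, "
            "forward_time: 0.333, loss_parse_time: 0.000 memory: 6913,")


def summarize(line):
    """Project total training time and split one iteration into logged parts."""
    m = re.search(r"Epoch \[\d+/(\d+)\]\[\d+/(\d+)\]", line)
    epochs, iters_per_epoch = int(m.group(1)), int(m.group(2))
    # Collect "time" and every "*_time" field as floats.
    fields = {k: float(v) for k, v in re.findall(r"(\w*time): ([\d.]+)", line)}
    per_iter = fields.pop("time")
    logged = sum(fields.values())  # data + transfer + forward + loss_parse
    return {
        "projected_days": per_iter * iters_per_epoch * epochs / 86400,
        "logged": logged,
        # Assumption: the remainder is work the log does not break out,
        # for example the backward pass and gradient synchronization.
        "unaccounted": per_iter - logged,
    }


for name, line in (("2 GPUs", LOG_2GPU), ("4 GPUs", LOG_4GPU)):
    s = summarize(line)
    print(f"{name}: ~{s['projected_days']:.1f} days projected, "
          f"{s['logged']:.3f}s logged, {s['unaccounted']:.3f}s unaccounted per iteration")
```

Going from 2 to 4 GPUs halves the iterations per epoch (30895 to 15448), but the per-iteration time rises from 1.361 s to 3.149 s while forward_time barely changes (0.327 s vs 0.333 s). Most of the extra time therefore falls outside the logged components, which is where to look first when the run does not speed up.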