some questions about training

MCG-NJU / LinK

[CVPR 2023] LinK: Linear Kernel for LiDAR-based 3D Perception

MIT License


Hi,

Thank you for the great work!

I tried to run LinK with pcdet by modifying the SpMiddleResNetFHDELKv3 model. It works, but a few questions came up while debugging.

1. Function small_to_large_v2 (line 84 in ts_elk.py): F.spdevoxelize takes only 3 arguments. Can I change new_feat = F.spdevoxelize(f, idx_query, weights, kernel_size) to new_feat = F.spdevoxelize(f, idx_query, weights)?
2. I got NaN during training. While debugging, I found a division by zero in small_to_large_v2: new_feat = new_feat[:,:-1] / (new_feat[:,-1:]). Can I add a small value (e.g. 1e-8) to the denominator?
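
For reference, the guard asked about in question 2 could be sketched roughly as below. This is only an illustration on a made-up tensor, not the code in ts_elk.py: the helper name normalize_by_weight is hypothetical, and it assumes the last column of new_feat holds a non-negative accumulated weight or count. Note that such a guard only hides zero denominators; it does not explain why they occur.

```python
import torch


def normalize_by_weight(new_feat: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Divide the feature columns by the last column, guarding against zeros.

    Hypothetical helper: assumes the last column is a non-negative weight sum.
    """
    feats, weight_sum = new_feat[:, :-1], new_feat[:, -1:]
    # Clamp instead of adding eps so that non-zero denominators stay unchanged.
    return feats / weight_sum.clamp(min=eps)


# Made-up input: the second row received no contributions, so its weight is 0.
new_feat = torch.tensor([[2.0, 4.0, 2.0],
                         [0.0, 0.0, 0.0]])

unguarded = new_feat[:, :-1] / new_feat[:, -1:]  # second row becomes NaN (0 / 0)
guarded = normalize_by_weight(new_feat)          # second row stays finite
```

Clamping rather than adding 1e-8 keeps rows with a valid denominator numerically identical to the original expression.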

After training finished, I found that LinK was only slightly better (about 1 mAP) than CenterPoint (3D spconv). Is this normal? Could the changes I made above be causing the poor performance?

Thanks for your attention. F.spdevoxelize() takes 4 arguments in our modified torchsparse library, named torchsparse-u. Please refer to https://github.com/MCG-NJU/LinK/blob/aa2ff1d5be3516253530e8b0200796416ca9e734/detection/torchsparse-u/torchsparse/nn/functional/devoxelize.py#L96 .

So, if you do not pass the kernel size to F.spdevoxelize, it will apply a default $r=2$, which possibly does not match the kernel size you set in https://github.com/MCG-NJU/LinK/blob/aa2ff1d5be3516253530e8b0200796416ca9e734/detection/det3d/models/utils/ts_elk.py#L87

I guess this may also account for your question 2 and the performance issue.
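
One way to catch this kind of mismatch early is to check how many positional arguments the installed devoxelize function accepts before training. The sketch below is only an illustration: stock_spdevoxelize and modified_spdevoxelize are hypothetical stand-ins, not the real torchsparse or torchsparse-u functions, and the parameter name kernel_size in the stand-in is an assumption. It also assumes the real function is a plain Python function that inspect.signature can read.

```python
import inspect


def positional_arity(fn) -> int:
    """Count the parameters of `fn` that can be passed positionally."""
    kinds = (inspect.Parameter.POSITIONAL_ONLY,
             inspect.Parameter.POSITIONAL_OR_KEYWORD)
    return sum(p.kind in kinds for p in inspect.signature(fn).parameters.values())


def check_devoxelize(fn, expected_args: int = 4) -> None:
    """Raise if `fn` cannot take the kernel size as a fourth argument."""
    n = positional_arity(fn)
    if n < expected_args:
        raise RuntimeError(
            f"{fn.__name__} accepts {n} positional arguments, expected "
            f"{expected_args}; the modified library may not be installed."
        )


# Hypothetical stand-ins for the two variants discussed in this thread.
def stock_spdevoxelize(feats, idx_query, weights):
    return feats


def modified_spdevoxelize(feats, idx_query, weights, kernel_size=2):
    return feats


check_devoxelize(modified_spdevoxelize)  # passes silently
try:
    check_devoxelize(stock_spdevoxelize)
except RuntimeError as err:
    message = str(err)  # reports that only 3 positional arguments are accepted
```

In a real setup, the check would be called once on F.spdevoxelize at import time, so a wrong library version fails loudly instead of silently changing the result.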

If you have more questions, please let me know.
