mit-han-lab / torchsparse

[MICRO'23, MLSys'22] TorchSparse: Efficient Training and Inference Framework for Sparse Convolution on GPUs.
https://torchsparse.mit.edu
MIT License

Help with SparseResNet21D #246

Closed: alre5639 closed this issue 7 months ago

alre5639 commented 9 months ago

Hello, I would like to use TorchSparse++ to construct a backbone that generates point cloud features during training, which I will then feed into various heads for different tasks. From what I can tell, I can use the built-in SparseResNet21D backbone to do this out of the box, but I have a couple of questions:

  1. What are the SparseTensor feats used as inputs to the model? It makes sense that the output of the backbone will have features, but since I only have xyz and intensity from my LiDAR measurements, do those simply become my input feats?
  2. Why are there multiple outputs after calling model(input)? I am expecting essentially voxels with associated feature vectors as the output of the backbone, but I get 5 different outputs after the forward() call, each of which does seem to be shaped the way I described. Is this just providing the feature vectors at each stage of the network? If so, do I just use output[4]?
zhijian-liu commented 9 months ago
  1. Yes, the input features should be the coordinates and intensities.
  2. Yes, your understanding is correct. The outputs are multi-scale features from successive stages of the network.
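
For anyone who finds this later, here is a minimal end-to-end sketch of both answers. It assumes torchsparse 2.x conventions (sparse_quantize from torchsparse.utils.quantize, SparseTensor(feats, coords), and SparseResNet21D from torchsparse.backbones); the voxel size, the batch-first coordinate layout, and the in_channels argument are illustrative assumptions, so verify them against your installed version:

```python
import numpy as np
import torch

from torchsparse import SparseTensor
from torchsparse.backbones import SparseResNet21D
from torchsparse.utils.quantize import sparse_quantize

device = "cuda"  # torchsparse's sparse conv kernels run on the GPU

# Stand-in LiDAR scan: N points with (x, y, z, intensity).
points = np.random.uniform(-50.0, 50.0, size=(10000, 4)).astype(np.float32)

# Voxelize the xyz coordinates, keeping one point per occupied voxel.
# voxel_size = 0.2 is an illustrative choice, not a recommendation.
coords, indices = sparse_quantize(points[:, :3], voxel_size=0.2,
                                  return_index=True)

# Answer 1 above: the input feats are just the raw measurements of the
# surviving points, i.e. (x, y, z, intensity) -> 4 input channels.
feats = torch.from_numpy(points[indices])

# Attach a batch index to the integer voxel coordinates. The column order
# (batch-first is assumed here) has changed across torchsparse versions,
# so check it against your release.
coords = torch.from_numpy(coords).int()
batch = torch.zeros((coords.shape[0], 1), dtype=torch.int32)
coords = torch.cat([batch, coords], dim=1)

input = SparseTensor(feats=feats, coords=coords).to(device)

# in_channels=4 matches the (x, y, z, intensity) feats above.
model = SparseResNet21D(in_channels=4).to(device).eval()

with torch.no_grad():
    outputs = model(input)

# Answer 2 above: one SparseTensor per stage, spatially coarser as you go
# deeper; the last entry is the final stage you would feed into a task head.
for i, out in enumerate(outputs):
    print(i, tuple(out.coords.shape), tuple(out.feats.shape))
```

Each element of outputs is itself a SparseTensor, so a task head can consume the last one directly, or combine several scales if the task benefits from finer resolution.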