Lizhuoling / VoxelFormer-public

This is the official implementation of the paper VoxelFormer. Code will come soon.
Apache License 2.0

Speed and parameters about the model #1

Open zen-d opened 1 year ago

zen-d commented 1 year ago

@Lizhuoling Hi, thanks for your great work. Regarding Table 4, I wonder why VoxelFormer looks more lightweight than BEVFormer. If all other architectural specifications are aligned between the two, VoxelFormer additionally has a depth prediction net (borrowed from BEVDepth), so I would expect the inference time and parameter count to increase slightly. Correct me if I am missing something.

Lizhuoling commented 1 year ago

Thanks for your question. First, the depth head is much lighter than the one in BEVDepth. Second, there are other differences between BEVFormer and VoxelFormer. For example, BEVFormer samples features from 4 FPN levels, while we use only 1 level. In addition, BEVFormer needs to predict sampling offsets, while VoxelFormer does not. Because of these differences, VoxelFormer runs faster than BEVFormer and has fewer parameters.
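
To give a rough feel for why these two differences matter, here is a back-of-the-envelope sketch that counts the parameters of the per-query prediction layers in a deformable-attention-style module. It is not taken from the VoxelFormer or BEVFormer code. The layer layout (one linear layer for 2-D sampling offsets, one for attention weights) follows the common Deformable DETR design, and the sizes `EMBED_DIMS`, `NUM_HEADS`, and `NUM_POINTS` are illustrative assumptions, as is the idea that the single-level variant keeps an attention-weight layer at all.

```python
def linear_params(in_features: int, out_features: int) -> int:
    """Weights plus biases of one fully connected layer."""
    return in_features * out_features + out_features


def offset_head_params(embed_dims: int, num_heads: int,
                       num_levels: int, num_points: int) -> int:
    """Layer predicting one 2-D offset per head, feature level, and sampling point."""
    return linear_params(embed_dims, num_heads * num_levels * num_points * 2)


def weight_head_params(embed_dims: int, num_heads: int,
                       num_levels: int, num_points: int) -> int:
    """Layer predicting one attention weight per head, feature level, and sampling point."""
    return linear_params(embed_dims, num_heads * num_levels * num_points)


# Illustrative sizes only (assumptions, not values from either paper).
EMBED_DIMS, NUM_HEADS, NUM_POINTS = 256, 8, 4

# Variant A: 4 feature levels, offsets are predicted.
four_levels_with_offsets = (
    offset_head_params(EMBED_DIMS, NUM_HEADS, 4, NUM_POINTS)
    + weight_head_params(EMBED_DIMS, NUM_HEADS, 4, NUM_POINTS)
)

# Variant B: 1 feature level, no offset prediction.
one_level_no_offsets = weight_head_params(EMBED_DIMS, NUM_HEADS, 1, NUM_POINTS)

print("4 levels + offsets:", four_levels_with_offsets)
print("1 level, no offsets:", one_level_no_offsets)
```

This counts only the prediction layers of a single attention module. Sampling from fewer levels also reduces the number of feature lookups per query, which affects speed but is not captured by a parameter count.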