Congratulations and Request for Additional Information on Your Research

NotACracker / COTR

[CVPR24] COTR: Compact Occupancy TRansformer for Vision-based 3D Occupancy Prediction

Apache License 2.0


Congratulations and Request for Additional Information on Your Research #3

Open SPA-junghokim opened 4 months ago

SPA-junghokim commented 4 months ago

I want to extend my heartfelt thanks for sharing your outstanding research with us. Congratulations on your acceptance to CVPR 2024; that’s an impressive achievement!

Inspired by your excellent work, we are eager to contribute to occupancy prediction research and are planning some experiments in this area. To assist with our project, I have posted an issue on your repository. Could you possibly share the performance details of each model configuration you’ve posted? Additionally, would it be possible for you to share the checkpoints as well?

Thank you for your time and assistance.

Best regards, Jungho Kim

SPA-junghokim commented 4 months ago

A further question.

_dim_ is much larger than numC_Trans:

numC_Trans = 32
_dim_ = 256

_dim_ has considerably more channels than numC_Trans, which significantly increases the memory used by the voxel features. However, BEVdetocc has shown that it can perform the task with only a 32-channel dimension. Does using a channel dimension as high as 256 significantly affect performance? Are there any results from experiments that reduce _dim_ to 32?

NotACracker commented 3 months ago

> Further question.
>
> _dim_ is quite large than numC_Trans
> numC_Trans = 32
> _dim_ = 256
> _dim_ has a considerably higher number of channels compared to numC_Trans, which results in a significant increase in memory usage for voxels. However, BEVdetocc has shown that it is capable of performing tasks with just a 32-channel dimension. Does using up to a 256-channel dimension significantly affect performance? Are there any results from experiments conducted with reducing the _dim_ size to 32?

The 32-channel setting is used for the high-resolution OCC features (200×200×16), while the 256-channel setting suits the compact OCC features (50×50×16). We have not experimented with lower channel dimensions, but I believe a lower yet still effective channel dimension may exist for compact OCC features.
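
For readers weighing the memory question, the two settings mentioned in this reply can be compared with simple arithmetic. The sketch below is not taken from the COTR code. It assumes a single dense float32 feature volume per sample and ignores intermediate activations, attention maps, and batch size. The helper names `feature_elements` and `mebibytes` are hypothetical.

```python
def feature_elements(x, y, z, channels):
    """Number of scalar values in a dense voxel feature volume of size x*y*z with `channels` channels."""
    return x * y * z * channels


def mebibytes(n_elements, bytes_per_element=4):
    """Storage in MiB, assuming float32 (4 bytes per value) unless stated otherwise."""
    return n_elements * bytes_per_element / 2**20


# High-resolution OCC features: 200 x 200 x 16 voxels, 32 channels.
high_res = feature_elements(200, 200, 16, 32)

# Compact OCC features: 50 x 50 x 16 voxels, 256 channels.
compact = feature_elements(50, 50, 16, 256)

# Hypothetical variant asked about in the question: compact grid with 32 channels.
compact_32 = feature_elements(50, 50, 16, 32)

for name, n in [("high-res, 32 ch", high_res),
                ("compact, 256 ch", compact),
                ("compact, 32 ch", compact_32)]:
    print(f"{name}: {n:,} values, {mebibytes(n):.2f} MiB")
```

Under these assumptions, the compact 256-channel volume holds half as many values as the high-resolution 32-channel volume, because the 16x reduction in voxel count outweighs the 8x increase in channels. This only counts the feature volume itself and says nothing about accuracy or total training memory.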

SPA-junghokim commented 2 months ago

To assist with our project, I have posted an issue on your repository. Could you possibly share the performance details of each model configuration you’ve posted? Additionally, would it be possible for you to share the checkpoints as well?

Thank you for your time and assistance.