Closed chinmay5 closed 11 months ago
@chinmay5 If you're using 3D conv for video, you can write a 3D version of our _get_active_ex_or_ii
function, e.g., by calling repeat_interleave three times, once each for H, W, and T. _cur_active
would then be a BCHWT-like binary tensor, which means you should also modify https://github.com/keyu-tian/SparK/blob/main/pretrain/spark.py#L81 to create a BCHWT-like binary mask.
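For anyone reading later, the suggestion above could be sketched roughly as follows. This is only an illustration, not code from the SparK repo: the names random_mask_3d and get_active_ex_or_ii_3d, the keep_ratio parameter, and passing the mask as an argument (rather than reading a module-level _cur_active) are all assumptions made for the example. It assumes a (B, 1, h, w, t) boolean mask, following the BCHWT layout mentioned above.

```python
import torch


def random_mask_3d(B, h, w, t, keep_ratio=0.4, device="cpu", generator=None):
    """Hypothetical 3D mask generator: returns a (B, 1, h, w, t) bool tensor.

    True marks an active (kept) cell; keep_ratio is an assumed parameter.
    """
    n = h * w * t
    len_keep = round(n * keep_ratio)
    # Random permutation per sample; keep the first len_keep positions.
    idx = torch.rand(B, n, generator=generator).argsort(dim=1)[:, :len_keep]
    idx = idx.to(device)
    mask = torch.zeros(B, n, dtype=torch.bool, device=device)
    rows = torch.arange(B, device=device).unsqueeze(1)
    mask[rows, idx] = True
    return mask.view(B, 1, h, w, t)


def get_active_ex_or_ii_3d(cur_active, H, W, T, returning_active_ex=True):
    """Hypothetical 3D analogue of _get_active_ex_or_ii.

    Upsamples the coarse mask to (B, 1, H, W, T) by calling
    repeat_interleave once per H, W, and T dimension.
    Assumes H, W, T are integer multiples of the mask's h, w, t.
    """
    h_rep = H // cur_active.shape[2]
    w_rep = W // cur_active.shape[3]
    t_rep = T // cur_active.shape[4]
    active_ex = (
        cur_active.repeat_interleave(h_rep, dim=2)
        .repeat_interleave(w_rep, dim=3)
        .repeat_interleave(t_rep, dim=4)
    )
    if returning_active_ex:
        return active_ex  # (B, 1, H, W, T) mask, broadcastable over channels
    # Otherwise return index tuples (b, h, w, t) of the active positions.
    return active_ex.squeeze(1).nonzero(as_tuple=True)
```

A feature map x of shape (B, C, H, W, T) could then be masked with x * get_active_ex_or_ii_3d(mask, H, W, T), since the singleton channel dimension broadcasts over C. Note that torch.nn.Conv3d itself treats the last three dimensions generically, so the H/W/T ordering is just a convention here.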
If it is for 3D point cloud or 3D sparse voxel processing, I would recommend using the sparse conv from https://github.com/NVIDIA/MinkowskiEngine rather than ours, for efficiency.
Thank you so much for your input
Thank you so much for sharing the code base. I was wondering how to apply 3d convolution using the setup. I think we need to update the section
However, I am not certain. It would be great to know your opinion. Any suggestions about things to keep in mind when implementing 3D conv would be highly appreciated.