Number1JT opened this issue 3 years ago
Hi there,
We have updated the README and it has clearer instructions now. We will add pretrained models soon. @HRLTY
@afathi3, @HRLTY, any update on pretrained models?
We still plan to add pretrained models soon, hopefully within a month. Sorry for the inconvenience.
Can I use it for 3D image classification, such as the Keras example '3D Image Classification from CT Scans' (https://keras.io/examples/vision/3D_image_classification/)? Hopefully you could give more examples showing how to use it.
@xuzhang5788 this is an interesting question. We have never tried it on 3D medical images, but in theory it should work if you can turn such data into a sparse representation. Otherwise, dense 3D convolutions might be more appropriate.
@afathi3 Thank you for your fast response. My 3D data is very sparse: 24x24x24 with 31 channels. But I don't know how to turn the data into a sparse representation. Do you have some examples? Where can I find the released code that reproduces the results of the experiments in your paper, such as on the ModelNet-40 dataset? Also, the paper is from 2017; I am curious why you released the code only now. Many thanks.
@xuzhang5788 I am sorry, but I am not sure we mean the same thing by sparse. What I mean is that most of the voxels in your 3D grid are empty, so you can afford to keep only a small set of voxels. Given that your grid is only 24x24x24, you can easily fit a dense 3D network like S3D or I3D in memory. I am not sure which paper is from 2017, but this codebase contains the code that we have used in our recent papers. Please refer to DOPS: Learning to detect 3D objects and predict their 3D shapes.
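To put the memory concern in perspective, here is a quick back-of-envelope sketch (a hypothetical calculation using the grid size from this thread and assuming float32 activations, not a measurement of any actual model):

```python
# Back-of-envelope memory estimate for one dense input sample of the
# 24x24x24 grid with 31 channels discussed above (float32 assumed).
voxels = 24 * 24 * 24          # 13,824 voxels per sample
channels = 31
bytes_per_value = 4            # float32
input_bytes = voxels * channels * bytes_per_value
print(f"{input_bytes / 2**20:.2f} MiB per sample")  # ~1.63 MiB
```

At well under 2 MiB per sample, the input itself is tiny; intermediate activations of a dense 3D network grow with the number of filters per layer, but at this grid size they typically remain manageable on a single GPU.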
@afathi3 Thank you so much for your response. Do you mean that a dense 3D network is normally the better choice if I can fit my model in memory? Although I only have a 24x24x24 grid, I have 31 channels. I would like a bigger cube if possible, but I am worried about running out of memory. Maybe I can fit the model in memory, but then I would have to use a small batch size to train it. I am dealing with chemical molecules, so most of the values on the grid are zero. That is why I am interested in your work.
I found your work from the blog "https://www.analyticsvidhya.com/blog/2021/02/introduction-to-tensorflow-3d-for-3d-scene-understanding-by-google-ai/". It mentioned the paper "Submanifold Sparse Convolutional Networks". I mistakenly thought that was your paper. Sorry about that.
Thanks for your great work on tf3d! Could you please provide the visualization scripts for detection and semantic/instance segmentation on the datasets you tested (Waymo Open Dataset, ScanNet, RIO)? Many thanks!
@xuzhang5788 If you have voxels that are empty, then this model should be appropriate. However, if you have data in all voxels but some of the feature dimensions are 0, then this is not a good fit for you. If you can fit everything in memory, a dense model could be better, since information can propagate more easily across the grid. This is the function for turning the point cloud into a sparse voxel grid.
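For readers wondering what a "sparse representation" looks like concretely, the idea sketched below is a minimal NumPy illustration of the dense-to-sparse conversion discussed in this thread: keep only occupied voxels as a (coordinates, features) pair. This is an assumption-laden sketch, not the actual tf3d conversion function.

```python
import numpy as np

def dense_to_sparse(grid: np.ndarray):
    """Convert a dense voxel grid (D, H, W, C) into (coords, features).

    Illustrative sketch only: sparse-convolution libraries typically expect
    an (N, 3) integer coordinate array plus an (N, C) feature array, where N
    is the number of occupied voxels.
    """
    # A voxel is "occupied" if any of its channels is nonzero.
    occupied = np.any(grid != 0, axis=-1)   # (D, H, W) boolean mask
    coords = np.argwhere(occupied)          # (N, 3) voxel indices
    features = grid[occupied]               # (N, C) per-voxel features
    return coords, features

# Example: a mostly empty 24x24x24 grid with 31 channels, as in the thread.
grid = np.zeros((24, 24, 24, 31), dtype=np.float32)
grid[0, 1, 2, :] = 1.0
grid[5, 5, 5, 0] = 3.0
coords, features = dense_to_sparse(grid)
print(coords.shape, features.shape)  # (2, 3) (2, 31)
```

With only 2 occupied voxels out of 13,824, the sparse pair stores roughly 0.01% of the dense grid's values, which is why sparse convolutions pay off on data like this.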
@KangchengLiu I hope to release checkpoints for some of the datasets soon (3-4 weeks from now), together with code that runs those checkpoints and visualizes the results.
@afathi3 Thank you so much for your great advice.
Can you provide a pretrained model on the Waymo dataset and some guidance for running the inference code? Although there is a README for using the sparse conv ops, I am still stuck running the inference code. I want to reproduce the reported 12 ms inference time. @HRLTY