QingyongHu / RandLA-Net

🔥RandLA-Net in Tensorflow (CVPR 2020, Oral & IEEE TPAMI 2021)

Model Inference time #65

Open · abhigoku10 opened this issue 4 years ago

abhigoku10 commented 4 years ago

@QingyongHu Hi, thanks for open-sourcing the code. I have a few queries:

1. I am not able to reproduce the 23 FPS inference speed mentioned in the paper, so I tried a couple of things you mentioned in previous comments:

   1. Removing the voting feature, i.e., I tried the evaluate function, but I still did not get the reported timing.
   2. Can you please share the method or process used to obtain the inference speed reported in the paper?
   3. You mentioned in the paper that you do not use any pre-processing or post-processing methods, but data_prepare uses grid sampling. Grid sampling during inference is also a pre-processing step; did you include it in your timing calculations?
   4. What is the use of the variable `num_per_epoch = int(len(self.test_list) / cfg.val_batch_size)`, with `cfg.val_batch_size = 4`? (See the sketch after this list.)

Thanks in advance.
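For context on query 4, a minimal sketch of what that expression computes, with illustrative values (the variable names mirror the question; the numbers are hypothetical, not from the repo):

```python
# Sketch of what num_per_epoch computes for the test/validation loop.
# Values are illustrative; in the repo they come from cfg and the dataset.
test_list = ["scan_%03d.npy" % i for i in range(100)]  # hypothetical list of test scans
val_batch_size = 4                                     # cfg.val_batch_size
num_per_epoch = int(len(test_list) / val_batch_size)   # batches needed to cover the list once
print(num_per_epoch)                                   # -> 25 test steps per epoch
```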
QingyongHu commented 4 years ago

Hi @abhigoku10, sorry for the late response, as I have been very busy these days.

For questions 1-3: First, unlike fixed-size images, the number of points in each scan differs, so it is not straightforward to evaluate FPS directly, since the number of points in each batch varies. To this end, in the paper we feed the same number of points (i.e., 81920) from each scan into each network. The averaged 23 frames per second is therefore a rough but reasonable indication of RandLA-Net's speed relative to other approaches. The released code was used to generate the results submitted to the SemanticKITTI benchmark; for that purpose each point is evaluated multiple times and a voting scheme is used to get a better result. However, the voting scheme is definitely not mandatory in practice. You are free to use the trained model to infer the whole scan, in which case no voting scheme is needed, since our network has good scalability; it is not difficult to implement. In addition, in test mode it takes less than 1 second to process val_batch_size (20) × num_points (4096 × 11) on an RTX 2080 Ti GPU.
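For concreteness, a minimal sketch of this fixed-size timing protocol. The names here are assumptions, not the repo's actual API: `sess` is a TF1-style session over the trained model, `logits` its output tensor, and `points_pl` a `[1, 81920, 3]` placeholder.

```python
import time
import numpy as np

NUM_POINTS = 81920  # fixed number of points fed per scan, as in the paper

def measure_fps(sess, logits, points_pl, scans, warmup=5):
    # Warm-up runs so one-off GPU/graph initialization is excluded from the timing
    for scan in scans[:warmup]:
        idx = np.random.choice(len(scan), NUM_POINTS, replace=True)
        sess.run(logits, feed_dict={points_pl: scan[idx][None, ...]})
    # Timed loop: one forward pass per scan, each with exactly NUM_POINTS points
    start = time.time()
    for scan in scans:
        idx = np.random.choice(len(scan), NUM_POINTS, replace=True)
        sess.run(logits, feed_dict={points_pl: scan[idx][None, ...]})
    return len(scans) / (time.time() - start)  # frames (scans) per second
```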
(4) Again, grid_sampling is not mandatory for our framework; you can feed the raw point clouds directly to our network. To clarify: in order to generate data for training we do need to build a KD-tree to produce training batches, but the network can infer the entire point cloud at test time, so building a tree is also not mandatory in practice.
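For reference, a simplified numpy version of the grid subsampling used in data_prepare. This is a sketch only: it keeps one point per occupied voxel, whereas the repo's compiled grid_sub_sampling routine averages the points in each voxel.

```python
import numpy as np

# Simplified voxel-grid subsampling: keep the first point in each occupied voxel.
# The repo's actual routine (cpp_wrappers) averages coordinates/features per voxel.
def grid_subsample(points, grid_size=0.06):
    voxel_idx = np.floor(points / grid_size).astype(np.int64)   # integer voxel coordinates
    _, keep = np.unique(voxel_idx, axis=0, return_index=True)   # one index per unique voxel
    return points[np.sort(keep)]
```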

Please also refer to issues #23 and #11.
abhigoku10 commented 4 years ago

@QingyongHu Thanks for your detailed response. Following up, I have a few more queries:

Q1. "we feed the same number of points (i.e., 81920) from each scan into each network" — If I understand correctly, for SemanticKITTI each frame has ~1×10^6 points, and most of the time 81920 points remain after grid subsampling with grid size 0.06?

Q2. "each point is evaluated multiple times and a voting scheme is used to get a better result. However, the voting scheme is definitely not mandatory in practice." — Given a frame with batch size = 1, the frame gets evaluated multiple times, i.e., 4 times (from the num_per_epoch variable value). To remove the voting scheme, I simply started using the def evaluate(self, dataset): function in RandLANet.py; is that the right approach?

Q3. "You are free to use the trained model to infer the whole scan ... it is not difficult to implement." — So you are suggesting we can build our own inference code using only the model, right?

Q4. "in test mode it takes less than 1 second to process val_batch_size (20) × num_points (4096 × 11) on an RTX 2080 Ti GPU." — So you are getting roughly 23 FPS with those settings, right? With val_batch_size (1) and num_points (4096 × 11) on an RTX 2080 Ti GPU I got 3.6 s per scan, and the model testing time for test() was 110.66 s for 200 steps. How can I reach 23 FPS with the current code? (See the sanity check below.)

Q5. With the current code, during inference of one frame the inference() function calls spatial_gen() at every step of the epoch, which also makes model inference slow. How can I remove this dependency?
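For reference, a quick sanity check of the arithmetic in Q4, using only the figures quoted in this thread:

```python
# Throughput implied by the author's setting: a batch of 20 scans in under 1 s.
author_fps = 20 / 1.0    # > 20 scans/s, roughly consistent with the paper's 23 FPS
# Throughput measured above: batch size 1, 3.6 s per scan.
observed_fps = 1 / 3.6   # ~0.28 scans/s; batch size 1 leaves the GPU underutilized
print(author_fps, observed_fps)
```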

Thanks in advance.