charlesq34 / pointnet2

PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space

URGENT: how to deal with a dataset with a variable number of points in each object? #28

Open Omar-Tag opened 6 years ago

Omar-Tag commented 6 years ago

I have a dataset with a variable number of points in each object. What do I need to do to train the classification network? Does PointNet++ support variable-length input, or do I have to preprocess my dataset to unify the number of points?

bw4sz commented 6 years ago

@Omar-Tag can you share your preprocessing pipeline? I'm probing the structure right now, trying to get a sense of the steps needed to match the desired inputs.

ZeweiXu commented 6 years ago

@bw4sz Any progress with your work? I am also curious how you deal with a variable number of points per object. Thank you.

Omar-Tag commented 6 years ago

@bw4sz @ZeweiXu I have managed to set the number of points to 500 (decided by trial and error). I wrote some simple scripts: for smaller objects I add sampled points until they reach 500, by inserting a point between each point and its nearest neighbour; for bigger objects I subsample by keeping one point and skipping the next N points, and so on, where N = 1, 2, 3... This doesn't affect the general shape of the objects. I applied it to objects extracted from the KITTI benchmark and trained PointNet on them. I hope this helps, and I'd be interested in any suggestions.

pyni commented 5 years ago

+1

scosar commented 5 years ago

@Omar-Tag can you give some details on how you resampled your data to a fixed number of points?

I am trying to train the object classification network on the KITTI dataset. In the Point Cloud Library, I found some code to create a mesh and sample from it, which increases the number of points. However, the number of points is still not fixed; it varies with the structure of the object.

Omar-Tag commented 5 years ago

@scosar check my comment above.

You should create a script for preprocessing the dataset. For more detail: in my approach I first cropped the objects from the scene using the bounding-box annotations provided by the dataset, then:

1 - For objects with fewer points than my desired number (500), I computed the distance matrix over all points of the object, then looped over each point, found its nearest point (minimum distance), and added a new point at the midpoint between the two: simply [(x1 + x2)/2, (y1 + y2)/2, (z1 + z2)/2].

2 - For objects with more than 500 points, I looped over all the points and removed one point every R points, where R is the ratio of the number of points to the desired number, repeating until I reached the desired count.
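The two steps above can be sketched roughly as follows with NumPy. This is my own hedged reading of the recipe, not Omar-Tag's actual script: the function names are made up, and the downsampling step is simplified to an evenly spaced index subset, which is equivalent in effect to dropping every R-th point until the target count is reached.

```python
import numpy as np

def upsample_to(points, target):
    """Step 1 (sketch): grow the cloud by inserting midpoints between
    each point and its nearest neighbour until `target` points exist."""
    pts = points.copy()
    while pts.shape[0] < target:
        # pairwise distance matrix; mask the diagonal so a point
        # is never its own nearest neighbour
        d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
        np.fill_diagonal(d, np.inf)
        nn = np.argmin(d, axis=1)
        midpoints = (pts + pts[nn]) / 2.0
        # only add as many midpoints as still needed
        need = target - pts.shape[0]
        pts = np.vstack([pts, midpoints[:need]])
    return pts

def downsample_to(points, target):
    """Step 2 (simplified): keep an evenly spaced subset of the points,
    which approximates removing one point every R points."""
    idx = np.linspace(0, points.shape[0] - 1, target).astype(int)
    return points[idx]
```

Both functions assume an (N, 3) array and return exactly `target` points; the O(N²) distance matrix in the upsampling step is fine for objects of a few thousand points but would need a KD-tree for larger clouds.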

ShiQiu0419 commented 5 years ago

I am not sure whether interpolation is a good way to do this; the linear relationship between the original points and the interpolated points might affect performance.

One preprocessing option is to extract a subset point cloud from the original one, i.e., downsample to a suitable number of points. For example, you can use Farthest Point Sampling to extract a subset with the expected number of points. I guess that is how the author created the ShapeNet Part HDF5 dataset used as the training set for the part segmentation task in PointNet: the original ShapeNet dataset has more than 17000 models with a different number of points per model, while the provided HDF5 dataset has only 16000 models with a fixed 2048 points each. I suppose the models with fewer than 2048 points were removed.
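For reference, Farthest Point Sampling is simple to sketch in plain NumPy (the repo itself ships a CUDA op for this; the version below is just an illustrative CPU sketch with a hypothetical function name):

```python
import numpy as np

def farthest_point_sampling(points, n_samples):
    """Greedy FPS: repeatedly pick the point farthest from the
    already-selected set. `points` is an (N, 3) array, N >= n_samples."""
    n = points.shape[0]
    selected = np.zeros(n_samples, dtype=np.int64)
    # distance from every point to its nearest selected point so far
    dist = np.full(n, np.inf)
    # start from an arbitrary seed point (index 0)
    selected[0] = 0
    for i in range(1, n_samples):
        d = np.linalg.norm(points - points[selected[i - 1]], axis=1)
        dist = np.minimum(dist, d)
        selected[i] = np.argmax(dist)
    return points[selected]
```

Unlike midpoint interpolation, FPS only ever returns points that exist in the original cloud, and it tends to preserve the object's extremities, which is why it is the sampling layer used inside PointNet++ itself.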