Open christenbc opened 5 years ago
I ask the sensor_msgs/PointCloud2
for the point that lies at the (x, y) center of the object's bounding box, which returns a point with (x, y, z) coordinates that I can use to create a tf. This is done in this line.
This has the caveat that if your object has a hollow center, e.g. it's a donut, or maybe a person in a strange stance (like this), then the selected point doesn't reflect the position of the actual object.
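For illustration, the center-pixel lookup can be sketched like this. This is a minimal sketch, assuming the cloud has already been unpacked from a sensor_msgs/PointCloud2 into an organized (H, W, 3) NumPy array of XYZ values; the `center_point` helper and the array layout are assumptions, not the repository's actual code.

```python
import numpy as np

def center_point(cloud, bbox):
    """Return the 3D point under the bounding box center.

    cloud: organized point cloud as an (H, W, 3) array of XYZ values
           (hypothetical layout; a real sensor_msgs/PointCloud2 would
           be unpacked into this form first).
    bbox:  (x_min, y_min, x_max, y_max) in pixel coordinates.
    """
    x_min, y_min, x_max, y_max = bbox
    u = (x_min + x_max) // 2   # center column
    v = (y_min + y_max) // 2   # center row
    return cloud[v, u]

# Toy 4x4 "cloud": every point at depth 2.0 m
cloud = np.zeros((4, 4, 3))
cloud[..., 2] = 2.0
print(center_point(cloud, (0, 0, 3, 3)))  # -> [0. 0. 2.]
```

If the center pixel lands on a hollow region (the donut case above), this returns whatever is behind the hole, which is exactly the caveat being described.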
In the line I referenced, it's possible to ask for more points from the point cloud and take their mean in order to place the object in 3D space. The problem is that if there are other things inside the object's bounding box, many of the sampled points may not come from the detected object. You could have a wall 6 feet behind the object, and most points would be sampled from the wall, placing the object further back than it actually is.
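The wall-contamination problem is easy to reproduce with a toy example. A minimal sketch, with made-up depths and a hypothetical organized-cloud layout:

```python
import numpy as np

# Toy organized cloud: 10x10 image, object at depth 1.0 m in a small
# patch, wall at depth 3.0 m everywhere else (hypothetical numbers).
cloud = np.zeros((10, 10, 3))
cloud[..., 2] = 3.0            # wall fills the frame
cloud[4:6, 4:6, 2] = 1.0       # object covers only 4 pixels

bbox = (2, 2, 8, 8)            # bounding box spans object + wall
x_min, y_min, x_max, y_max = bbox
patch = cloud[y_min:y_max, x_min:x_max].reshape(-1, 3)

# 32 of the 36 points come from the wall, so the mean depth is
# pulled far behind the object: (4*1.0 + 32*3.0) / 36 ~= 2.78 m
mean_depth = patch[:, 2].mean()
print(round(mean_depth, 2))  # -> 2.78, not the object's 1.0 m
```

The averaged position ends up nearly two meters behind the object, which motivates the clustering idea below.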
Maybe if we apply some clustering method to the returned points and only take the mean of the points from one cluster, either the largest or closest one...
Yeah, indeed, taking the central pixel is a rough but valid approach! Your proposal looks quite interesting; I would suggest Mean Shift clustering for this case. Thanks for the elaborate answers, Douglas.
Thanks for the suggestion. I believe DBSCAN may show interesting results too. Both algorithms are available in scikit-learn, and sooner or later I may try them. I'll use this issue to remind myself in the future.
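To sketch the clustering idea: here is a tiny, self-contained DBSCAN-style pass that groups the sampled points and averages only the closest cluster. This is illustrative only; in practice you would use `sklearn.cluster.DBSCAN` or `MeanShift` as discussed above, and the `eps`/`min_pts` values and the synthetic object/wall points are assumptions.

```python
import numpy as np

def dbscan(points, eps=0.2, min_pts=3):
    """Minimal DBSCAN sketch: returns a cluster label per point, -1 for
    noise. Not optimized; scikit-learn's version is the practical choice."""
    n = len(points)
    labels = np.full(n, -1)
    dist = np.linalg.norm(points[:, None] - points[None, :], axis=-1)
    neighbors = [np.flatnonzero(dist[i] <= eps) for i in range(n)]
    cluster = 0
    for i in range(n):
        if labels[i] != -1 or len(neighbors[i]) < min_pts:
            continue  # already clustered, or not a core point
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:  # grow the cluster from core point i
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster
                if len(neighbors[j]) >= min_pts:
                    queue.extend(neighbors[j])
        cluster += 1
    return labels

# Points sampled inside a bbox: a few from the object near 1 m,
# many more from a wall near 3 m (synthetic data).
rng = np.random.default_rng(0)
obj = rng.normal([0.0, 0.0, 1.0], 0.02, (10, 3))
wall = rng.normal([0.0, 0.0, 3.0], 0.02, (30, 3))
pts = np.vstack([obj, wall])

labels = dbscan(pts)
# Average only the *closest* cluster instead of every point in the bbox.
closest = min(set(labels) - {-1}, key=lambda c: pts[labels == c, 2].mean())
print(pts[labels == closest].mean(axis=0))  # approximately [0, 0, 1.0]
```

Picking the closest (or largest) cluster recovers the object's depth even though the wall contributes three times as many points, which is the failure mode of the plain mean.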
Are you taking the location of the point in the point cloud that corresponds to the center pixel of the bounding box? Or do you instead apply some mean over the center pixels? What method do you use?