jbehley / point_labeler

My awesome point cloud labeling tool
MIT License

How to distinguish between the ground and vehicle tire points #48

Closed Chuanyu-LD closed 2 years ago

Chuanyu-LD commented 2 years ago

Hello,

thanks for your outstanding work! I have used the tool and find it difficult to distinguish between ground points and vehicle tire points. Removing the ground by threshold is an option, but it is neither direct nor accurate, and rotating the view to tell them apart is hard.

Could you share your experience on how to distinguish between ground and vehicle tire points? Thank you!

Besides, I found that calib.txt plays a role even when I don't need images but am labeling multiple frames at the same time. Theoretically only pose.txt should matter in this case, but using a hand-made or wrong calib.txt makes the points look weird.

jbehley commented 2 years ago

Hi @Chuanyu-LD,

thanks for using our tool.

Be aware that we developed the tool with KITTI as the target dataset. Therefore, some design decisions are driven by the way KITTI represents poses and point clouds. We could have chosen a more general design, but I simply made it work for the KITTI dataset, and consequently some things are "hardcoded".

First, we follow the convention of the original KITTI dataset, which represents the poses as camera poses. The calib.txt is therefore needed to convert the camera poses into LiDAR poses: we have a hardcoded transformation that takes the Tr transform from the calibration file and applies it to get LiDAR poses:

https://github.com/jbehley/point_labeler/blob/bf22e6f255fe5c9f01979d2d670d0ac543ae6460/src/widget/KittiReader.cpp#L387-L392

Thus, if the Tr transformation is not the identity, you will get "strange" results. As I said, it just follows the convention of KITTI poses. (Feel free to change this bit if you have different requirements.)
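In code, that conversion looks roughly like this (a minimal sketch with Eigen, not the exact code from KittiReader.cpp; the function name is mine):

```cpp
#include <Eigen/Dense>

// Tr is the velodyne->camera transform from the "Tr" line of calib.txt,
// extended to a 4x4 homogeneous matrix; pose_cam is one camera pose.
Eigen::Matrix4d toLidarPose(const Eigen::Matrix4d& pose_cam,
                            const Eigen::Matrix4d& Tr) {
  // Express the camera pose in the LiDAR frame: T_lidar = Tr^-1 * T_cam * Tr.
  return Tr.inverse() * pose_cam * Tr;
}
```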

Second, the big secret of how to efficiently label the tires without labeling the ground below (at least this worked quite well most of the time):

(I intentionally chose the nuScenes data here to demonstrate that it also works with different data.)

  1. Threshold the ground such that the upper parts of the cars are easy to label: (screenshot)

  2. Then "filter" the car points (by using Ctrl + Left click on the car label): (screenshot)

  3. Remaining tire points are now often just a brush stroke: (screenshot)

And the final result:

(screenshot)

I usually labeled the cars first, then filtered the cars and labeled the ground.

For moving cars, the process is a bit more involved: 1. thresholding, 2. scan-by-scan tire labeling.

However, my grid-based "automatic ground removal" is only very basic and not that robust. There are nicer variants, like https://github.com/LimHyungTae/patchwork, which seem to work more reliably.
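If you're curious, the basic idea of such a grid-based filter is roughly the following (a minimal sketch, not the actual implementation in the tool; the cell size and height threshold are illustrative):

```cpp
#include <cmath>
#include <map>
#include <utility>
#include <vector>

struct Point { float x, y, z; };

// Returns one flag per point: true = likely ground. Bins points into a 2D
// grid, finds the lowest z per cell, and marks every point within
// height_thresh of that minimum as ground.
std::vector<bool> markGround(const std::vector<Point>& points,
                             float cell_size = 0.5f,
                             float height_thresh = 0.3f) {
  auto cell = [cell_size](const Point& p) {
    return std::make_pair(static_cast<int>(std::floor(p.x / cell_size)),
                          static_cast<int>(std::floor(p.y / cell_size)));
  };

  std::map<std::pair<int, int>, float> min_z;  // lowest z per grid cell
  for (const auto& p : points) {
    auto it = min_z.find(cell(p));
    if (it == min_z.end() || p.z < it->second) min_z[cell(p)] = p.z;
  }

  std::vector<bool> is_ground(points.size(), false);
  for (size_t i = 0; i < points.size(); ++i) {
    is_ground[i] = points[i].z < min_z.at(cell(points[i])) + height_thresh;
  }
  return is_ground;
}
```

A flat per-cell threshold like this fails on slopes and curbs; approaches like Patchwork fit planes per region instead, which is why they tend to be more robust.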

I hope that helps.

Chuanyu-LD commented 2 years ago

Thank you for the clear reply.

Indeed, if pose.txt contains the camera poses, then calib.txt is needed to find the LiDAR poses.

Regarding the labeling step "Remaining tire points are now often just a brush stroke": do you mean that after filtering the top car points, the tire points will be clearly different from the ground points? For example, the tire points are often dense and shaped like a brush stroke, while the ground points could be sparse. But does this approach still apply for moving cars, since their tire points could also be sparse?

Chuanyu-LD commented 2 years ago

Hi, I have another question (please correct me if my understanding is wrong).

I see that SemanticKITTI includes instance annotations, so, for example, the annotations of a moving car's points across multiple frames could indicate the same car, i.e., a car id. (I have not tried it; this is just my literal understanding of the word "instance".)

If so, since the point labeler can only label a limited number of frames at a time, like 500 (or a little more), but a sequence may include 4000 frames, is the car id kept the same across the different labeling passes (of 500 frames each)?

jbehley commented 2 years ago
  1. Regarding your first question about the tires: I just meant that I use the "brush" tool to label the tire points on the ground, and it's usually some points that are more "dense".
  2. The instance labels are consistent over the whole sequence. If an instance leaves a tile, I ensured that it gets the same instance id: there is a "join" tool for instances that assigns the same instance id to an object across tiles. That also means that cars at the beginning of a large loop get the same instance ids when the car returns. For "static" cars this is achieved automatically, since the cars stay in the same tile. For moving cars, such as on a highway, I went through all tiles and "joined" the instance ids. See for instance sequence 04 (https://www.youtube.com/watch?v=2XUPkO_hzF0): the color of the axis-aligned bounding box for a car stays the same over the complete sequence, which means that the car has the same instance id. For long highway sequences, like sequence 20, this was a rather labor-intensive process. (Just a side note: in the video the bounding box is always computed from all instance points in the current scan, and therefore the bounding box "flickers", i.e., changes size. So this is not really a bounding box annotation like you usually have with detection datasets, where the bounding box is "amodal", i.e., also contains the non-visible parts of the instance.)

There is always some overlap region between the tiles that you can use to identify instances at boundaries.
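As background on where the instance ids live: SemanticKITTI stores each label as a 32-bit value, with the semantic class in the lower 16 bits and the instance id in the upper 16 bits, so a consistently "joined" instance id is preserved in the exported label files. A minimal decoding sketch (the helper names are mine):

```cpp
#include <cstdint>

// Each entry of a .label file is a uint32:
// lower 16 bits = semantic class, upper 16 bits = instance id.
inline uint16_t semanticClass(uint32_t label) { return label & 0xFFFFu; }
inline uint16_t instanceId(uint32_t label) { return label >> 16; }
```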

Chuanyu-LD commented 2 years ago

Thank you for answering; this can be closed.