PRBonn / lidar-bonnetal

Semantic and Instance Segmentation of LiDAR point clouds for autonomous driving
http://semantic-kitti.org
MIT License

Storing predictions for evaluation on the benchmark #8

Closed ayushais closed 5 years ago

ayushais commented 5 years ago

Hi,

To store the predictions in the desired format, I went through the code and this is what I understood:

Can you please let me know if I understand the code correctly? Thanks for the help.

tano297 commented 5 years ago

Hi,

You're on the right track. With N the number of points in the original scan, p_y and p_x are N-length lists of pixel coordinates, one per point in the point cloud. The range image can only represent a subset of the points, but the evaluation server requires a prediction for every point, in the same order as the input point cloud, so this is how I obtain the label for each point: by indexing into the range image. Basically, proj_argmax becomes a lookup table :) What are you trying to achieve? This is already working, so maybe I can help you better if you share more details.
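
A minimal sketch of that lookup (everything except the names proj_argmax, p_y, and p_x is hypothetical; it assumes an [H, W] argmax image and integer pixel coordinates of length N):

import numpy as np

# hypothetical sizes: a 64 x 1024 range image and a scan with N points
H, W, N = 64, 1024, 120000
proj_argmax = np.random.randint(0, 20, size=(H, W))  # per-pixel class ids
p_y = np.random.randint(0, H, size=N)                # row of each point
p_x = np.random.randint(0, W, size=N)                # column of each point

# fancy indexing uses the argmax image as a lookup table: one label per
# point, in the same order as the input point cloud
point_labels = proj_argmax[p_y, p_x]                 # shape [N]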

ayushais commented 5 years ago

Hi, thanks for the reply. I have this network from this paper. In the paper I showed results on the dataset released with SqueezeSeg and on a dataset I created from KITTI tracking, and now I want to train on SemanticKITTI.

Using the semantic-kitti-api, I managed to get the training data. Now I want to project the labels from the range image into the format the benchmark expects; this is where I need help. Please let me know the best way to achieve this.

tano297 commented 5 years ago

So you want to do this using DBLiDARNet?

The format the benchmark expects is exactly the format the labels are already in. I am assuming you also have an argmax image, which holds a predicted class for each pixel of the range image. If so, what you also need is an index that tells you, for each point in the point cloud, where that point lands in the range image (you can also extract this from the semantic-kitti-api). In fact, if you move the laserscan class into your codebase it should be much easier to work with. That way you can index the pixel position of each point in the point cloud and get the label for that point. After that, once the labels are integers in an [N]-shaped numpy array, you can save them with a straightforward:

import os

# SemanticKITTI .label files are raw uint32 binaries, so cast before saving
path = os.path.join(directory, "sequences", str(sequence_number), "predictions", label_name)
prediction_in_numpy.astype("uint32").tofile(path)
ayushais commented 5 years ago

Yes, I want to use it with my network.

I already forked semantic-kitti-api to integrate my code base into it.

Will proj_idx (line 165) in laserscan.py give me the data association between pixel indices and the points in the point cloud?

After this:

path = os.path.join(directory, "sequences", str(sequence_number), "predictions", label_name)
prediction_in_numpy.tofile(path)

do I still need to remap from the cross-entropy class indices to the original label format?

Thanks for all the help!!

tano297 commented 5 years ago

Yes, that is what the index image is: it tells you where each pixel lands in the point cloud. But you are more interested in this, which is where each point lands in the range image (the inverse index). Since you need a prediction for each point, proj_x and proj_y will help you achieve this: if you index your argmax image with proj_y and proj_x, you obtain an N-shaped vector with your predictions, as I do in this framework.
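
A sketch of that lookup with the LaserScan class from the semantic-kitti-api (the scan path and the argmax image are hypothetical; it assumes, as in laserscan.py, that project=True fills proj_x and proj_y with the [N] pixel coordinates of every point):

import numpy as np
from auxiliary.laserscan import LaserScan  # from the semantic-kitti-api repo

# open a scan and let the class compute the spherical projection
scan = LaserScan(project=True, H=64, W=1024, fov_up=3.0, fov_down=-25.0)
scan.open_scan("sequences/08/velodyne/000000.bin")  # hypothetical path

# stand-in for the network output: one class id per range-image pixel
argmax_img = np.random.randint(0, 20, size=(64, 1024))

# proj_y/proj_x give, for each of the N points, its pixel in the range image
point_preds = argmax_img[scan.proj_y, scan.proj_x]  # shape [N]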

As for the second question: yes, you need to provide the labels in the original format, not in the cross-entropy format. You can use the provided script in the semantic-kitti-api for this, or do it internally by using the dictionary as a lookup table (as I do in the parser); see the sketch below.
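
A sketch of that internal remapping (the learning_map_inv entries below are truncated from the semantic-kitti config yaml; the dense-LUT trick mirrors what the parser does, but the variable names are hypothetical):

import numpy as np

# maps cross-entropy class indices back to original SemanticKITTI label ids;
# in practice this dictionary is loaded from the dataset's config yaml
learning_map_inv = {0: 0, 1: 10, 2: 11, 3: 15}  # truncated example

# build a dense lookup table so the remap is a single fancy-indexing op
lut = np.zeros(max(learning_map_inv.keys()) + 1, dtype=np.uint32)
for xentropy_id, original_id in learning_map_inv.items():
    lut[xentropy_id] = original_id

point_preds = np.array([1, 2, 0, 3])  # hypothetical per-point predictions
original_labels = lut[point_preds]    # uint32, ready for .tofile()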

ayushais commented 5 years ago

Thanks a lot!!

tano297 commented 5 years ago

I saw on the semantic-kitti-api repo that this was solved, so I'm closing this for now.