PRBonn / semantic-kitti-api

SemanticKITTI API for visualizing dataset, processing data, and evaluating results.
http://semantic-kitti.org
MIT License

voxelize problem #142

Open lzbushicai opened 9 months ago

lzbushicai commented 9 months ago

Hello! My velodyne and labels data consist of point cloud and point cloud label files in the same .bin and .label formats as SemanticKITTI. The following image shows the result of visualizing the .bin files of SemanticKITTI and of my dataset.

sequence003_ 116

I organized my files into the format shown below and successfully generated the voxelized .bin, .label, .occluded, and .invalid files using https://github.com/jbehley/voxelizer

However, when I visualized my voxel files and SemanticKITTI's voxel files with https://github.com/PRBonn/semantic-kitti-api/blob/master/visualize_voxels.py, my result looked very strange. The left side of the image shows my data; the right side shows KITTI's data.
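For checking the generated files independently of the visualizer, here is a minimal sketch of reading them with NumPy. It assumes the layout used for the SemanticKITTI voxel data: a 256 x 256 x 32 grid, bit-packed occupancy masks (.bin, .invalid, .occluded), and one uint16 label per voxel in the .label file. The function names are hypothetical, and the grid size should be adapted if your configuration differs.

```python
import numpy as np

# Assumed grid size (x, y, z) of the voxel files; adjust to your configuration.
VOXEL_DIMS = (256, 256, 32)

def unpack_bits(packed: np.ndarray) -> np.ndarray:
    """Expand packed uint8 bytes into one 0/1 value per voxel (MSB first)."""
    return np.unpackbits(packed)

def load_packed_mask(path: str) -> np.ndarray:
    """Read a bit-packed file (.bin, .invalid, .occluded) as a 3D 0/1 grid."""
    packed = np.fromfile(path, dtype=np.uint8)
    return unpack_bits(packed).reshape(VOXEL_DIMS)

def load_voxel_labels(path: str) -> np.ndarray:
    """Read a voxel .label file, assumed to hold one uint16 per voxel."""
    return np.fromfile(path, dtype=np.uint16).reshape(VOXEL_DIMS)
```

If the file size does not match the assumed grid, the reshape raises an error, which is itself a quick sanity check on the configuration.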

I am confident that my original .bin and .label files are fine. Is there a problem with my configuration? The following image shows my configuration.

In addition, I found that the frame numbers of SemanticKITTI's voxel data are multiples of 5, which suggests that SemanticKITTI skips the 4 frames in between. Can you tell me how to set this up? I want to convert my voxels to the SemanticKITTI format. How should I modify my configuration, and can the voxelizer generate voxel files at intervals of five frames?

jbehley commented 9 months ago

Here are a few things that you should check so that we can rule out the "easy problems":

Regarding your other questions:

lzbushicai commented 9 months ago

Here are a few things that you should check so that we can rule out the "easy problems":

  • Check whether you maybe have a wrong calib.txt file (you cannot just copy the KITTI file, as we opted to use the KITTI way of representing poses, i.e., all poses are given in camera coordinates). If you simply have the poses.txt with LiDAR poses, then you should use a calib.txt with identity matrices.
  • It would help to have not only a comparison of what comes out with SemanticKITTI (i.e., the original scan from KITTI; at least the point cloud looks like sequence 00 from KITTI), but also your data. At least that would give an indication of what you are expecting to see.
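The calib.txt advice above can be sketched as follows. This is a minimal illustration, assuming the KITTI odometry style of calib.txt (one `key: 12 floats` line per 3x4 matrix) and that camera-frame poses are brought into the LiDAR frame as Tr⁻¹ · pose · Tr. The function names are hypothetical, and filling P0 to P3 with the same placeholder matrix is an assumption; only Tr matters for the pose conversion shown here.

```python
import numpy as np

def identity_calib_text() -> str:
    """Return calib.txt content in which every 3x4 matrix is [I | 0]."""
    row = "1 0 0 0 0 1 0 0 0 0 1 0"
    keys = ["P0", "P1", "P2", "P3", "Tr"]  # assumed key names, KITTI style
    return "\n".join(f"{key}: {row}" for key in keys) + "\n"

def parse_calib(text: str) -> dict:
    """Parse 'key: 12 floats' lines into 4x4 homogeneous matrices."""
    calib = {}
    for line in text.strip().splitlines():
        key, content = line.split(":", 1)
        mat = np.eye(4)
        mat[:3, :4] = np.array(content.split(), dtype=float).reshape(3, 4)
        calib[key.strip()] = mat
    return calib

def camera_pose_to_lidar(pose_cam: np.ndarray, Tr: np.ndarray) -> np.ndarray:
    """Express a camera-frame pose in the LiDAR frame: Tr^-1 * pose * Tr."""
    return np.linalg.inv(Tr) @ pose_cam @ Tr
```

With an identity Tr the conversion leaves the poses unchanged, which is why poses that are already in the LiDAR frame need an identity calib.txt rather than a copy of the KITTI one.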

Regarding your other questions:

  • You can also generate "dense" voxels, i.e., voxel grids for each and every scan. Due to space limitations, we decided to use only every 5th scan to make the download and the prediction more manageable. The voxelizer should also have an option to produce voxel grids for each and every scan.
  • How many scans are aggregated for each voxel grid is set by the prior and past scan counts (i.e., how many scans are accumulated). Even though we generate the voxel grids only for every 5th timestamp, we still aggregate all prior scans (I think it was something like 100 scans for the ground truth?). In the configuration you showed, you take "only" the 4 scans before the current timestamp and the 4 future scans. I agree, the naming is a bit suboptimal (prior = before the current timestamp, past = after the current timestamp).
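To make the two settings above concrete, here is a small sketch of which frames get a voxel grid and which scans are accumulated into each grid. The function names and the `stride` parameter are hypothetical and are not voxelizer options; "prior" and "past" follow the naming explained above (prior = before the current timestamp, past = after it).

```python
def frames_to_voxelize(num_scans: int, stride: int = 5) -> list:
    """Indices of the scans that get their own voxel grid (every stride-th)."""
    return list(range(0, num_scans, stride))

def aggregation_window(t: int, num_scans: int, prior: int, past: int) -> list:
    """Indices of the scans accumulated into the voxel grid of scan t.

    prior: number of scans before t, past: number of scans after t.
    The window is clipped to the valid range of the sequence.
    """
    start = max(0, t - prior)
    end = min(num_scans - 1, t + past)
    return list(range(start, end + 1))
```

For example, `aggregation_window(5, 100, 4, 4)` covers scans 1 to 9, which corresponds to the configuration with 4 scans on each side.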

Thank you for your response and assistance! Your detailed answer was very helpful, and I really appreciate you taking the time to help me out. Before your reply, I had indeed misunderstood the meaning of "prior" and "past". Does "I think it was something like 100 scans for the ground truth" refer to 100 scans each for prior and past, or 100 scans for prior only? Another question: what does the parameter "join" mean? Thank you again for your help, and I wish you smooth work and a happy life!

jbehley commented 8 months ago

Sorry, I read your issue some time ago but forgot to answer.

Regarding your questions:

  1. According to the config (see https://github.com/jbehley/voxelizer/blob/master/assets/semantic_kitti.cfg), I used 100 scans after the current timestamp to aggregate the point clouds, and 0 prior scans. If you want to reproduce the settings for the scene completion, then you can use these config files.
  2. "join" means that it puts multiple other classes together into a single class. Thus join: [{0: [1]}] means that it maps class with id 1 (which is outlier) to class 0 (unlabeled). The other line join: [{0: [1]}, {10: [252]}, {20: [13,16,256,257,259]}, {18: [258]}, {30: [254]}, {31: [253]}, {32: [255]}] maps the "moving" to the corresponding "static" classes. It's basically mapping all things inside [] to the class before, i.e., {A: [B,C,D]} maps B,C,D to class A.

hope that helps.
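The "join" mapping described in point 2 can be sketched with a NumPy lookup table. The dictionary below copies the second join line quoted above; the function name `apply_join` is hypothetical, and this is only an illustration of the mapping rule ({A: [B, C, D]} maps B, C, D to A), not the voxelizer's own code.

```python
import numpy as np

# Copied from the join line quoted above: {target: [source ids]}.
JOIN = {0: [1], 10: [252], 20: [13, 16, 256, 257, 259],
        18: [258], 30: [254], 31: [253], 32: [255]}

def apply_join(labels: np.ndarray, join: dict) -> np.ndarray:
    """Map every source id to its target id; other ids stay unchanged."""
    largest_source = max(s for sources in join.values() for s in sources)
    max_id = max(int(labels.max(initial=0)), largest_source)
    lut = np.arange(max_id + 1, dtype=labels.dtype)  # identity mapping
    for target, sources in join.items():
        lut[sources] = target
    return lut[labels]
```

For example, labels `[1, 252, 259, 40, 0]` become `[0, 10, 20, 40, 0]`: the outlier goes to unlabeled, the two "moving" ids go to their "static" classes, and the rest are untouched.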

lzbushicai commented 8 months ago

Thank you for your reply; it is very useful.

lzbushicai commented 8 months ago

Hi jbehley. While using your voxelization tool I ran into a new problem. I visualized the processed .bin and .label files and found that in later frames I can no longer find the vehicle position (where the arcs are located); the tool seems to retain only the voxels in front of the vehicle.

00_000003_1_1: a frame with normal voxelization

00_000729_1_1: note the arc range; only the voxels in front of the vehicle seem to be shown

00_002056_1_1: the vehicle position has completely disappeared; only voxels far ahead of the vehicle are shown

To check whether the intrinsic and extrinsic parameters are the problem (after all, these are the only parameters I need to change), I reprojected the point cloud onto the image. Most frames project correctly, but some frames in the middle of the sequence are off. Could this be the reason? (If so, it seems my project cannot continue.)

04_001994: correct projection with the intrinsic and extrinsic parameters

03_000165: wrong projection with the intrinsic and extrinsic parameters
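The reprojection check described above can be sketched as follows. This is a minimal illustration under assumed conventions: `Tr` is a 4x4 LiDAR-to-camera transform (extrinsics), `P` is a 3x4 camera projection matrix (intrinsics), and the function name is hypothetical.

```python
import numpy as np

def project_to_image(points: np.ndarray, Tr: np.ndarray, P: np.ndarray):
    """Project N x 3 LiDAR points to pixel coordinates.

    Returns (uv, in_front): uv holds the pixel coordinates of the points
    that lie in front of the camera, in_front is the boolean mask over
    the input points that selects them.
    """
    homo = np.hstack([points, np.ones((len(points), 1))])  # N x 4
    cam = (Tr @ homo.T).T                                  # camera frame
    in_front = cam[:, 2] > 0                               # positive depth
    pix = (P @ cam[in_front].T).T                          # M x 3
    uv = pix[:, :2] / pix[:, 2:3]                          # divide by depth
    return uv, in_front
```

Overlaying `uv` on the image for a suspicious frame shows whether object boundaries in the point cloud line up with the image; a consistent offset in only some frames points at the calibration or pose for those frames rather than at the voxelization.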

In the meantime, I've written my own script for voxelizing point clouds and labels:

import numpy as np

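# Paths to one scan and its label file (placeholder names; adjust to your data)
rellis_bin = "000000.bin"
rellis_label = "000000.label"

# Load the point cloud (x, y, z; intensity dropped) and the per-point labels.
# Labels stay integers; in the SemanticKITTI label format the lower 16 bits hold the semantic class.
points = np.fromfile(rellis_bin, dtype=np.float32).reshape(-1, 4)[:, :3]
labels = np.fromfile(rellis_label, dtype=np.uint32) & 0xFFFF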

# Define the bounds and size of the voxels
x_min, x_max, y_min, y_max, z_min, z_max = 0, 51.2, -25.6, 25.6, -2, 4.4
voxel_size = 0.2

# Calculate the number of voxels along each axis
x_voxels = int((x_max - x_min) / voxel_size)
y_voxels = int((y_max - y_min) / voxel_size)
z_voxels = int((z_max - z_min) / voxel_size)

# Initialize the voxel label array
voxel_labels = np.zeros((x_voxels, y_voxels, z_voxels), dtype=np.uint32)

# Compute the voxel index for each point
indices = np.floor((points - np.array([x_min, y_min, z_min])) / voxel_size).astype(np.int32)

# Assign labels to each voxel
for i, (x, y, z) in enumerate(indices):
    if 0 <= x < x_voxels and 0 <= y < y_voxels and 0 <= z < z_voxels:
        label = labels[i]  # label of the i-th point
        voxel_labels[x, y, z] = label  # last point written wins; modify here to implement label statistics and selection logic

# Save the voxel_labels array, which contains the label for each voxel; unassigned voxels have a label of 0
np.save("voxel_00_82.npy", voxel_labels)
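The "label statistics and selection logic" mentioned in the script could, for example, be a majority vote per voxel. The sketch below is one possible way to do that with NumPy and is not how the voxelizer tool decides labels; the function name is hypothetical, and it assumes all label ids are smaller than `num_classes`.

```python
import numpy as np

def majority_vote_voxels(indices, labels, grid_shape, num_classes):
    """Give each voxel the most frequent label among the points inside it.

    indices: N x 3 integer voxel coordinates, labels: N integer class ids
    (each < num_classes). Empty voxels keep label 0. Ties go to the
    larger label id.
    """
    indices = np.asarray(indices, dtype=np.int64)
    labels = np.asarray(labels, dtype=np.int64)
    grid = np.zeros(int(np.prod(grid_shape)), dtype=np.uint32)

    # Drop points that fall outside the grid.
    inside = np.all((indices >= 0) & (indices < np.array(grid_shape)), axis=1)
    if not inside.any():
        return grid.reshape(grid_shape)

    # Count how often each (voxel, label) pair occurs.
    flat = np.ravel_multi_index(tuple(indices[inside].T), grid_shape)
    keys, counts = np.unique(flat * num_classes + labels[inside],
                             return_counts=True)
    vox, lab = keys // num_classes, keys % num_classes

    # Sort by voxel, then by count, so the last entry per voxel wins.
    order = np.lexsort((counts, vox))
    vox, lab = vox[order], lab[order]
    is_last = np.r_[vox[1:] != vox[:-1], True]
    grid[vox[is_last]] = lab[is_last]
    return grid.reshape(grid_shape)
```

Compared with the loop above, this avoids the "last point wins" behavior and also removes the Python-level loop over all points.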

Voxelizing the point cloud with my script gives a result similar to the ground truth, whereas the result from your tool differs considerably from the ground truth. However, the voxels from my script are very sparse and nowhere near as dense as yours.

image: point cloud ground truth

03_000165_1_1: my script (without using the intrinsic and extrinsic parameters)

03_000005_1_1: your voxelizer tool
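One likely reason for the difference in density is that the script above voxelizes a single scan, while the settings discussed earlier in this thread accumulate many scans (about 100) into one grid. A minimal sketch of such an accumulation is given below, assuming each pose is a 4x4 matrix that maps points from the sensor frame of that scan into a common world frame; the function name is hypothetical.

```python
import numpy as np

def aggregate_scans(scans, poses, ref):
    """Merge several scans into the sensor frame of scan number `ref`.

    scans: list of N_i x 3 point arrays, each in its own sensor frame.
    poses: list of 4 x 4 sensor-to-world matrices, one per scan.
    """
    world_to_ref = np.linalg.inv(poses[ref])
    merged = []
    for pts, pose in zip(scans, poses):
        homo = np.hstack([pts, np.ones((len(pts), 1))])  # N x 4
        # sensor frame -> world -> reference sensor frame
        merged.append((world_to_ref @ pose @ homo.T).T[:, :3])
    return np.vstack(merged)
```

The merged points (with their labels concatenated in the same order) can then be passed through the same voxelization step. This also shows why wrong poses or a wrong calib.txt matter: every accumulated scan is placed using its pose, so a bad pose shifts the content of the grid.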

Here is the label distribution after processing with my script:

Label    Count
    0  2092045
    3     1120
    4     1804
   11     1832
   17       31
   18       52
   19      268

Here is the label distribution after processing with the voxelizer tool:

Label    Count
    0   148154
    3    66652
    4   135036
   17      907
   18     3499
   19    12389
  255  1730515
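Both tables sum to 2,097,152 voxels, which equals 256 x 256 x 32, so the two grids have the same size and differ only in how the voxels are labeled. A short sketch for producing and comparing such distributions is shown below; the function names are hypothetical, and treating 0 and 255 as "not a class" (unlabeled/empty and, presumably, invalid) is an assumption.

```python
import numpy as np

def label_histogram(voxel_labels: np.ndarray) -> dict:
    """Return {label id: number of voxels} for a voxel label grid."""
    ids, counts = np.unique(voxel_labels, return_counts=True)
    return {int(i): int(c) for i, c in zip(ids, counts)}

def count_labeled(hist: dict, ignore=(0, 255)) -> int:
    """Number of voxels carrying a real class (ignore ids are assumptions)."""
    return sum(c for label, c in hist.items() if label not in ignore)
```

Applied to the two tables above, this gives 5,107 labeled voxels for the single-scan script and 218,483 for the voxelizer output, which quantifies the difference in density.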

How can I get correct results with the voxelizer tool?