ethz-asl / TULIP

🌷 TULIP: Transformer for Upsampling of LiDAR Point Clouds (CVPR 2024)
MIT License

Questions about converting range image to point cloud #4

Open sz3623 opened 1 month ago

sz3623 commented 1 month ago

Hello, I am running into some problems when converting the upsampled range image to a point cloud.

The generated point cloud contains many noise points near the center. Some pixels of the predicted range image, for example in the image boundary area, have very small values (close to zero), although they should be exactly 0. When these pixels are converted to 3D points, they show up as noise.

Have you also encountered this problem? Our training loss is close to 0.01, so we suspect that the training has not converged. Thresholding the range values and filtering out the small ones would also be a workaround.
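To make the thresholding idea concrete, here is a minimal NumPy sketch of the conversion with a minimum-range filter. It is only an illustration under assumptions, not the TULIP code: the function name `range_image_to_points`, the vertical field of view (`fov_up_deg`, `fov_down_deg`), the 360 degree azimuth layout, and the default `min_range` are all hypothetical and would need to match the actual sensor and dataset. It also assumes the range image is already in meters (any normalization undone).

```python
import numpy as np


def range_image_to_points(range_img, fov_up_deg=15.0, fov_down_deg=-15.0,
                          min_range=2.0):
    """Convert an (H, W) range image in meters to an (N, 3) point cloud.

    Pixels whose range is below `min_range` are treated as invalid and dropped.
    All default values are placeholders, not values taken from TULIP.
    """
    h, w = range_img.shape
    # Assumed layout: row 0 is the highest beam, columns cover 360 degrees.
    elevation = np.deg2rad(np.linspace(fov_up_deg, fov_down_deg, h))[:, None]
    azimuth = np.linspace(np.pi, -np.pi, w, endpoint=False)[None, :]

    # Spherical to Cartesian conversion for every pixel.
    x = range_img * np.cos(elevation) * np.cos(azimuth)
    y = range_img * np.cos(elevation) * np.sin(azimuth)
    z = range_img * np.sin(elevation)
    points = np.stack([x, y, z], axis=-1)  # shape (H, W, 3)

    # Keep only pixels that pass the minimum-range threshold.
    valid = range_img >= min_range
    return points[valid]


# Toy prediction: 10 m everywhere, near-zero values in the first 4 columns.
pred = np.full((16, 64), 10.0)
pred[:, :4] = 0.03
filtered = range_image_to_points(pred)                   # noisy pixels dropped
unfiltered = range_image_to_points(pred, min_range=0.0)  # noisy pixels kept
```

Without the threshold, the near-zero pixels become points only a few centimeters from the sensor origin, which matches the cluster of noise described above.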

Thank you for your attention. I look forward to your reply.

binyang97 commented 4 weeks ago

Hi,

Yes, I've also had this issue. It happens mainly because the network tends to predict something even where the pixels are invalid (no projected points) in the ground truth. You can clamp the prediction with a minimum range, something like 2 or 3 meters, to filter out those noisy points. That makes the range image look much cleaner. Another option is to let the network learn the projection mask at the same time, for example by optimizing it with a binary cross-entropy loss. The final output is then the combination of the predicted mask and the range image. I have not tried this yet, but I think it is also a feasible solution.
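As a rough illustration of the mask idea (untested in TULIP, as noted above), the NumPy sketch below shows the two pieces involved: a binary cross-entropy loss on a predicted validity mask, and the combination of that mask with the predicted range image. The names `bce_with_logits` and `combine_mask_and_range`, the extra mask output of the network, and the 0.5 threshold are assumptions made for this example. In an actual PyTorch training loop the loss would more likely be `torch.nn.functional.binary_cross_entropy_with_logits`.

```python
import numpy as np


def bce_with_logits(mask_logits, target_mask):
    """Mean binary cross-entropy, computed stably from raw logits."""
    x = mask_logits
    t = target_mask.astype(np.float64)
    # Equals -[t*log(sigmoid(x)) + (1-t)*log(1-sigmoid(x))] per pixel.
    loss = np.maximum(x, 0.0) - x * t + np.log1p(np.exp(-np.abs(x)))
    return loss.mean()


def combine_mask_and_range(pred_range, mask_logits, threshold=0.5):
    """Set range pixels to 0 where the predicted mask marks them invalid."""
    prob = 1.0 / (1.0 + np.exp(-mask_logits))
    return np.where(prob >= threshold, pred_range, 0.0)


# Toy 2x2 example: ground-truth pixels with range 0 have no projected point.
gt_range = np.array([[0.0, 5.0], [8.0, 0.0]])
target_mask = gt_range > 0

# Hypothetical network outputs: a range map and mask logits.
pred_range = np.array([[0.03, 5.1], [7.9, 0.02]])
mask_logits = np.array([[-4.0, 4.0], [4.0, -4.0]])

mask_loss = bce_with_logits(mask_logits, target_mask)
final_range = combine_mask_and_range(pred_range, mask_logits)
```

Here `final_range` keeps the two valid predictions and sets the two near-zero pixels to exactly 0, so they can be skipped when converting to a point cloud.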

Best, Bin