PRBonn / LiDiff

[CVPR'24] Scaling Diffusion Models to Real-World 3D LiDAR Scene Completion
MIT License

Point cloud input number #10

Closed. Yacovitch closed this issue 4 months ago

Yacovitch commented 4 months ago

Hi again,

As I understand it, scans in SemanticKITTI have an average of 480,000 points, but in your code and configurations you use 180000 points for the aggregated input p_full and 18000 points for p_part. If I understand correctly, the aggregated point cloud (from map_clean) will have far more points than that, even after filtering out moving objects and points beyond the maximum range, so the current configuration requires extreme downsampling. Is there any reason you picked 180000 as your num_points and 1/10 as your downsampling rate for p_part?

nuneslu commented 4 months ago

Hi, yes, we downsample the ground truth. The main reason for downsampling to 180000 points is to fit the point cloud into GPU memory; otherwise, training and denoising inference would be intractable. That is why we use the refinement network to upsample the diffusion prediction afterward.

nuneslu commented 4 months ago

Also, the point clouds from SemanticKITTI usually have around 180000 points as well.
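
For readers following the numbers in this thread, here is a minimal sketch of what downsampling to a fixed point budget can look like. It is not taken from the LiDiff code: uniform random sampling, the helper name `random_downsample`, and the synthetic array sizes are all assumptions made for illustration. Only `num_points = 180000` and the 1/10 ratio for p_part come from the discussion above.

```python
import numpy as np


def random_downsample(points: np.ndarray, num_points: int,
                      rng: np.random.Generator) -> np.ndarray:
    """Hypothetical helper: return exactly `num_points` rows of `points`.

    Samples without replacement when the cloud is large enough and
    falls back to sampling with replacement when it is too small.
    """
    n = points.shape[0]
    replace = n < num_points  # small clouds are padded by repeating points
    idx = rng.choice(n, size=num_points, replace=replace)
    return points[idx]


rng = np.random.default_rng(0)

# Synthetic stand-ins (sizes are arbitrary, not measured from SemanticKITTI):
# a dense aggregated map and a single sparser scan, both as (N, 3) xyz arrays.
dense_map = rng.uniform(-50.0, 50.0, size=(1_000_000, 3)).astype(np.float32)
single_scan = rng.uniform(-50.0, 50.0, size=(120_000, 3)).astype(np.float32)

num_points = 180_000  # value quoted in the thread
p_full = random_downsample(dense_map, num_points, rng)
p_part = random_downsample(single_scan, num_points // 10, rng)

print(p_full.shape, p_part.shape)
```

Fixing the point count this way keeps every training sample the same size, which makes batching and the memory footprint predictable. The trade-off is the loss of detail raised in the question, which the reply above says is recovered by the refinement network.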