Question about yaml config file

jac99 commented 7 months ago

Thanks for publishing the HeLiPR dataset and the related toolbox. It'll be very useful for lidar-based place recognition research. I've got a few questions about the YAML config file parameters.

What is the meaning of the "numIntervals" parameter in the Undistort section? Is there a recommended value I should use? Can I use the same value for each LiDAR type, or is it recommended to use a different value for each LiDAR?
Does the downsamplePointSize parameter control the number of points in the resultant (output) point cloud?
What is the meaning of the distanceThreshold parameter? If we set it to e.g. 5, does it mean that the resultant (output) point clouds will be sampled at 5 m intervals?

Is it possible to share the YAML configuration files used to generate point clouds for the evaluation of place recognition methods described here: https://sites.google.com/view/heliprdataset/place-recognition? I'd like to evaluate other methods on the HeLiPR dataset and I'd like to ensure I use the same configuration.

minwoo0611 commented 7 months ago

Dear @jac99,

Thank you for reaching out and showing interest in the HeLiPR dataset and toolbox. I'll address your questions regarding the YAML config file parameters directly:

The "numIntervals" parameter in the Undistort section is the number of time intervals used to linearly interpolate transformations when undistorting a LiDAR point cloud over one sweep (typically 0.1 seconds). A higher value gives more accurate undistortion at a higher computational cost. Because the sweep duration is the same across LiDAR models, the same "numIntervals" value can be used for all of them. We found that 1000 intervals offer high accuracy, but further analysis showed that 250 intervals are sufficient for a good balance between accuracy and computational efficiency. This recommendation applies to all LiDAR sensors.
Yes, the "downsamplePointSize" parameter indeed controls the size of the output point cloud. It's designed to standardize the point cloud size, which is particularly beneficial for learning methods requiring a consistent number of points. For our experiments, we did not apply downsampling (hence, a recommended size of 0), relying instead on voxelization to manage point cloud density. This approach is similar to that used in PointNetVLAD, ensuring that the primary structure of the point cloud is preserved while potentially reducing computational requirements.
The "distanceThreshold" parameter sets the minimum distance the sensor must travel before a new point cloud is considered distinct and saved to the dataset. Setting this threshold to 5 meters, for example, means that the LiDAR needs to move at least 5 meters before another point cloud is recorded. This avoids redundant data when the LiDAR is stationary or barely moving, so the saved point clouds differ meaningfully from one another.
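
To make the role of "numIntervals" more concrete, here is a minimal Python sketch of interval-based undistortion. It is not the toolbox's implementation: the function name undistort_sweep, the planar (x, y, yaw) pose model, and the choice to evaluate each interval's pose at the interval start are all assumptions made for illustration. The idea is only that each point's timestamp is quantised into one of numIntervals bins, and every point in a bin is corrected with the same linearly interpolated pose.

```python
import math

def undistort_sweep(points, rel_times, pose_start, pose_end, num_intervals):
    """Motion-compensate one sweep into the sensor frame at sweep start.

    points: (x, y, z) tuples, each in the sensor frame at its capture time.
    rel_times: per-point time within the sweep, normalised to [0, 1].
    pose_start, pose_end: planar sensor poses (x, y, yaw) in a fixed frame.
    Hypothetical helper; the planar pose model is a simplifying assumption.
    """
    xs, ys, yaw_s = pose_start
    xe, ye, yaw_e = pose_end
    # Signed yaw difference wrapped to [-pi, pi] so interpolation takes the short way.
    dyaw = math.atan2(math.sin(yaw_e - yaw_s), math.cos(yaw_e - yaw_s))

    # One linearly interpolated pose per interval, taken at the interval start.
    interval_poses = []
    for k in range(num_intervals):
        a = k / num_intervals
        interval_poses.append((xs + a * (xe - xs), ys + a * (ye - ys), yaw_s + a * dyaw))

    c0, s0 = math.cos(yaw_s), math.sin(yaw_s)
    corrected = []
    for (px, py, pz), t in zip(points, rel_times):
        # Quantise the timestamp into an interval index.
        k = min(int(t * num_intervals), num_intervals - 1)
        tx, ty, yaw = interval_poses[k]
        c, s = math.cos(yaw), math.sin(yaw)
        # Sensor frame at capture time -> fixed frame.
        wx = c * px - s * py + tx
        wy = s * px + c * py + ty
        # Fixed frame -> sensor frame at sweep start.
        dx, dy = wx - xs, wy - ys
        corrected.append((c0 * dx + s0 * dy, -s0 * dx + c0 * dy, pz))
    return corrected
```

In this sketch the pose used for a point can lag the true linear pose by up to one interval, i.e. at most 1/numIntervals of the motion during the sweep, which is consistent with larger values being more accurate but with diminishing returns.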

Furthermore, these are the values from our original configuration:

Undistort:
  numIntervals: 1000 # but 250 is ok.
  undistortFlag: True

Save:
  downSampleFlag: True    
  downSampleVoxelSize: 0.2   # we may use 0.2, but 0.4 is also a possibility
  downsamplePointSize: 0 # No downsampling of points, instead rely on voxelization
  normalizeFlag: False        

  saveAs: "pcd"       # "bin" or "pcd", it does not affect the experiment results
  saveName: "Timestamp"   # "Index" or "Timestamp", it does not affect the experiment results

  cropFlag: True 
  cropSize: 100 # meter, -100 ~ 100

  LiDAR: 0 # Different for each lidar
  distanceThreshold: 10 # 10 for query, 5 for database; adjust based on requirements

  accumulatedSize: 20   # For Livox LiDAR (in the case of STD, for all LiDARs)
  accumulatedStep: 1    # Increment step for accumulation
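
As a rough illustration of what the Save settings above do, the following Python sketch applies a distance threshold to a trajectory and then crops and voxel-downsamples a single cloud. The function names select_by_distance and crop_and_voxelize are hypothetical, and several details are assumptions rather than the toolbox's confirmed behaviour: distance is measured in a straight line from the last saved pose, the first pose is always kept, the crop is a box applied to every axis, and each voxel is replaced by the centroid of its points.

```python
import math
from collections import defaultdict

def select_by_distance(positions, distance_threshold):
    """Return indices of poses that are at least distance_threshold
    away from the previously kept pose (assumed selection rule)."""
    if not positions:
        return []
    kept = [0]  # assumption: the first pose is always saved
    last = positions[0]
    for i, p in enumerate(positions[1:], start=1):
        if math.dist(p, last) >= distance_threshold:
            kept.append(i)
            last = p
    return kept

def crop_and_voxelize(points, crop_size, voxel_size):
    """Keep points inside [-crop_size, crop_size] on each axis, then
    replace the points falling in each voxel by their centroid."""
    voxels = defaultdict(list)
    for p in points:
        if all(abs(c) <= crop_size for c in p):
            key = tuple(math.floor(c / voxel_size) for c in p)
            voxels[key].append(p)
    return [tuple(sum(axis) / len(pts) for axis in zip(*pts))
            for pts in voxels.values()]

# Usage: a straight 20 m drive with one pose per metre.
trajectory = [(float(x), 0.0, 0.0) for x in range(21)]
database_idx = select_by_distance(trajectory, 5)
query_idx = select_by_distance(trajectory, 10)

cloud = [(0.05, 0.05, 0.05), (0.15, 0.15, 0.15), (0.25, 0.0, 0.0), (150.0, 0.0, 0.0)]
reduced = crop_and_voxelize(cloud, crop_size=100, voxel_size=0.2)
```

Under these assumptions, saved clouds are at least distanceThreshold apart rather than exactly that far apart, because the spacing depends on where the available poses fall along the trajectory.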

We are committed to supporting research and development in the field and would likely be more than willing to share specific configurations and additional resources to facilitate your work.

I hope these answers provide the clarity you need to effectively utilize the HeLiPR dataset for your research. Should you have any more questions or need further assistance, feel free to reach out.

Best regards, Minwoo

jac99 commented 7 months ago

Hi Minwoo, thank you very much for a quick and detailed answer. Everything is clear. I really appreciate your help.

minwoo0611 / HeLiPR-Pointcloud-Toolbox
