Closed acardaras-sanborn closed 1 year ago
Hi @acardaras-sanborn
Thanks for using this project!
To be more specific, if you have `edge_index` of shape `(2, 0)` at this point in the preprocessing, it means that, in one of your partition levels, all superpoints are isolated. Off the top of my head, this could mean two things:

- you need to increase `graph_gap`, as you suggested (be sure to adjust the values for each partition level, higher levels need larger gaps)
- one of your partition levels is too coarse and contains a single superpoint, which therefore has no neighbor to connect to

I am thinking you might have encountered the second issue. To validate this hypothesis, you should check the `num_nodes` attribute of your `NAG` objects after `CutPursuitPartition`.
If you do encounter this issue, it means you should adjust the partition parameters. It is very likely you will need to do so if your dataset has a lower resolution than DALES, especially for partition levels 2 and higher. To adjust your partition, you can play with the following:
```yaml
voxel: 0.1
knn: 25
knn_r: 10  # may try increasing if your resolution is much lower than DALES
pcp_regularization: [0.1, 0.2, 0.3]  # decrease for higher sensitivity to changes in local features (ie more superpoints)
pcp_spatial_weight: [1e-1, 1e-2, 1e-3]  # increase for higher sensitivity to position (ie more superpoints)
pcp_cutoff: [10, 30, 100]  # try reducing as a last resort if all of the above do not help
```
You will need to find adequate settings suiting your dataset's resolution and classes of interest (the size of your smallest objects of interest matters when parameterizing the partition, think Nyquist-Shannon theorem). One way of measuring whether a partition is "good enough" is to track the ratio of superpoints vs points (you want to reduce scene complexity to save compute and memory) and to measure the oracle mIoU you would get by assigning each superpoint to its dominant label (you want your superpoints to be semantically pure; as a rough rule of thumb, an mIoU of 90% is ok-ish for noisy datasets, while 95% and higher is safe).
Please let me know if that helped and if I can close this issue :wink:
PS: if you are interested in this project, don't forget to give our project a :star:, it matters to us!
Thank you for the suggestions.
As per your second suggestion, I've tried printing the `nag.num_nodes` attribute at the end of the `_process` function in `CutPursuitPartition`, but it appears to be missing:

```python
print(nag.num_nodes)
# AttributeError: 'NAG' object has no attribute 'num_nodes'
```
Instead, I printed the `nag` object itself, and it contains a `num_points` attribute with 4 levels filled with points:

```python
print(nag)
# NAG(num_levels=4, num_points=[929760, 928150, 227981, 25694], device=cpu)
```
I am double-checking my dataset to see if there were any errors during the `.las` to `.ply` conversion. I will follow up within 1-2 days with more detailed results regarding changes to the `graph_gap` parameter and the other parameters you've listed.
Hi, yes sorry, I meant `num_points`, but anyway, printing `nag` was another way of checking what we wanted.
As you can see, you have almost as many points in $P_0$ (929760) as you have superpoints in your first level of partition $P_1$ (928150). This is bad news, you want the partition to simplify the scene. See Table 1 and Appendix A8 in our paper for suggestions on how to parameterize your partition.
However, I feared your higher-level $P_2$ might be too coarse, with only one superpoint, but that does not seem to be the case.
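To make this check easy to repeat, here is a minimal sketch (plain Python, using the point counts from the `NAG` printout above) computing the reduction ratio between consecutive partition levels; the variable names are illustrative, not part of the library:

```python
# Reduction ratio between consecutive partition levels.
# num_points is taken from the NAG printout in this thread; a good
# partition should shrink the scene substantially at each level.
num_points = [929760, 928150, 227981, 25694]

# Ratio of elements at level i to elements at level i-1: the smaller,
# the more the partition simplifies the scene.
ratios = [n_sup / n for n, n_sup in zip(num_points, num_points[1:])]
for level, r in enumerate(ratios, start=1):
    print(f"P{level} / P{level - 1} = {r:.3f}")
```

Here $P_1 / P_0 \approx 0.998$, confirming the first level barely simplifies the scene, while $P_2 / P_1 \approx 0.246$ and $P_3 / P_2 \approx 0.113$ look much healthier.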
Once you have (roughly) parameterized your partition to ensure it both simplifies the scene and respects the semantics, you can play with `graph_gap` for constructing the superpoint graph. As a rule of thumb, aim for `graph_gap` values that ensure each superpoint has (roughly) 30 neighbors. Again, this is a rough suggestion; you can try training with a few different settings and pick what works best for your data.
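To estimate whether a given `graph_gap` lands in that ballpark, you can count neighbors per superpoint from the graph's `edge_index`. A minimal sketch, assuming `edge_index` is a 2 x num_edges structure of [source, target] superpoint indices (shown here with plain Python lists and toy values rather than tensors):

```python
# Toy edge_index: 2 x num_edges list of [source, target] superpoint pairs.
# In practice this would come from the superpoint graph built by the library.
edge_index = [
    [0, 0, 1, 1, 2, 2, 3],  # source superpoints
    [1, 2, 0, 3, 0, 3, 1],  # target superpoints
]
num_nodes = 4

# Degree = number of outgoing edges (neighbors) per superpoint
degree = [0] * num_nodes
for src in edge_index[0]:
    degree[src] += 1

avg_degree = sum(degree) / num_nodes
print(f"average neighbors per superpoint: {avg_degree:.2f}")  # aim for ~30
```

With tensors, the same degree count is a one-liner with `torch.bincount(edge_index[0], minlength=num_nodes)`.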
Thank you for helping; I was able to resolve the error by fixing some dataset-specific problems as well as lowering `xy_tiling` from 3 to 1.
I'm not sure of the specifics of the `xy_tiling` parameter, but some `.ply` files were poorly tiled, causing the IndexError.
Glad you could solve your problem!
FYI:
- `datamodule.xy_tiling`: splits dataset tiles into `xy_tiling`^2 smaller tiles, based on a regular XY grid. Ideal for square-shaped tiles à la DALES. Note this will affect the number of training steps.
- `datamodule.pc_tiling`: splits dataset tiles into 2^`pc_tiling` smaller tiles, based on their principal component. Ideal for varying tile shapes à la S3DIS and KITTI-360. Note this will affect the number of training steps.

@drprojects I have a few follow-up questions regarding parameterization.
> you want your superpoints to be semantically pure, as a rough rule of thumb, an mIoU of 90% is ok-ish for noisy datasets, while 95% and higher is safe.
Does this rule of thumb refer to the first partition level?
Yes, this is for the first level. For the above levels, you still want to reduce the scene complexity and maintain decent oracle mIoU, but with more tolerance. See Table 1 and Appendix A8 in our paper for suggestions on how to parameterize your partition.
How should I go about measuring the mIoU during dataset creation?
See 4.3.d in our paper for the oracle prediction. We do not provide the implementation for this. You simply need to compute the mIoU of a "fake" oracle prediction where each point's prediction is the dominant label of its superpoint. This is quite easy to implement using:
```python
# Get the dominant class in each superpoint from its histogram of labels
superpoint_pred = nag[1].y.argmax(dim=1)

# Distribute the superpoint predictions to the points
point_pred = superpoint_pred[nag[0].super_index]

# The ground truth point labels you want to evaluate against
point_target = nag[0].y

# Compute mIoU on point_pred vs point_target
# I leave the rest to you ;)
```
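To make that last step concrete, here is a hedged sketch of the mIoU computation in plain Python (in practice you would vectorize this with torch, or use an off-the-shelf metric such as torchmetrics' Jaccard index); the toy `point_pred` / `point_target` values below are made up for illustration:

```python
# Toy per-point predictions and ground-truth labels (class indices)
point_pred   = [0, 0, 1, 1, 2, 2, 2]
point_target = [0, 1, 1, 1, 2, 2, 0]
num_classes = 3

# Per-class IoU = TP / (TP + FP + FN)
ious = []
for c in range(num_classes):
    tp = sum(p == c and t == c for p, t in zip(point_pred, point_target))
    fp = sum(p == c and t != c for p, t in zip(point_pred, point_target))
    fn = sum(p != c and t == c for p, t in zip(point_pred, point_target))
    if tp + fp + fn > 0:  # skip classes absent from both pred and target
        ious.append(tp / (tp + fp + fn))

# mIoU = mean of per-class IoUs
miou = sum(ious) / len(ious)
print(f"mIoU: {miou:.3f}")
```

On these toy labels the per-class IoUs are 1/3, 2/3 and 2/3, giving an mIoU of 5/9.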
Hello, I have a custom dataset of `.las` files that I've converted to `.ply` with x, y, z, intensity, and sem_class attributes. The dataset is similar to DALES3D but has lower average points per meter and covers a much larger area. The ground points are non-planar, so I've followed the suggestion in #32 to remove 'elevation' from `partition_hf` and `point_hf` in my config (I also removed the GroundElevation pre_transform).

When running `src/train.py` on my dataset/config, I get an IndexError (see stack trace below) inside the RadiusHorizontalGraph transform. This error is directly related to the `edge_index` variable in `src/utils/neighbors.py`'s `cluster_radius_nn()`, which when printed out returns a tensor of size (2, 0). I believe this empty tensor suggests that there exist clusters without nearby neighbors.

I've attempted to solve this issue by increasing the config's `graph_gap` value, but did not succeed in getting past the IndexError. I also tried scaling the XYZ points by a constant factor (before running `src/train.py`) to increase the points per meter. The scaling was met with partial success, allowing me to train on a small subset of my data (`+datamodule.mini=True`), but some files still produced an IndexError.

Any suggestions on how I should proceed would be greatly appreciated. Perhaps I need to modify additional values inside my config.
Thanks for the amazing repo!