Closed acardaras-sanborn closed 1 year ago
Hi @acardaras-sanborn
Thanks for using this project!
To be more specific, if you have `edge_index` of shape `(2, 0)` at this point in the preprocessing, it means that, in one of your partition levels, all superpoints are isolated. Off the top of my head, this could mean two things:

- you need to increase `graph_gap`, as you suggested (be sure to adjust the values for each partition level, higher levels need larger gaps)
- one of your partition levels is too coarse and contains a single superpoint, which therefore has no neighbor to connect to

I am thinking you might have encountered the second issue. To validate this hypothesis, you should check the `num_nodes` attribute of your `NAG` objects after `CutPursuitPartition`.
If you do encounter this issue, it means you should adjust the partition parameters. It is very likely you will need to do so if your dataset has a lower resolution than DALES, especially for partition levels 2 and higher. To adjust your partition, you can play with the following:
```yaml
voxel: 0.1
knn: 25
knn_r: 10  # may try increasing if your resolution is much lower than DALES
pcp_regularization: [0.1, 0.2, 0.3]  # decrease for higher sensitivity to changes in local features (ie more superpoints)
pcp_spatial_weight: [1e-1, 1e-2, 1e-3]  # increase for higher sensitivity to position (ie more superpoints)
pcp_cutoff: [10, 30, 100]  # try reducing as a last resort if all of the above do not help
```
You will need to find adequate settings suiting your dataset's resolution and classes of interest (the size of your smallest objects of interest matters when parameterizing the partition, think Nyquist-Shannon theorem). One way of measuring whether a partition is "good enough" is to track the ratio of superpoints vs points (you want to reduce scene complexity to save compute and memory) and to measure the oracle mIoU you would get by assigning each superpoint to its dominant label (you want your superpoints to be semantically pure; as a rough rule of thumb, an mIoU of 90% is ok-ish for noisy datasets, while 95% and higher is safe).
Please let me know if that helped and if I can close this issue :wink:
PS: if you are interested in this project, don't forget to give our project a :star:, it matters to us!
Thank you for the suggestions.
As per your second suggestion, I've tried printing the `nag.num_nodes` attribute at the end of the `_process` function in `CutPursuitPartition`, but it appears to be missing:

```python
print(nag.num_nodes)
# AttributeError: 'NAG' object has no attribute 'num_nodes'
```
Instead, I printed the `nag` object itself, and it contains a `num_points` attribute with 4 levels filled with points:

```python
print(nag)
# NAG(num_levels=4, num_points=[929760, 928150, 227981, 25694], device=cpu)
```
I am double-checking my dataset to see if there were any errors during the `.las` to `.ply` conversion. I will follow up within 1-2 days with more detailed results regarding changes to the `graph_gap` parameter and the other parameters you've listed.
Hi, yes sorry, I meant `num_points`, but anyway, printing `nag` was another way of checking what we wanted.
As you can see, you have almost as many points in $P_0$ (929760) as you have superpoints in your first level of partition $P_1$ (928150). This is bad news, you want the partition to simplify the scene. See Table 1 and Appendix A8 in our paper for suggestions on how to parameterize your partition.
However, I feared your higher-level $P_2$ might be too coarse, with only one superpoint, but that does not seem to be the case.
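To make this check easy to repeat, here is a minimal sketch (plain Python, using the point counts from the `NAG` printout above) computing the reduction ratio between consecutive partition levels; the variable names are illustrative, not part of the library:

```python
# Reduction ratio between consecutive partition levels.
# num_points is taken from the NAG printout in this thread; a good
# partition should shrink the scene substantially at each level.
num_points = [929760, 928150, 227981, 25694]

# Ratio of elements at level i to elements at level i-1: the smaller,
# the more the partition simplifies the scene.
ratios = [n_sup / n for n, n_sup in zip(num_points, num_points[1:])]
for level, r in enumerate(ratios, start=1):
    print(f"P{level} / P{level - 1} = {r:.3f}")
```

Here $P_1 / P_0 \approx 0.998$, confirming the first level barely simplifies the scene, while $P_2 / P_1 \approx 0.246$ and $P_3 / P_2 \approx 0.113$ look much healthier.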
Once you have (roughly) parameterized your partition to ensure it both simplifies the scene and respects the semantics, you can play with `graph_gap` for constructing the superpoint graph. As a rule of thumb, aim for `graph_gap` values that ensure each superpoint has (roughly) 30 neighbors. Again, this is a rough suggestion; you can try training with a few different settings and pick what works best for your data.
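To estimate whether a given `graph_gap` lands in that ballpark, you can count neighbors per superpoint from the graph's `edge_index`. A minimal sketch, assuming `edge_index` is a 2 x num_edges structure of [source, target] superpoint indices (shown here with plain Python lists and toy values rather than tensors):

```python
# Toy edge_index: 2 x num_edges list of [source, target] superpoint pairs.
# In practice this would come from the superpoint graph built by the library.
edge_index = [
    [0, 0, 1, 1, 2, 2, 3],  # source superpoints
    [1, 2, 0, 3, 0, 3, 1],  # target superpoints
]
num_nodes = 4

# Degree = number of outgoing edges (neighbors) per superpoint
degree = [0] * num_nodes
for src in edge_index[0]:
    degree[src] += 1

avg_degree = sum(degree) / num_nodes
print(f"average neighbors per superpoint: {avg_degree:.2f}")  # aim for ~30
```

With tensors, the same degree count is a one-liner with `torch.bincount(edge_index[0], minlength=num_nodes)`.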
Thank you for helping; I was able to resolve the error by fixing some dataset-specific problems as well as lowering `xy_tiling` from 3 to 1.
I'm not sure of the specifics of the `xy_tiling` parameter, but some `.ply` files were poorly tiled, causing the IndexError.
Glad you could solve your problem!
FYI:
- `datamodule.xy_tiling`: splits dataset tiles into `xy_tiling`^2 smaller tiles, based on a regular XY grid. Ideal for square-shaped tiles à la DALES. Note this will affect the number of training steps.
- `datamodule.pc_tiling`: splits dataset tiles into 2^`pc_tiling` smaller tiles, based on their principal component. Ideal for varying tile shapes à la S3DIS and KITTI-360. Note this will affect the number of training steps.

@drprojects I have a few follow-up questions regarding parameterization.
> you want your superpoints to be semantically pure, as a rough rule of thumb, an mIoU of 90% is ok-ish for noisy datasets, while 95% and higher is safe.
Does this rule of thumb refer to the first partition level?
Yes, this is for the first level. For the above levels, you still want to reduce the scene complexity and maintain decent oracle mIoU, but with more tolerance. See Table 1 and Appendix A8 in our paper for suggestions on how to parameterize your partition.
How should I go about measuring the mIoU during dataset creation?
See 4.3.d in our paper for the oracle prediction. We do not provide the implementation for this. You simply need to compute the mIoU of a "fake" oracle prediction where each point's prediction is the dominant label of its superpoint. This is quite easy to implement using:
```python
# Get the dominant class in each superpoint from its histogram of labels
superpoint_pred = nag[1].y.argmax(dim=1)

# Distribute the superpoint predictions to the points
point_pred = superpoint_pred[nag[0].super_index]

# The ground truth point labels you want to evaluate against
point_target = nag[0].y

# Compute mIoU on point_pred vs point_target
# I leave the rest to you ;)
```
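To make that last step concrete, here is a hedged sketch of the mIoU computation in plain Python (in practice you would vectorize this with torch, or use an off-the-shelf metric such as torchmetrics' Jaccard index); the toy `point_pred` / `point_target` values below are made up for illustration:

```python
# Toy per-point predictions and ground-truth labels (class indices)
point_pred   = [0, 0, 1, 1, 2, 2, 2]
point_target = [0, 1, 1, 1, 2, 2, 0]
num_classes = 3

# Per-class IoU = TP / (TP + FP + FN)
ious = []
for c in range(num_classes):
    tp = sum(p == c and t == c for p, t in zip(point_pred, point_target))
    fp = sum(p == c and t != c for p, t in zip(point_pred, point_target))
    fn = sum(p != c and t == c for p, t in zip(point_pred, point_target))
    if tp + fp + fn > 0:  # skip classes absent from both pred and target
        ious.append(tp / (tp + fp + fn))

# mIoU = mean of per-class IoUs
miou = sum(ious) / len(ious)
print(f"mIoU: {miou:.3f}")
```

On these toy labels the per-class IoUs are 1/3, 2/3 and 2/3, giving an mIoU of 5/9.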
Hello, I have a custom dataset of `.las` files that I've converted to `.ply` with x, y, z, intensity, and sem_class attributes. The dataset is similar to DALES3D but has lower average points per meter and covers a much larger area. The ground points are non-planar, so I've followed the suggestion in #32 to remove 'elevation' from `partition_hf` and `point_hf` in my config (I also removed the GroundElevation pre_transform).

When running `src/train.py` on my dataset/config, I get an IndexError (see stack trace below) inside the RadiusHorizontalGraph transform. This error is directly related to the `edge_index` variable in `src/utils/neighbors.py`'s `cluster_radius_nn()`, which when printed out returns a tensor of size (2, 0). I believe this empty tensor suggests that there exist clusters without nearby neighbors.

I've attempted to solve this issue by increasing the config's `graph_gap` value, but did not succeed in getting past the IndexError. I also tried scaling the XYZ points by a constant factor (before running `src/train.py`) to increase the points per meter. The scaling was met with partial success, allowing me to train on a small subset of my data (`+datamodule.mini=True`), but some files still produced an IndexError.

Any suggestions on how I should proceed would be greatly appreciated. Perhaps I need to modify additional values inside my config.
Thanks for the amazing repo!