drprojects / superpoint_transformer

Official PyTorch implementation of Superpoint Transformer introduced in [ICCV'23] "Efficient 3D Semantic Segmentation with Superpoint Transformer" and SuperCluster introduced in [3DV'24 Oral] "Scalable 3D Panoptic Segmentation As Superpoint Graph Clustering"

Empty list in scatter_nearest_neighbor #115

Closed billy-antoine closed 3 weeks ago

billy-antoine commented 1 month ago

Hello,

First of all, thank you for your amazing work. I am trying to train the model on my own data. I am running on WSL Ubuntu 22.04.4, and the installation went well.

I have sorted my point clouds into several folders (train, val, and test) and created a datamodule.yaml, a config.py, and a dataset.py based on the DALES templates. My point clouds are in .laz format, so I only had to make a small change to the read_tile function.
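The change was roughly along these lines (a sketch only: the function name is hypothetical, the attribute names depend on your .laz files, and laspy needs the lazrs or laszip backend to decompress .laz):

import laspy
import numpy as np
import torch

from src.data import Data

def read_laz_tile(filepath):
    # Sketch of a .laz reader, adapted from the DALES template's reader
    las = laspy.read(filepath)  # needs the lazrs or laszip backend for .laz
    pos = torch.from_numpy(np.stack([las.x, las.y, las.z], axis=-1)).float()
    intensity = torch.from_numpy(np.asarray(las.intensity, dtype=np.float32))
    y = torch.from_numpy(np.asarray(las.classification, dtype=np.int64))
    return Data(pos=pos, intensity=intensity, y=y)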

I hit an error in the scatter_nearest_neighbor function that I can't fix. It seems that no neighbor is found for a given point:

File "/home/abilly/dev/superpoint_transformer/src/utils/scatter.py", line 138, in scatter_nearest_neighbor candidate = torch.cat([elt[0] for elt in out_list], dim=0) RuntimeError: torch.cat(): expected a non-empty list of Tensors


I cleaned my point clouds to remove isolated points, but this didn't seem to change anything. I saw that https://github.com/drprojects/superpoint_transformer/issues/36 suggests modifying parameters in the .yaml, but I'm not sure which ones I can play with.

Do you think you could help me? Don't hesitate to ask if you need more info.

drprojects commented 1 month ago

Hi @billy-antoine, thanks for your interest in our project.

A bit of context: scatter_nearest_neighbor does not compute distances between points, but between superpoints. From what I see, the error comes from the fact that your edge_index is empty.

I am guessing this error occurs at preprocessing time, when calling RadiusHorizontalGraph? If so, have you tried increasing graph_gap as described in #36?
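For reference, graph_gap lives in your datamodule .yaml and parameterizes RadiusHorizontalGraph at preprocessing time. A hypothetical excerpt (the key layout follows the DALES template; the values are purely illustrative and should be tuned to your point density):

# in configs/datamodule/semantic/<your_dataset>.yaml
graph_gap: [10, 50, 50]  # one superpoint-connection radius per partition level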

PS: If you ❤️ or use this project, don't forget to give it a ⭐, it means a lot to us!

billy-antoine commented 1 month ago

Hi @drprojects, thank you for your quick answer!

You are correct, the error occurs during preprocessing. Here is the full log, in case it helps:

Error executing job with overrides: ['experiment=semantic/viamapa']
Traceback (most recent call last):
  File "/home/abilly/dev/superpoint_transformer/src/train.py", line 139, in main
    metric_dict, _ = train(cfg)
  File "/home/abilly/dev/superpoint_transformer/src/utils/utils.py", line 48, in wrap
    raise ex
  File "/home/abilly/dev/superpoint_transformer/src/utils/utils.py", line 45, in wrap
    metric_dict, object_dict = task_func(cfg=cfg)
  File "/home/abilly/dev/superpoint_transformer/src/train.py", line 114, in train
    trainer.fit(model=model, datamodule=datamodule, ckpt_path=cfg.get("ckpt_path"))
  File "/home/abilly/miniconda3/envs/spt/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 544, in fit
    call._call_and_handle_interrupt(
  File "/home/abilly/miniconda3/envs/spt/lib/python3.8/site-packages/pytorch_lightning/trainer/call.py", line 44, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/home/abilly/miniconda3/envs/spt/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 580, in _fit_impl
    self._run(model, ckpt_path=ckpt_path)
  File "/home/abilly/miniconda3/envs/spt/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 947, in _run
    self._data_connector.prepare_data()
  File "/home/abilly/miniconda3/envs/spt/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/data_connector.py", line 94, in prepare_data
    call._call_lightning_datamodule_hook(trainer, "prepare_data")
  File "/home/abilly/miniconda3/envs/spt/lib/python3.8/site-packages/pytorch_lightning/trainer/call.py", line 179, in _call_lightning_datamodule_hook
    return fn(*args, **kwargs)
  File "/home/abilly/dev/superpoint_transformer/src/datamodules/base.py", line 144, in prepare_data
    self.dataset_class(
  File "/home/abilly/dev/superpoint_transformer/src/datasets/base.py", line 223, in __init__
    super().__init__(root, transform, pre_transform, pre_filter)
  File "/home/abilly/miniconda3/envs/spt/lib/python3.8/site-packages/torch_geometric/data/in_memory_dataset.py", line 57, in __init__
    super().__init__(root, transform, pre_transform, pre_filter, log)
  File "/home/abilly/miniconda3/envs/spt/lib/python3.8/site-packages/torch_geometric/data/dataset.py", line 97, in __init__
    self._process()
  File "/home/abilly/dev/superpoint_transformer/src/datasets/base.py", line 647, in _process
    self.process()
  File "/home/abilly/dev/superpoint_transformer/src/datasets/base.py", line 682, in process
    self._process_single_cloud(p)
  File "/home/abilly/dev/superpoint_transformer/src/datasets/base.py", line 710, in _process_single_cloud
    nag = self.pre_transform(data)
  File "/home/abilly/miniconda3/envs/spt/lib/python3.8/site-packages/torch_geometric/transforms/compose.py", line 24, in __call__
    data = transform(data)
  File "/home/abilly/dev/superpoint_transformer/src/transforms/transforms.py", line 23, in __call__
    return self._process(x)
  File "/home/abilly/dev/superpoint_transformer/src/transforms/graph.py", line 656, in _process
    nag = _horizontal_graph_by_radius(
  File "/home/abilly/dev/superpoint_transformer/src/transforms/graph.py", line 760, in _horizontal_graph_by_radius
    nag = _horizontal_graph_by_radius_for_single_level(
  File "/home/abilly/dev/superpoint_transformer/src/transforms/graph.py", line 814, in _horizontal_graph_by_radius_for_single_level
    edge_index, distances = cluster_radius_nn_graph(
  File "/home/abilly/dev/superpoint_transformer/src/utils/neighbors.py", line 488, in cluster_radius_nn_graph
    anchors = scatter_nearest_neighbor(
  File "/home/abilly/dev/superpoint_transformer/src/utils/scatter.py", line 138, in scatter_nearest_neighbor
    candidate = torch.cat([elt[0] for elt in out_list], dim=0)
RuntimeError: torch.cat(): expected a non-empty list of Tensors

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.

I did try playing with graph_gap, but I'm not sure how much I can increase the values. I went up to 10 times the original values, but the result is still the same. My clouds are quite sparse, as they are acquired from satellites:

[screenshot of the sparse point cloud]

I gave my best star, thanks for the reminder ⭐

drprojects commented 1 month ago

Maybe your point density is much lower than for DALES and your hierarchical superpoint partitions are not so good.

Have you tried visualizing your superpoint partitions?

As a rule of thumb, you should parameterize the partition so that $P_1$ contains ~30-50× fewer elements than $P_0$. Then, for $P_2$, you want another reduction factor of ~3-10. There is no official golden rule for this, but you may want to aim for these ranges and then explore the impact of other parameterizations on your specific dataset.

Also, you want to be wary of clouds with only 1 superpoint in $P_1$ or $P_2$.

From there, you should investigate further which cloud causes the error you are encountering.
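For instance, once preprocessing succeeds on a cloud, a quick sanity check could look like this (a sketch, assuming nag holds the corresponding NAG object, e.g. a dataset item):

# Print the number of elements per partition level and the reduction
# factor between consecutive levels
sizes = [nag[i].num_points for i in range(nag.num_levels)]
print("level sizes:", sizes)
print("reduction factors:", [sizes[i] / sizes[i + 1] for i in range(len(sizes) - 1)])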

billy-antoine commented 1 month ago

Hi again,

Removing RadiusHorizontalGraph allowed me to get further into the process, but it still crashes before preprocessing finishes. The error now occurs while calling grid_cluster during the voxelization performed by GridSampling3D.

No matter which point cloud I use, there is always a point during the voxelization where my data seems to be empty:

[screenshot showing the empty data during voxelization]

This generates the following error:

Processing...
 11%|█         | 2/18 [00:47<06:16, 23.51s/it]
[2024-06-05 16:11:46,262][src.utils.utils][ERROR] - 
Traceback (most recent call last):
  File "/home/abilly/dev/superpoint_transformer/src/utils/utils.py", line 45, in wrap
    metric_dict, object_dict = task_func(cfg=cfg)
  File "/home/abilly/dev/superpoint_transformer/src/train.py", line 114, in train
    trainer.fit(model=model, datamodule=datamodule, ckpt_path=cfg.get("ckpt_path"))
  File "/home/abilly/miniconda3/envs/spt/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 544, in fit
    call._call_and_handle_interrupt(
  File "/home/abilly/miniconda3/envs/spt/lib/python3.8/site-packages/pytorch_lightning/trainer/call.py", line 44, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/home/abilly/miniconda3/envs/spt/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 580, in _fit_impl
    self._run(model, ckpt_path=ckpt_path)
  File "/home/abilly/miniconda3/envs/spt/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 947, in _run
    self._data_connector.prepare_data()
  File "/home/abilly/miniconda3/envs/spt/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/data_connector.py", line 94, in prepare_data
    call._call_lightning_datamodule_hook(trainer, "prepare_data")
  File "/home/abilly/miniconda3/envs/spt/lib/python3.8/site-packages/pytorch_lightning/trainer/call.py", line 179, in _call_lightning_datamodule_hook
    return fn(*args, **kwargs)
  File "/home/abilly/dev/superpoint_transformer/src/datamodules/base.py", line 144, in prepare_data
    self.dataset_class(
  File "/home/abilly/dev/superpoint_transformer/src/datasets/base.py", line 223, in __init__
    super().__init__(root, transform, pre_transform, pre_filter)
  File "/home/abilly/miniconda3/envs/spt/lib/python3.8/site-packages/torch_geometric/data/in_memory_dataset.py", line 57, in __init__
    super().__init__(root, transform, pre_transform, pre_filter, log)
  File "/home/abilly/miniconda3/envs/spt/lib/python3.8/site-packages/torch_geometric/data/dataset.py", line 97, in __init__
    self._process()
  File "/home/abilly/dev/superpoint_transformer/src/datasets/base.py", line 647, in _process
    self.process()
  File "/home/abilly/dev/superpoint_transformer/src/datasets/base.py", line 682, in process
    self._process_single_cloud(p)
  File "/home/abilly/dev/superpoint_transformer/src/datasets/base.py", line 710, in _process_single_cloud
    nag = self.pre_transform(data)
  File "/home/abilly/miniconda3/envs/spt/lib/python3.8/site-packages/torch_geometric/transforms/compose.py", line 24, in __call__
    data = transform(data)
  File "/home/abilly/dev/superpoint_transformer/src/transforms/transforms.py", line 23, in __call__
    return self._process(x)
  File "/home/abilly/dev/superpoint_transformer/src/transforms/sampling.py", line 151, in _process
    cluster = grid_cluster(coords, torch.ones(3, device=coords.device))
  File "/home/abilly/miniconda3/envs/spt/lib/python3.8/site-packages/torch_cluster/grid.py", line 34, in grid_cluster
    return torch.ops.torch_cluster.grid(pos, size, start, end)
  File "/home/abilly/miniconda3/envs/spt/lib/python3.8/site-packages/torch/_ops.py", line 755, in __call__
    return self._op(*args, **(kwargs or {}))
RuntimeError: cannot reshape tensor of 0 elements into shape [0, -1] because the unspecified dimension size -1 can be any value and is ambiguous

I have seen the visualization tool: it works fine on the demo data, but I hit the same error with my cloud during preprocessing, which prevents me from generating the NAG and looking at my superpoint partitions...

Thank you again for your time, I hope I will be able to make it work without bothering you too much!

drprojects commented 1 month ago

This means that your point cloud is empty; this is not a GridSampling3D error. You need to thoroughly verify that all your point clouds contain points. In particular, check the output of read_single_raw_cloud() for all your datasets.
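Something along these lines should do (a sketch; the iteration over raw files is an assumption, adapt it to your dataset class):

import os.path as osp

# dataset is an instance of your custom dataset class
for name in dataset.raw_file_names:
    data = dataset.read_single_raw_cloud(osp.join(dataset.raw_dir, name))
    print(name, data.num_points)
    assert data.num_points > 0, f"{name} produced an empty cloud"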

billy-antoine commented 1 month ago

Hi again,

Thank you once more for your time. None of my clouds seems to be empty, and my dataset is small enough to check them all easily. read_single_raw_cloud() seems to work fine, as it returns the following before crashing:

Data(pos=[2660907, 3], pos_offset=[3], intensity=[2660907], y=[2660907], obj=InstanceData(num_clusters=2660907, num_overlaps=2660907, num_obj=1, device=cpu))

I tried multiple clouds, and even a whole new dataset with a more conventional point distribution, but the error remains the same. Here is an example of a cloud that triggers the error:

[screenshot of an example cloud]

drprojects commented 1 month ago

What is your voxel size? Can you try isolating the problematic cloud's Data object (i.e. Data.save() it somewhere to disk), then Data.load() it in a script/notebook and simply run GridSampling3D on it?
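Something like this (a sketch; the path and voxel size are placeholders):

from src.data import Data
from src.transforms import GridSampling3D

data = Data.load("problematic_cloud.h5")  # previously saved with data.save(...)
print(data)                               # pos should not be empty
print(GridSampling3D(size=0.1)(data))     # use your actual voxel size here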

noisyneighbour commented 1 month ago

I ran into the same error as you, and it turned out to be related to the data. Try printing the shapes of your nag attributes to inspect how many superpoints there are at various points of the transform chain, especially in the 3rd hierarchy level.
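One low-effort way of doing this is to splice a small pass-through transform into the chain, at the stages where it carries a NAG (a rough sketch, not an official API):

class PrintNAGSizes:
    # Pass-through debug transform: print node counts per partition level
    def __call__(self, nag):
        print([nag[i].num_points for i in range(nag.num_levels)])
        return nag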

drprojects commented 3 weeks ago

@billy-antoine have you investigated your point cloud and partition sizes and solved your issue? May I close it?

drprojects commented 3 weeks ago

I am considering this issue solved. @billy-antoine feel free to reopen it if your problem persists after investigation of your data.