drprojects / superpoint_transformer

Official PyTorch implementation of Superpoint Transformer introduced in [ICCV'23] "Efficient 3D Semantic Segmentation with Superpoint Transformer" and SuperCluster introduced in [3DV'24 Oral] "Scalable 3D Panoptic Segmentation As Superpoint Graph Clustering"

Got error "Indices must be dense" using my own dataset #15

Closed tkm-n closed 1 year ago

tkm-n commented 1 year ago

Hi, thank you for sharing the code of your promising paper! I want to try the code with my own dataset, but I got an error during preprocessing. When does this AssertionError happen? I made a custom loader for my dataset based on the S3DIS loader, but either the loader or my dataset itself seems to be causing the error. Preprocessing appears to load the data properly for a while, but then stops partway through. The traceback of the error is below. Could you tell me what the cause of this error might be?

Traceback (most recent call last):
  File "src/train.py", line 139, in main
    metric_dict, _ = train(cfg)
  File "/home/ubuntu/superpoint_transformer_custom/src/utils/utils.py", line 48, in wrap
    raise ex
  File "/home/ubuntu/superpoint_transformer_custom/src/utils/utils.py", line 45, in wrap
    metric_dict, object_dict = task_func(cfg=cfg)
  File "src/train.py", line 114, in train
    trainer.fit(model=model, datamodule=datamodule, ckpt_path=cfg.get("ckpt_path"))
  File "/home/ubuntu/miniconda3/envs/spt/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 529, in fit
    call._call_and_handle_interrupt(
  File "/home/ubuntu/miniconda3/envs/spt/lib/python3.8/site-packages/pytorch_lightning/trainer/call.py", line 42, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/home/ubuntu/miniconda3/envs/spt/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 568, in _fit_impl
    self._run(model, ckpt_path=ckpt_path)
  File "/home/ubuntu/miniconda3/envs/spt/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 925, in _run
    self._data_connector.prepare_data()
  File "/home/ubuntu/miniconda3/envs/spt/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/data_connector.py", line 94, in prepare_data
    call._call_lightning_datamodule_hook(trainer, "prepare_data")
  File "/home/ubuntu/miniconda3/envs/spt/lib/python3.8/site-packages/pytorch_lightning/trainer/call.py", line 164, in _call_lightning_datamodule_hook
    return fn(*args, **kwargs)
  File "/home/ubuntu/superpoint_transformer_custom/src/datamodules/base.py", line 144, in prepare_data
    self.dataset_class(
  File "/home/ubuntu/superpoint_transformer_custom/src/datasets/custom.py", line 240, in __init__
    super().__init__(*args, val_mixed_in_train=True, **kwargs)
  File "/home/ubuntu/superpoint_transformer_custom/src/datasets/base.py", line 193, in __init__
    super().__init__(root, transform, pre_transform, pre_filter)
  File "/home/ubuntu/miniconda3/envs/spt/lib/python3.8/site-packages/torch_geometric/data/in_memory_dataset.py", line 57, in __init__
    super().__init__(root, transform, pre_transform, pre_filter, log)
  File "/home/ubuntu/miniconda3/envs/spt/lib/python3.8/site-packages/torch_geometric/data/dataset.py", line 97, in __init__
    self._process()
  File "/home/ubuntu/superpoint_transformer_custom/src/datasets/base.py", line 493, in _process
    self.process()
  File "/home/ubuntu/superpoint_transformer_custom/src/datasets/base.py", line 528, in process
    self._process_single_cloud(p)
  File "/home/ubuntu/superpoint_transformer_custom/src/datasets/base.py", line 576, in _process_single_cloud
    nag.save(
  File "/home/ubuntu/superpoint_transformer_custom/src/data/nag.py", line 236, in save
    data.save(
  File "/home/ubuntu/superpoint_transformer_custom/src/data/data.py", line 572, in save
    save_dense_to_csr(val, sg, fp_dtype=fp_dtype)
  File "/home/ubuntu/superpoint_transformer_custom/src/utils/io.py", line 111, in save_dense_to_csr
    pointers, columns, values = dense_to_csr(x)
  File "/home/ubuntu/superpoint_transformer_custom/src/utils/sparse.py", line 53, in dense_to_csr
    pointers = indices_to_pointers(index[0])[0]
  File "/home/ubuntu/superpoint_transformer_custom/src/utils/sparse.py", line 18, in indices_to_pointers
    assert is_dense(indices), "Indices must be dense"
AssertionError: Indices must be dense
drprojects commented 1 year ago

Hi @kotetsu-n, thanks for your interest in this project !

Quick answer

It is likely that some voxels in your Data object have no labels at all in their label histogram y. A way of testing it:

assert self.y.sum(dim=1).gt(0).all(), "Some points in the label histogram `self.y` do not have any labels."

Place this :point_up: here to check whether this hypothesis is correct.

Detailed answer

The error occurs when trying to save the labels of your preprocessed NAG. More precisely, it happens when saving your Data objects to disk (for details on the Data and NAG data structures used in this project, please have a look at the docs and the extensive docstrings and comments in the code of the relevant classes). Even more precisely, the Data objects hold the semantic annotations in their y attribute as a 2D histogram of shape (num_nodes, num_classes + 1), where num_nodes is the number of nodes in your Data object and num_classes is the number of classes in your dataset. These are histograms, and not simply 1D tensors holding a single label per node, because we want to keep track of all labels in your raw point cloud, even after voxelization and superpoint partitioning (this allows computing exact full-resolution metrics and losses without loading the entire raw point clouds).

Now, when saving these 2D histograms, we convert them to CSR format to save disk space and speed up I/O. The error you are encountering happens when converting a 2D histogram to CSR format.
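For reference, here is a minimal sketch of what CSR storage looks like, using PyTorch's built-in conversion rather than the project's own dense_to_csr utility: a 2D tensor is stored as row pointers, column indices, and values.

```python
import torch

# Minimal CSR illustration (PyTorch built-ins, not src/utils/sparse.py).
x = torch.tensor([[2., 0., 1.],
                  [0., 3., 0.]])
sp = x.to_sparse_csr()
print(sp.crow_indices())  # pointers: tensor([0, 2, 3])
print(sp.col_indices())   # columns:  tensor([0, 2, 1])
print(sp.values())        # values:   tensor([2., 1., 3.])
```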

One reason I can think of for things to go wrong at this point is that, somehow, one of your points/voxels/superpoints (depending on the partition level in your NAG) has no labels at all in its histogram.
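To see why an empty histogram row would trip the "Indices must be dense" assertion, here is a hypothetical repro, assuming the check requires the sparse row indices to cover every row index without gaps:

```python
import torch

# Hypothetical repro: row 1 of this label histogram is all zeros.
y = torch.tensor([[1, 2, 0],
                  [0, 0, 0],
                  [0, 4, 1]])
rows = y.to_sparse().indices()[0]  # tensor([0, 0, 2, 2])
# Row index 1 never appears, so the indices do not cover 0..max
# without gaps ("dense" indices), and the assertion fires during
# CSR conversion.
```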

I don't know how this could happen, but it definitely comes from your dataset reader. Can you think of a reason why a point would not have any labels ?

Have you followed the guidelines for creating your own dataset ?

An important rule (that I will make clearer in future releases) for creating your own dataset is that your points must have labels within $[0, C]$, where:

- labels in $[0, C - 1]$ designate your $C$ valid classes
- the label $C$ is reserved for void/ignored/unlabeled points

To this end, I recommend you make sure the output of your read_single_raw_cloud method never returns labels outside of the $[0, C]$ range. Besides, if some labels in $[0, C - 1]$ are not useful to you (i.e. absent from your dataset), I recommend you densely remap your labels to a smaller $[0, C_2 - 1]$ range (you can use torch_geometric.nn.pool.consecutive.consecutive_cluster for that, for instance), while making sure you only use the label $C_2$ for void/ignored/unlabeled points.
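For instance, a sketch with made-up labels (consecutive_cluster returns the remapped tensor along with a permutation we ignore here; void points would need to be handled separately and assigned the dedicated label afterwards):

```python
import torch
from torch_geometric.nn.pool.consecutive import consecutive_cluster

# Sketch: densely remapping sparse labels {0, 5, 9} to {0, 1, 2}.
# Labels below are made up for illustration.
y = torch.tensor([5, 0, 9, 9, 0, 5])
y_dense, _ = consecutive_cluster(y)   # tensor([1, 0, 2, 2, 0, 1])
num_classes = int(y_dense.max()) + 1  # C2 = 3
void_label = num_classes              # reserve C2 for void/unlabeled
```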

Best,

Damien

tkm-n commented 1 year ago

Hi @drprojects, thank you for your detailed answer! I really appreciate your help. As you suggested, I found that my data reader was wrong: it read the labels with the wrong dimensions. Now I was able to run training successfully and got a nice result! Thanks!

drprojects commented 1 year ago

Awesome ! I am glad to hear that you could deploy SPT on a new dataset !

By the way, if your dataset is open and you are willing to share your code, we would gladly accept a pull request to integrate it into the project :wink: