drprojects / superpoint_transformer

Official PyTorch implementation of Superpoint Transformer introduced in [ICCV'23] "Efficient 3D Semantic Segmentation with Superpoint Transformer" and SuperCluster introduced in [3DV'24 Oral] "Scalable 3D Panoptic Segmentation As Superpoint Graph Clustering"
MIT License

FRNN - RuntimeError: Unknown layout #99

Closed xiarobin closed 2 months ago

xiarobin commented 2 months ago

Hello, the following error occurred while I was training on the DALES dataset with the command python src/train.py experiment=semantic/dales_11g. I am sure I have placed the dataset in the location described in the dataset documentation. How can I solve it? My environment is CUDA 12.1 with an RTX 3090.

Error executing job with overrides: ['experiment=semantic/dales_11g']
Traceback (most recent call last):
  File "src/train.py", line 167, in main
    metric_dict, _ = train(cfg)
  File "/home/user/桌面/Python/superpoint_transformer-master/src/utils/utils.py", line 48, in wrap
    raise ex
  File "/home/user/桌面/Python/superpoint_transformer-master/src/utils/utils.py", line 45, in wrap
    metric_dict, object_dict = task_func(cfg=cfg)
  File "src/train.py", line 132, in train
    trainer.fit(model=model, datamodule=datamodule, ckpt_path=cfg.get("ckpt_path"))
  File "/home/user/anaconda3/envs/spt/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 544, in fit
    call._call_and_handle_interrupt(
  File "/home/user/anaconda3/envs/spt/lib/python3.8/site-packages/pytorch_lightning/trainer/call.py", line 44, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/home/user/anaconda3/envs/spt/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 580, in _fit_impl
    self._run(model, ckpt_path=ckpt_path)
  File "/home/user/anaconda3/envs/spt/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 947, in _run
    self._data_connector.prepare_data()
  File "/home/user/anaconda3/envs/spt/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/data_connector.py", line 94, in prepare_data
    call._call_lightning_datamodule_hook(trainer, "prepare_data")
  File "/home/user/anaconda3/envs/spt/lib/python3.8/site-packages/pytorch_lightning/trainer/call.py", line 179, in _call_lightning_datamodule_hook
    return fn(*args, **kwargs)
  File "/home/user/桌面/Python/superpoint_transformer-master/src/datamodules/base.py", line 144, in prepare_data
    self.dataset_class(
  File "/home/user/桌面/Python/superpoint_transformer-master/src/datasets/base.py", line 223, in __init__
    super().__init__(root, transform, pre_transform, pre_filter)
  File "/home/user/anaconda3/envs/spt/lib/python3.8/site-packages/torch_geometric/data/in_memory_dataset.py", line 57, in __init__
    super().__init__(root, transform, pre_transform, pre_filter, log)
  File "/home/user/anaconda3/envs/spt/lib/python3.8/site-packages/torch_geometric/data/dataset.py", line 97, in __init__
    self._process()
  File "/home/user/桌面/Python/superpoint_transformer-master/src/datasets/base.py", line 647, in _process
    self.process()
  File "/home/user/桌面/Python/superpoint_transformer-master/src/datasets/base.py", line 682, in process
    self._process_single_cloud(p)
  File "/home/user/桌面/Python/superpoint_transformer-master/src/datasets/base.py", line 710, in _process_single_cloud
    nag = self.pre_transform(data)
  File "/home/user/anaconda3/envs/spt/lib/python3.8/site-packages/torch_geometric/transforms/compose.py", line 24, in __call__
    data = transform(data)
  File "/home/user/桌面/Python/superpoint_transformer-master/src/transforms/transforms.py", line 23, in __call__
    return self._process(x)
  File "/home/user/桌面/Python/superpoint_transformer-master/src/transforms/neighbors.py", line 46, in _process
    neighbors, distances = knn_1(
  File "/home/user/桌面/Python/superpoint_transformer-master/src/utils/neighbors.py", line 53, in knn_1
    distances, neighbors, _, _ = frnn.frnn_grid_points(
  File "/home/user/桌面/Python/superpoint_transformer-master/src/dependencies/FRNN/frnn/frnn.py", line 331, in frnn_grid_points
    idxs, dists, sorted_points2, pc2_grid_off, sorted_points2_idxs, grid_params_cuda = _frnn_grid_points.apply(
  File "/home/user/anaconda3/envs/spt/lib/python3.8/site-packages/torch/autograd/function.py", line 553, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
  File "/home/user/桌面/Python/superpoint_transformer-master/src/dependencies/FRNN/frnn/frnn.py", line 174, in forward
    idxs, dists = _C.find_nbrs_cuda(sorted_points1, sorted_points2,
RuntimeError: Unknown layout

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.

drprojects commented 2 months ago

Hi @xiarobin, as you can see in the traceback, the error seems to come from FRNN. This is the library we use for fast neighbor search on GPU. Several users have reported issues with this dependency. Please make sure FRNN is properly installed. Also, look through past issues related to FRNN to see whether a solution is already there.

It is possible that the error trace is not returned in full. You can try setting HYDRA_FULL_ERROR=1 as suggested; this may give more informative feedback.
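As a minimal sketch of the suggestion above, the variable can be set for a single run from Python. The training command is the one from the original report; the helper names `debug_env` and `run_training` are hypothetical and not part of the project.

```python
import os
import subprocess
import sys

# Training command taken from the original report.
TRAIN_ARGS = ["src/train.py", "experiment=semantic/dales_11g"]


def debug_env(base_env=None):
    """Return a copy of the environment with Hydra's full traceback enabled."""
    env = dict(os.environ if base_env is None else base_env)
    # The variable name uses underscores throughout.
    env["HYDRA_FULL_ERROR"] = "1"
    return env


def run_training():
    """Launch training with the current interpreter (hypothetical helper)."""
    result = subprocess.run([sys.executable, *TRAIN_ARGS], env=debug_env())
    return result.returncode
```

Calling `run_training()` from the repository root should behave like the original command, with the only difference being the extra environment variable.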

drprojects commented 2 months ago

PS: if you ❤️ or simply use this project, don't forget to give it a ⭐, it means a lot to us!

xiarobin commented 2 months ago

Hi @drprojects, I have installed the FRNN library according to install.sh and set HYDRA_FULL_ERROR=1 as you suggested, but the above error still occurs.

[Screenshot: QQ20240419-152401]

drprojects commented 2 months ago

From your screenshot, we can't really tell whether the installation went through all the way.
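One way to check the installation more directly is to try importing FRNN and its compiled extension. This is only a sketch: the module names `frnn` and `frnn._C` and the function `find_nbrs_cuda` are inferred from the traceback above, and `check_frnn` is a hypothetical helper. A successful import shows the build completed, but it does not rule out a runtime error such as the one reported.

```python
import importlib


def check_frnn():
    """Try to import FRNN and its compiled extension; return (ok, message)."""
    try:
        # Names follow the traceback: frnn/frnn.py calls _C.find_nbrs_cuda.
        importlib.import_module("frnn")
        ext = importlib.import_module("frnn._C")
    except Exception as err:  # ImportError, or a loader error from the CUDA build
        return False, f"FRNN import failed: {err!r}"
    if not hasattr(ext, "find_nbrs_cuda"):
        return False, "frnn._C was imported but find_nbrs_cuda is missing"
    return True, "FRNN and its compiled extension were imported"


ok, message = check_frnn()
print(message)
```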

In any case, this is an FRNN-related issue, so you should investigate in this direction:

drprojects commented 2 months ago

After a 1-minute search of your error message on Google:

People seem to solve this by downgrading PyTorch to version 2.1.0. Can you please try this and let us know?
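After changing the version, it may help to confirm which torch build the environment actually picks up. The following is a minimal sketch using only the standard library; `PINNED_TORCH`, `base_version`, and `torch_matches_pin` are hypothetical names chosen for illustration.

```python
from importlib import metadata

PINNED_TORCH = "2.1.0"  # version suggested in this thread


def base_version(version):
    """Drop a local build suffix such as '+cu121' from a version string."""
    return version.split("+", 1)[0]


def torch_matches_pin(pinned=PINNED_TORCH):
    """Return True/False for the installed torch, or None if torch is absent."""
    try:
        installed = metadata.version("torch")
    except metadata.PackageNotFoundError:
        return None
    return base_version(installed) == pinned


print("torch matches pin:", torch_matches_pin())
```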

gvoysey commented 2 months ago

Pinning torch to 2.1.0 works for me on Ubuntu 22.04 and Arch.

drprojects commented 2 months ago

I could install and run with torch 2.2.0 without problems on my end. It seems this issue is machine-dependent and can be fixed by downgrading to torch 2.1.0. Closing this now.
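Since the problem appears to depend on the machine, a short report of the local setup can make it easier to compare configurations across comments. This is a sketch under the assumption that torch may or may not be importable; `environment_report` is a hypothetical helper, not part of the project.

```python
import platform


def environment_report():
    """Collect basic facts about the Python, torch, and CUDA setup."""
    report = {
        "python": platform.python_version(),
        "platform": platform.platform(),
    }
    try:
        import torch
    except ImportError:
        report["torch"] = None  # torch is not installed in this environment
        return report
    report["torch"] = torch.__version__
    report["torch_cuda"] = torch.version.cuda  # CUDA version torch was built with
    report["cuda_available"] = torch.cuda.is_available()
    if report["cuda_available"]:
        report["gpu"] = torch.cuda.get_device_name(0)
    return report


for key, value in environment_report().items():
    print(f"{key}: {value}")
```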

Avril-Dragon commented 4 days ago

Hello, I would like to ask about the specific configuration. I have tried many torch versions, including 2.1.0, 2.2.0, 2.2.2, and 2.3.0.