drprojects / superpoint_transformer

Official PyTorch implementation of Superpoint Transformer introduced in [ICCV'23] "Efficient 3D Semantic Segmentation with Superpoint Transformer" and SuperCluster introduced in [3DV'24 Oral] "Scalable 3D Panoptic Segmentation As Superpoint Graph Clustering"

How should I perform supercluster based on an example code? #140

Closed yulongheart closed 4 months ago

yulongheart commented 4 months ago

https://github.com/drprojects/superpoint_transformer/issues/136#issuecomment-2229942164 Hello, I have used this example program to predict and classify my point cloud very well, but it only has superpoint. How should I do panoramic classification based on this example? Thank you.

drprojects commented 4 months ago

Hi @yulongheart, I do not see anything from you in #136, please provide the "example program" here, in the dedicated issue. Please clarify "it only has superpoint". Finally, I assume you mean "panoptic segmentation" and not "panoramic classification" ?

If you ❤️ or use this project, don't forget to give it a ⭐, it means a lot to us !

yulongheart commented 4 months ago

Hello, sorry, my unclear description confused you. It is indeed panoptic segmentation. I ran the prediction using this example code. I would like to ask: how can I obtain the SuperCluster results?

import os
import sys

# Add the project's files to the python path
file_path = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))  # for .py script
# file_path = os.path.dirname(os.path.abspath(''))  # for .ipynb notebook
sys.path.append(file_path)

# Necessary for advanced config parsing with hydra and omegaconf
from omegaconf import OmegaConf
OmegaConf.register_new_resolver("eval", eval)

import hydra
from src.utils import init_config
import torch
from src.visualization import show
from src.datasets.kitti360 import CLASS_NAMES, CLASS_COLORS, read_kitti360_window
from src.datasets.kitti360 import KITTI360_NUM_CLASSES as NUM_CLASSES
from src.transforms import *

# Parse the configs using hydra
cfg = init_config(overrides=[
    "experiment=kitti360",
    "ckpt_path=./media/spt-2_kitti360.ckpt"
])

# Instantiate the datamodule
datamodule = hydra.utils.instantiate(cfg.datamodule)
print(f"Data transorms : {datamodule}")

path = "/media/shi/projects/superpoint_transformer/data/kitti360/raw/data_3d_semantics/2013_05_28_drive_0008_sync/static/0000002769_0000003002.ply"
data = read_kitti360_window(path)
print(f"number of points: {data.num_points}\n keys: {data.keys}")

# Apply pre-transforms
nag = datamodule.pre_transform(data)
# Simulate the dataset's I/O behavior, with only
# `point_load_keys` and `segment_load_keys` loaded from disk
from src.transforms import NAGRemoveKeys
nag = NAGRemoveKeys(level=0, keys=[k for k in nag[0].keys if k not in cfg.datamodule.point_load_keys])(nag)
nag = NAGRemoveKeys(level='1+', keys=[k for k in nag[1].keys if k not in cfg.datamodule.segment_load_keys])(nag)

# Move to device
nag = nag.cuda()

# Apply on-device transforms
nag = datamodule.on_device_test_transform(nag)

# Instantiate the model
model = hydra.utils.instantiate(cfg.model)

# Load pretrained weights from a checkpoint file
model = model._load_from_checkpoint(cfg.ckpt_path)
model = model.eval().cuda()

# Inference
logits = model(nag)

# If the model outputs multi-stage predictions, we take the first one, 
# corresponding to level-1 predictions 
if model.multi_stage_loss:
    logits = logits[0]

# Compute the level-0 (pointwise) predictions based on the predictions
# on level-1 superpoints
l1_preds = torch.argmax(logits, dim=1).detach()
l0_preds = l1_preds[nag[0].super_index]

print(f"number of pred: {l0_preds.shape[0]}")

# Save predictions for visualization in the level-0 Data attributes 
nag[0].pred = l0_preds
drprojects commented 4 months ago

The script you wrote seems to be applying Superpoint Transformer (SPT) pretrained on KITTI-360 to a new point cloud. SPT is for semantic segmentation. To do panoptic segmentation, you should use SuperCluster. Please see the README and our papers on arXiv for more on this.

The configs you want are in configs/experiment/panoptic rather than configs/experiment/semantic:

cfg = init_config(overrides=[f"experiment=panoptic/dales"])

As already stated in the README, the pretrained SuperCluster weights can be downloaded from the DOI link provided there.

For manipulating the PanopticSegmentationOutput returned by SuperCluster, and possibly converting it to voxel-level or full-resolution labels, refer to the demo.ipynb notebook.
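
For instance, adapting the script above, the panoptic pipeline could look roughly like this (a minimal sketch: the checkpoint path is a placeholder, and experiment=panoptic/kitti360 assumes a KITTI-360 config exists under configs/experiment/panoptic; adapt both to your setup):

# Same pipeline as the script above, but with a panoptic experiment config
# and a SuperCluster checkpoint (placeholder path)
cfg = init_config(overrides=[
    "experiment=panoptic/kitti360",
    "ckpt_path=./media/supercluster_kitti360.ckpt"
])

datamodule = hydra.utils.instantiate(cfg.datamodule)
model = hydra.utils.instantiate(cfg.model)
model = model._load_from_checkpoint(cfg.ckpt_path)
model = model.eval().cuda()

# ... same reading, pre-transforms and on-device transforms as above ...

# SuperCluster returns a PanopticSegmentationOutput rather than raw logits
with torch.no_grad():
    output = model.forward(nag)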

Finally, I notice you are using an actual cloud from KITTI-360 in your script. If you are simply trying to run the inference on KITTI-360 val, you could:

If you ❤️ or use this project, don't forget to give it a ⭐, it means a lot to us !

yulongheart commented 4 months ago

Hello, thank you for your outstanding contribution and response. This is the result I obtained using the supercluster_s3dis_fold5.ckpt prediction. I feel there are still many color blocks, rather than the effect shown in the example. I don't know whether my calling method is incorrect or whether this is the expected result.

(screenshot: 2024-07-20 17-48-03)

(screenshot: 2024-07-20 17-53-54)

import os
import sys
import time

# Add the project's files to the python path
file_path = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))  # for .py script
# file_path = os.path.dirname(os.path.abspath(''))  # for .ipynb notebook
sys.path.append(file_path)

# Necessary for advanced config parsing with hydra and omegaconf
from omegaconf import OmegaConf
OmegaConf.register_new_resolver("eval", eval)

import hydra
from src.utils import init_config
import torch
from src.visualization import show
from src.datasets.s3dis import CLASS_NAMES, CLASS_COLORS, read_s3dis_room
from src.datasets.s3dis import S3DIS_NUM_CLASSES as NUM_CLASSES
from src.transforms import *
import multiprocessing
import numpy as np

start_time = time.time()

# Parse the configs using hydra
cfg = init_config(overrides=[
    "experiment=panoptic/s3dis",
   "ckpt_path=/home/workspace2/SupPointTrans_ws/superpoint_transformer/media/s3dis/supercluster_s3dis_fold5.ckpt"
])

# Instantiate the datamodule
datamodule = hydra.utils.instantiate(cfg.datamodule)
print(f"Data transorms : {datamodule}")

path = "/home/workspace2/SupPointTrans_ws/superpoint_transformer/data"
data = read_s3dis_room(path)
print(f"number of points: {data.num_points}\n keys: {data.keys}")

# Apply pre-transforms
nag = datamodule.pre_transform(data)
# Simulate the dataset's I/O behavior, with only
# `point_load_keys` and `segment_load_keys` loaded from disk
from src.transforms import NAGRemoveKeys
nag = NAGRemoveKeys(level=0, keys=[k for k in nag[0].keys if k not in cfg.datamodule.point_load_keys])(nag)
nag = NAGRemoveKeys(level='1+', keys=[k for k in nag[1].keys if k not in cfg.datamodule.segment_load_keys])(nag)

# Move to device
nag = nag.cuda()

# Apply on-device transforms
nag = datamodule.on_device_test_transform(nag)

# Instantiate the model
model = hydra.utils.instantiate(cfg.model)

# Load pretrained weights from a checkpoint file
model = model._load_from_checkpoint(cfg.ckpt_path)
model = model.eval().cuda()

# Inference returns a task-specific output object carrying predictions
with torch.no_grad():
    output = model.forward(nag)
    show(nag, cmap=CLASS_COLORS, labels_to_names=CLASS_NAMES)

pred = output.semantic_pred
lab = pred.cpu().numpy()
pos = nag[0].pos[:].cpu().numpy()
super_index = nag[0].super_index.cpu().numpy()
labels = lab[super_index]
data = np.concatenate((pos,labels.reshape(-1,1)),axis=1)

np.savetxt("data/lidar_spc_s3dis.txt",data)
end_time = time.time()
print(f"Total time: {end_time - start_time:.2f} seconds")
drprojects commented 3 months ago

Your lines

pred = output.semantic_pred
lab = pred.cpu().numpy()
pos = nag[0].pos[:].cpu().numpy()
super_index = nag[0].super_index.cpu().numpy()
labels = lab[super_index]
data = np.concatenate((pos,labels.reshape(-1,1)),axis=1)

are for computing the predicted semantic segmentation for each voxel of $P_0$. Since you are doing panoptic segmentation, you should rather use something similar to what I put in the demo.ipynb to recover the voxel-wise panoptic predictions:

vox_y, vox_index, vox_obj_pred = output.voxel_panoptic_pred(super_index=nag[0].super_index)
nag[0].obj_pred = vox_obj_pred
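
From there, one way to export the voxel-wise results (a sketch, assuming vox_y and vox_obj_pred are per-voxel tensors aligned with nag[0], holding the semantic label and the predicted instance index respectively; the output path is arbitrary):

import numpy as np

pos = nag[0].pos.cpu().numpy()                    # voxel positions
sem = vox_y.cpu().numpy().reshape(-1, 1)          # per-voxel semantic label
ins = vox_obj_pred.cpu().numpy().reshape(-1, 1)   # per-voxel instance index
np.savetxt("data/panoptic_voxels.txt", np.concatenate((pos, sem, ins), axis=1))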

Now, your illustration suggests you have neither floor nor ceiling in your data. An S3DIS-pretrained model will probably not like this, because it has been trained to see these classes in every room. So I think you may want to train a model from scratch on your data if you have annotations. Another trick would be to manually add "fake" floor and ceiling points to your data. This should be relatively easy. Note that if you used a ScanNet-pretrained model, you would only need to add floor points, since ScanNet rooms do not have ceilings.
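
A rough sketch of that "fake" floor/ceiling trick at the raw point level, before building the Data object (the function, variable names, grid step and color below are illustrative choices, not part of the codebase):

import numpy as np

def add_fake_floor_ceiling(xyz, rgb, step=0.05):
    # Regular XY grid spanning the room footprint
    x = np.arange(xyz[:, 0].min(), xyz[:, 0].max(), step)
    y = np.arange(xyz[:, 1].min(), xyz[:, 1].max(), step)
    xx, yy = np.meshgrid(x, y)
    grid = np.stack((xx.ravel(), yy.ravel()), axis=1)
    # Flat planes at the lowest and highest Z of the cloud
    floor = np.column_stack((grid, np.full(len(grid), xyz[:, 2].min())))
    ceiling = np.column_stack((grid, np.full(len(grid), xyz[:, 2].max())))
    # Neutral gray for the synthetic points (assumes 0-255 RGB values)
    gray = np.full((2 * len(grid), 3), 128, dtype=rgb.dtype)
    return np.vstack((xyz, floor, ceiling)), np.vstack((rgb, gray))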

That being said, due to the large number of requests I receive, I only provide support for issues relative to code I wrote and released, and I cannot provide support for tuning Superpoint Transformer or SuperCluster on individual use cases.

yulongheart commented 3 months ago

Okay, sorry to bother you, and thank you very much. I will adjust the test myself. Finally, I would like to express my gratitude.

ImaneTopo commented 3 months ago

[Quotes yulongheart's earlier comment and KITTI-360 example script above.]

@yulongheart Hello, I would like to ask if you could provide me with your email address, because I have some questions; I'm stuck at a visualization step.

ImaneTopo commented 3 months ago

[Quotes yulongheart's later comment, screenshots, and S3DIS script above.]

@yulongheart Hello, I applied this visualization code on the DALES dataset, but the saved prediction file contains only XYZ columns and no semantic or instance columns, so I would like to know if this seems odd to you.