filaPro / oneformer3d

[CVPR2024] OneFormer3D: One Transformer for Unified Point Cloud Segmentation
348 stars · 32 forks

How to export point cloud with instance masks when inference #57

Closed tsrobcvai closed 5 months ago

tsrobcvai commented 5 months ago

Hi, thanks for your great work! The current repo only supports exporting metrics results during inference, but I would like to visualize the point cloud with predicted masks.

  1. At the end of the "predict" function here, pts_instance_mask is expected to have shape (num_points, num_instances) with dtype bool. Is there any guidance on using this function to get the point cloud with predicted masks?

  2. Besides, I debugged "pred_instance_masks" below during testing. I am trying to use the mask info to generate the point cloud with masks, but I don't know how to connect the prediction (a tensor of shape (num_instances, ?)) to the input 3D points (sorry, I am a rookie in the field of 3D point cloud segmentation). Any comments are very appreciated!

preds = aggregate_predictions(masks=pred_instance_masks, labels=pred_instance_labels, scores=pred_instance_scores, valid_class_ids=valid_class_ids)

Code Link

Debug results:


  3. Appreciate any ideas to get the point cloud with predicted masks
oneformer3d-contributor commented 5 months ago

These tensor sizes look a little bit strange to me. On which dataset and split are you running now?

tsrobcvai commented 5 months ago

Thanks for your kind reply! I ran the network on a custom synthetic dataset (in the S3DIS format) created with Blender. I just figured this out: the 2nd dimension of the shape (num_instances, num_points) is the number of input points. So for each mask we get a binary segmentation mask of shape (num_points,). The tensor does not contain the coordinates of each point. If I understand correctly, there is a way to get the position of each point in the S3DIS format; based on that, we can get the point cloud with masks. Could you please give me any advice?

I might know how to do it: the point order of the binary masks should be the same as that of the input point cloud.

oneformer3d-contributor commented 5 months ago

Yes, you can save the points at the beginning of the predict function and the predicted masks at its end. The orders should be the same.
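Since the mask columns and the saved points share the same order, a mask can be applied directly as a boolean index. A minimal sketch (assuming `points` is an (N, 6) xyz+rgb array saved at the start of `predict`, and `masks` an (M, N) binary array from its end; names here are illustrative, not from the repo):

```python
import numpy as np

def extract_instances(points, masks):
    """Apply each binary mask row to points; point order and mask columns match."""
    points = np.asarray(points, dtype=np.float32)   # (N, 6): xyzrgb
    masks = np.asarray(masks, dtype=bool)           # (M, N): one row per instance
    assert masks.shape[1] == points.shape[0], "mask length must equal point count"
    return [points[m] for m in masks]               # list of (n_i, 6) arrays

# toy example: 4 points, 2 instances
pts = np.arange(24, dtype=np.float32).reshape(4, 6)
msk = np.array([[1, 0, 1, 0], [0, 1, 0, 0]])
inst = extract_instances(pts, msk)  # inst[0] has 2 points, inst[1] has 1
```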

tsrobcvai commented 5 months ago

Hi, thanks for your help! I got the points with masks successfully. One more question: it seems there is no explicitly defined confidence threshold to filter out poor predictions at inference time, right? https://github.com/oneformer3d/oneformer3d/blob/d74fa0cc9cda2c967b26cf7df3b99e7aedbf2086/oneformer3d/evaluate_semantic_instance.py#L10

So, I set 0.5 as a threshold to get the final result. Here is a straightforward way to get the points with masks (comment out this line and use the code below during inference): https://github.com/oneformer3d/oneformer3d/blob/d74fa0cc9cda2c967b26cf7df3b99e7aedbf2086/oneformer3d/oneformer3d.py#L419 Hope it can help others.

import os
import numpy as np
import open3d as o3d
import torch

pred_pts_seg = batch_data_samples[0].pred_pts_seg
instance_labels  = pred_pts_seg.instance_labels # tensor, (num_instance,)
instance_scores = pred_pts_seg.instance_scores # tensor, (num_instance,)
pts_instance_mask = pred_pts_seg.pts_instance_mask[0] # tensor, (num_instances, num_points)
input_points = batch_inputs_dict["points"][0] # tensor, (num_points, xyzrgb)
input_point_name = batch_data_samples[0].lidar_path.split('/')[-1].split('.')[0]

def save_point_cloud(points, file_path):
    if isinstance(points, torch.Tensor):
        points = points.cpu().numpy()  # move tensor to CPU and convert to NumPy
    points = np.asarray(points, dtype=np.float32)
    pc = o3d.geometry.PointCloud()
    pc.points = o3d.utility.Vector3dVector(points[:, :3])
    # Open3D expects colors in [0, 1]; divide by 255 here if your RGB values are 0-255
    pc.colors = o3d.utility.Vector3dVector(points[:, 3:])
    o3d.io.write_point_cloud(file_path, pc)  # was `PC`, which is undefined

def filter_and_save_instances(instance_labels, instance_scores, pts_instance_mask, input_points, input_point_name, threshold=0.5):
    base_dir = f"./work_dirs/{input_point_name}"
    os.makedirs(base_dir, exist_ok=True)
    input_pc_path = os.path.join(base_dir, f"{input_point_name}.ply")
    save_point_cloud(input_points, input_pc_path)

    instance_count = {}
    for i in range(len(instance_scores)):
        if instance_scores[i] >= threshold:
            label = instance_labels[i].item()
            if label not in instance_count:
                instance_count[label] = 0
            instance_count[label] += 1

            # pts_instance_mask is a torch tensor; use .bool() rather than NumPy's .astype
            instance_mask = pts_instance_mask[i].bool()
            instance_points = input_points[instance_mask]

            instance_pc_path = os.path.join(base_dir, f"{input_point_name}_{label}_{instance_count[label]}.ply")
            save_point_cloud(instance_points, instance_pc_path)

filter_and_save_instances(instance_labels, instance_scores, pts_instance_mask, input_points, input_point_name)

return batch_data_samples
oneformer3d-contributor commented 5 months ago

I think this threshold is inst_score_thr in the config. We set it to 0 as it doesn't influence the mAP metrics. But yes, for visualization something like 0.3-0.5 should be fine.
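For reference, in the mmdetection3d-style configs this would look roughly like the following (a sketch; the exact file and nesting depend on which config you use):

```python
# e.g. in configs/oneformer3d_*.py (exact location of test_cfg may differ)
model = dict(
    test_cfg=dict(
        inst_score_thr=0.3,  # repo default is 0.0 for mAP; 0.3-0.5 is nicer for visualization
    ),
)
```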

tsrobcvai commented 5 months ago

> I think this threshold is inst_score_thr in the config. We set it to 0 as it doesn't influence the mAP metrics. But yes, for visualization something like 0.3-0.5 should be fine.

Thanks! I am wondering why we don't need a score threshold when calculating the mAP metrics. There should be a lot of FPs (many predicted instance masks remain) without filtering out poor predictions. I will check the code, but it would be great if you could give some more tips.

oneformer3d-contributor commented 5 months ago

I think we calculate the area under the precision-recall curve at several thresholds. All these low-confidence predictions fall below the first threshold, so they have no effect on the overall metric. I think the same is true for 2D detection.
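As a toy illustration of why trailing low-confidence false positives barely move the metric (a sketch of all-point AP, not the repo's evaluation code):

```python
import numpy as np

def average_precision(scores, is_tp, num_gt):
    """All-point AP: area under the precision-recall curve, ranked by score."""
    order = np.argsort(-np.asarray(scores, dtype=float))
    tp = np.asarray(is_tp, dtype=float)[order]
    tp_cum = np.cumsum(tp)
    fp_cum = np.cumsum(1.0 - tp)
    recall = tp_cum / num_gt
    precision = tp_cum / (tp_cum + fp_cum)
    # integrate precision over recall increments (only TPs increase recall)
    ap, prev_r = 0.0, 0.0
    for r, p in zip(recall, precision):
        ap += (r - prev_r) * p
        prev_r = r
    return ap

# 2 ground-truth instances, two confident true positives
base_ap = average_precision([0.9, 0.8], [1, 1], num_gt=2)
# append many low-confidence false positives ranked below the TPs:
# they never increase recall, so the AP integral is unchanged
noisy_ap = average_precision([0.9, 0.8] + [0.01] * 50, [1, 1] + [0] * 50, num_gt=2)
```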

tsrobcvai commented 5 months ago

When plotting a PR curve, low-confidence predictions lower the precision, but only after we already have a high recall value, so they have a minor effect on the overall mAP. Right? Thanks for your explanation! Closing this issue now.

Lizhinwafu commented 5 months ago

> Hi, thanks for your help! I got the points with masks successfully. [...] Hope it can help others. (quoting tsrobcvai's comment and code above)

How do I run this code? I also want to get the predicted point cloud.

accoumar12 commented 1 month ago

Thanks for your suggestion. What is the recommended method for getting a unique prediction per point, so we can draw the same visualizations as shown in the paper? Should we look, for each point, at which instance has the highest score and choose that instance?
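One way to realize the approach described above (a sketch, not the repo's official visualization code): weight each mask row by its score, take an argmax over instances per point, and mark points covered by no mask as -1.

```python
import numpy as np

def per_point_instance(masks, scores):
    """masks: (M, N) bool, scores: (M,). Returns (N,) instance id per point, -1 if none."""
    masks = np.asarray(masks, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    weighted = masks * scores[:, None]   # (M, N): instance score where mask is on, else 0
    ids = weighted.argmax(axis=0)        # best-scoring instance per point
    ids[~masks.any(axis=0)] = -1         # points covered by no mask
    return ids

# toy example: 4 points, 2 overlapping instances
msk = np.array([[1, 1, 0, 0],
                [0, 1, 1, 0]])
ids = per_point_instance(msk, np.array([0.9, 0.6]))  # point 1 overlaps; 0.9 wins
```

The resulting per-point ids can then be mapped to a color palette to reproduce paper-style renderings.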

alphabet-lgtm commented 4 days ago

> Hi, thanks for your help! I got the points with masks successfully. [...] Hope it can help others. (quoting tsrobcvai's comment and code above)

Good Job, bro!