
ISBNet: a 3D Point Cloud Instance Segmentation Network with Instance-aware Sampling and Box-aware Dynamic Convolution

Tuan Duc Ngo, Binh-Son Hua, Khoi Nguyen
VinAI Research, Vietnam

Abstract: Existing 3D instance segmentation methods are dominated by a bottom-up design: a manually fine-tuned algorithm groups points into clusters, followed by a refinement network. Because they rely on the quality of the clusters, these methods produce unreliable results when (1) nearby objects of the same semantic class are packed together, or (2) large objects have complex shapes. To address these shortcomings, we introduce ISBNet, a novel cluster-free method that represents instances as kernels and decodes instance masks via dynamic convolution. To efficiently generate a high-recall and discriminative kernel set, we propose a simple strategy, named Instance-aware Farthest Point Sampling, to sample candidates and leverage the point aggregation layer adopted from PointNet++ to encode candidate features. Moreover, we show that training 3D instance segmentation in a multi-task learning setting with an additional axis-aligned bounding box prediction head further boosts performance. Our method sets new state-of-the-art results on ScanNetV2 (55.9), S3DIS (60.8), and STPLS3D (49.2) in terms of AP and retains fast inference time (237 ms per scene on ScanNetV2).
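The Instance-aware Farthest Point Sampling named in the abstract builds on ordinary farthest point sampling (FPS). The block below is a minimal pure-Python sketch of plain FPS restricted to a subset of candidate points. It is not the implementation in this repository: the function name and the `is_candidate` mask are hypothetical, and how ISBNet decides which points are eligible candidates is described in the paper and not reproduced here.

```python
import math


def farthest_point_sampling(points, num_samples, is_candidate=None, start=0):
    """Greedy FPS over the points flagged in `is_candidate` (illustrative sketch).

    points       : sequence of (x, y, z) coordinates
    num_samples  : maximum number of indices to return
    is_candidate : optional per-point booleans (hypothetical mask); None means all points
    start        : position, within the candidate subset, of the first sample
    """
    if is_candidate is None:
        is_candidate = [True] * len(points)
    cand = [i for i, keep in enumerate(is_candidate) if keep]
    if not cand or num_samples <= 0:
        return []

    selected = [cand[start]]
    # Distance from every candidate to its nearest already-selected point.
    min_dist = {i: math.dist(points[i], points[selected[0]]) for i in cand}

    while len(selected) < min(num_samples, len(cand)):
        nxt = max(cand, key=lambda i: min_dist[i])
        if min_dist[nxt] == 0.0:
            break  # only duplicates of already-selected points remain
        selected.append(nxt)
        for i in cand:
            min_dist[i] = min(min_dist[i], math.dist(points[i], points[nxt]))
    return selected


# Toy scene: five points on a line; the last one is excluded from sampling.
toy_points = [(0, 0, 0), (1, 0, 0), (10, 0, 0), (5, 0, 0), (100, 0, 0)]
toy_mask = [True, True, True, True, False]
sampled = farthest_point_sampling(toy_points, 3, toy_mask)
```

In this toy example the excluded point is never chosen, even though it is the farthest from everything else. That is the effect of restricting the candidate set before sampling.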

[Figure: overview of ISBNet]

Details of the model architecture and experimental results can be found in our paper:

@inproceedings{ngo2023isbnet,
 author={Tuan Duc Ngo and Binh-Son Hua and Khoi Nguyen},
 booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
 title={ISBNet: a 3D Point Cloud Instance Segmentation Network with Instance-aware Sampling and Box-aware Dynamic Convolution},
 year={2023}
}

Please CITE our paper whenever this repository is used to help produce published results or is incorporated into other software.

Features :mega:

State-of-the-art performance on ScanNetV2, S3DIS, and STPLS3D.
Fast inference: 237 ms per scene on the ScanNetV2 dataset.
Reproducible code for the ScanNetV2, S3DIS, and STPLS3D datasets.

Datasets :floppy_disk:

[x] ScanNetV2
[x] ScanNetV2-200
[x] S3DIS
[x] STPLS3D

Installation :memo:

Please refer to the installation guide.

Data Preparation :hammer:

Please refer to the data preparation guide.

Training and Testing :train2:

Please refer to the training guide.

Quick Demo :fire:

ScanNetV2

Dataset	AP	AP_50	Config	Checkpoint
ScanNet test	55.9	76.3
ScanNet val (paper)	54.5	73.1
ScanNet val	56.8	73.3	config	checkpoint
ScanNet val (lightweight)	50.1	68.9	config	checkpoint

ScanNetV2-200

Dataset	AP	AP_50	Config	Checkpoint
ScanNet200 val	24.5	32.7	config	checkpoint

S3DIS

Dataset	AP	AP_50	Config	Checkpoint
Area 5	56.3	67.5	config	checkpoint

STPLS3D

Dataset	AP	AP_50	Config	Checkpoint
STPLS3D val	51.2	66.7	config	checkpoint

Run evaluation with pre-trained models:

python3 tools/test.py <path_to_config_file> <path_to_pretrain_weight>
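When evaluating several checkpoints, it can help to build this command programmatically and check the inputs first. The helper below is a hedged sketch: `build_eval_command` and `missing_inputs` are hypothetical names that are not part of this repository, and the paths in the usage comment are placeholders, not real files. The only thing taken from the command above is the argument order (config file first, then pre-trained weights).

```python
from pathlib import Path


def build_eval_command(config_path, checkpoint_path, python="python3"):
    """Return the argv list for the evaluation command shown above."""
    return [python, "tools/test.py", str(config_path), str(checkpoint_path)]


def missing_inputs(*paths):
    """Return the subset of `paths` that do not exist on disk."""
    return [str(p) for p in paths if not Path(p).exists()]


# Usage (placeholder paths; run from the repository root):
#
#   import subprocess
#   config, ckpt = "path/to/config.yaml", "path/to/checkpoint.pth"
#   if missing_inputs(config, ckpt):
#       raise FileNotFoundError(missing_inputs(config, ckpt))
#   subprocess.run(build_eval_command(config, ckpt), check=True)
```

Passing the command as a list to `subprocess.run` avoids shell quoting problems when a path contains spaces.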

Visualization :computer:

Please refer to the visualization guide. We provide qualitative results of our method here.

Acknowledgements :clap:

This repo is built upon SpConv, DyCo3D, SSTNet, and SoftGroup.

Contacts :email:

If you have any questions or suggestions about this repo, please feel free to contact me (ductuan.ngo99@gmail.com).
