Segment Anything in 3D with NeRFs
Jiazhong Cen1*, Zanwei Zhou1*, Jiemin Fang2,3†, Chen Yang1, Wei Shen1†, Lingxi Xie2, Dongsheng Jiang2, Xiaopeng Zhang2, Qi Tian2
1AI Institute, SJTU; 2Huawei Inc.; 3School of EIC, HUST
* denotes equal contribution.
† denotes project lead.
Given a NeRF, input prompts from a single view and get your 3D model.
We propose a novel framework to Segment Anything in 3D, named SA3D. Given a neural radiance field (NeRF) model, SA3D allows users to obtain the 3D segmentation result of any target object via one-shot manual prompting in a single rendered view. The entire process for obtaining the target 3D model takes approximately 2 minutes, even without any engineering optimization. Our experiments demonstrate the effectiveness of SA3D in different scenes, highlighting the potential of SAM in 3D scene perception.
`./dependencies/sam_ckpt`. You can use `--mobile_sam` to switch to MobileSAM.

With input prompts, SAM cuts out the target object from the corresponding view. The obtained 2D segmentation mask is projected onto 3D mask grids via density-guided inverse rendering. 2D masks are then rendered from other views; these are mostly incomplete, but they serve as cross-view self-prompts that are fed into SAM again. The resulting complete masks are projected onto the mask grids in turn. This procedure is repeated iteratively until accurate 3D masks are learned. SA3D can adapt to various radiance fields effectively without any additional redesign.
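The projection step can be pictured with a toy example. The following is a minimal NumPy sketch of density-guided mask projection along a single ray, not the repository's implementation: the function names, the learning rate, and the `neg_weight` value are illustrative assumptions. It computes standard volume-rendering weights from densities, then nudges the 3D mask grid up along rays that SAM labels as foreground and down (more gently) along background rays.

```python
import numpy as np

def ray_weights(sigma, delta):
    """Volume-rendering weights w_i = T_i * alpha_i for samples along one ray."""
    alpha = 1.0 - np.exp(-sigma * delta)
    # T_i: fraction of light that reaches sample i without being absorbed earlier.
    trans = np.concatenate(([1.0], np.cumprod(1.0 - alpha)[:-1]))
    return trans * alpha

def render_mask(mask_grid, voxel_ids, weights):
    """Render a soft 2D mask value for one ray from the 3D mask grid."""
    return float(np.sum(weights * mask_grid[voxel_ids]))

def project_mask(mask_grid, voxel_ids, weights, sam_label, lr=1.0, neg_weight=0.15):
    """One gradient-style update of the mask grid for a single ray.

    lr and neg_weight are hypothetical values chosen for illustration.
    """
    step = lr * weights if sam_label else -lr * neg_weight * weights
    np.add.at(mask_grid, voxel_ids, step)
    return mask_grid

# Toy scene: 5 voxels along one ray, with a dense surface at voxel 2.
sigma = np.array([0.0, 0.1, 50.0, 0.1, 0.0])
delta = np.full(5, 0.1)
voxel_ids = np.arange(5)
weights = ray_weights(sigma, delta)

mask_grid = np.zeros(5)
project_mask(mask_grid, voxel_ids, weights, sam_label=True)
print("weights:", weights.round(4))
print("voxel with largest mask value:", int(mask_grid.argmax()))
print("re-rendered mask value:", render_mask(mask_grid, voxel_ids, weights))
```

Because the update is scaled by the rendering weights, almost all of the 2D mask evidence lands on the voxel at the visible surface, which is what "density-guided" refers to here.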
git clone https://github.com/Jumpat/SegmentAnythingin3D.git
cd SegmentAnythingin3D
conda create -n sa3d python=3.10
conda activate sa3d
pip install -r requirements.txt
# Installing SAM
mkdir dependencies; cd dependencies
mkdir sam_ckpt; cd sam_ckpt
wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth
cd ..  # back to dependencies/ so the repositories are not cloned into sam_ckpt/
git clone https://github.com/facebookresearch/segment-anything.git
cd segment-anything; pip install -e .
cd ..  # back to dependencies/
# Installing Grounding-DINO
git clone https://github.com/IDEA-Research/GroundingDINO.git
cd GroundingDINO/; pip install -e .
mkdir weights; cd weights
wget https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha/groundingdino_swint_ogc.pth
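After installation, it can help to confirm that the downloaded weights are where the code will look for them. The snippet below is a small sketch using only the standard library. The SAM path follows the commands above; the GroundingDINO path assumes the repository was cloned directly under `dependencies/`, so adjust it if your layout differs. The helper name `missing_checkpoints` is hypothetical.

```python
from pathlib import Path

# Paths are relative to the SegmentAnythingin3D repository root.
# The GroundingDINO location is an assumption about the directory layout.
EXPECTED_FILES = {
    "SAM ViT-H checkpoint": "dependencies/sam_ckpt/sam_vit_h_4b8939.pth",
    "GroundingDINO weights": "dependencies/GroundingDINO/weights/groundingdino_swint_ogc.pth",
}

def missing_checkpoints(repo_root=".", expected=EXPECTED_FILES):
    """Return the names of expected weight files that are not present."""
    root = Path(repo_root)
    return [name for name, rel in expected.items() if not (root / rel).is_file()]

if __name__ == "__main__":
    missing = missing_checkpoints()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All expected weight files were found.")
```

Run it from the repository root; an empty result means both files were found.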
We now release the configs on these datasets:
python run.py --config=configs/llff/fern.py --stop_at=20000 --render_video --i_weights=10000
python run_seg_gui.py --config=configs/llff/seg/seg_fern.py --segment \
--sp_name=_gui --num_prompts=20 \
--render_opt=train --save_ckpt
python run_seg_gui.py --config=configs/llff/seg/seg_fern.py --segment \
--sp_name=_gui --num_prompts=20 \
--render_only --render_opt=video --dump_images \
--seg_type seg_img seg_density
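The three commands above (train the NeRF, segment, render) differ only in a few arguments. As a convenience, here is a hedged Python sketch that assembles the same argument lists for a given scene. It only builds and prints the commands; it does not run them. The helper name `sa3d_commands` is hypothetical, and the assumption that other scenes follow the `configs/<dataset>/<scene>.py` and `configs/<dataset>/seg/seg_<scene>.py` naming pattern is inferred from the Fern example, not confirmed.

```python
import shlex

def sa3d_commands(scene="fern", dataset="llff", num_prompts=20, sp_name="_gui"):
    """Build the train / segment / render command lines shown above."""
    train_cfg = f"configs/{dataset}/{scene}.py"
    seg_cfg = f"configs/{dataset}/seg/seg_{scene}.py"  # assumed naming pattern
    seg_base = [
        "python", "run_seg_gui.py", f"--config={seg_cfg}", "--segment",
        f"--sp_name={sp_name}", f"--num_prompts={num_prompts}",
    ]
    return {
        # Stage 1: train the NeRF for the scene.
        "train": ["python", "run.py", f"--config={train_cfg}",
                  "--stop_at=20000", "--render_video", "--i_weights=10000"],
        # Stage 2: run SA3D and save the 3D mask checkpoint.
        "segment": seg_base + ["--render_opt=train", "--save_ckpt"],
        # Stage 3: render the segmented result as a video and dump frames.
        "render": seg_base + ["--render_only", "--render_opt=video", "--dump_images",
                              "--seg_type", "seg_img", "seg_density"],
    }

for stage, argv in sa3d_commands().items():
    print(f"{stage}: {shlex.join(argv)}")
```

Each list can be passed to `subprocess.run(argv, check=True)` from the repository root if you prefer to drive the stages from a script.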
Some tips for running SA3D:
- Increase `--num_prompts` when the target object is extremely irregular, as in the LLFF scenes Fern and Trex;
- Use `--seg_poses` to specify the camera pose sequence used for training the 3D mask (default='train', choices=['train', 'video']).

Using our Dash-based GUI:
- Select which type of prompt to use; Point Prompt and Text Prompt are currently supported:
  - Point Prompt: select `Points` in the drop-down; click the original image to add a point prompt, and SAM will produce candidate masks; click `Clear Points` to clear the previous inputs;
    https://github.com/Jumpat/SegmentAnythingin3D/assets/58475180/9ae39cb2-6a1f-40a7-b7df-6b149e75358f
  - Text Prompt: select `Text` in the drop-down; enter your text prompt and click `Generate` to get candidate masks; note that an unreasonable text input may cause an error.
    https://github.com/Jumpat/SegmentAnythingin3D/assets/58475180/ba934e0c-dc8a-472a-958c-2b6c4d6ee644
- Select your target mask;
- Press `Start Training` to run SA3D; we visualize the rendered masks and SAM predictions produced by our cross-view self-prompting strategy;
  https://github.com/Jumpat/SegmentAnythingin3D/assets/58475180/c5cc947e-8966-4ec5-9531-434a7b27eed5
- Wait a few minutes to see the final rendering results.
  https://github.com/Jumpat/SegmentAnythingin3D/assets/58475180/9578ea7a-0947-4105-a65c-1f8de12d0bb5
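The cross-view self-prompting visualized during training turns an incomplete rendered mask into point prompts for SAM. The sketch below shows one simplified way to do this, assuming a greedy "pick the most confident pixel, then suppress its neighborhood" rule. It is an illustration only: the function name, `radius`, and `min_score` are hypothetical, and the actual strategy in SA3D may use additional cues that are not modeled here.

```python
import numpy as np

def select_prompt_points(rendered_mask, num_prompts=3, radius=2, min_score=0.0):
    """Greedily pick up to num_prompts high-confidence pixels as point prompts."""
    scores = rendered_mask.astype(float).copy()
    h, w = scores.shape
    ys, xs = np.mgrid[0:h, 0:w]
    points = []
    for _ in range(num_prompts):
        y, x = np.unravel_index(np.argmax(scores), scores.shape)
        if scores[y, x] <= min_score:
            break  # no confident pixels left
        points.append((int(x), int(y)))  # (x, y) pixel order, as SAM point prompts use
        # Suppress a disc around the chosen pixel so later prompts spread out.
        scores[(ys - y) ** 2 + (xs - x) ** 2 <= radius ** 2] = -np.inf
    return points

# Toy rendered mask with two separate confident regions.
mask = np.zeros((8, 8))
mask[1, 1] = 0.9
mask[1, 2] = 0.8  # adjacent to the first peak, so it is suppressed
mask[6, 5] = 0.7
print(select_prompt_points(mask, num_prompts=3))
```

Under this rule, a larger `num_prompts` lets the prompts cover more disconnected or thin parts of the rendered mask, which is consistent with the advice to raise `--num_prompts` for irregular objects.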
SA3D can handle various scenes for 3D segmentation. Find more demos on our project page.
Forward facing | 360° | Multi-objects |
---|---|---|
Thanks to the following projects for their valuable contributions:
If you find this project helpful for your research, please consider citing the report and giving the repository a ⭐.
@inproceedings{cen2023segment,
title={Segment Anything in 3D with NeRFs},
author={Jiazhong Cen and Zanwei Zhou and Jiemin Fang and Chen Yang and Wei Shen and Lingxi Xie and Dongsheng Jiang and Xiaopeng Zhang and Qi Tian},
booktitle = {NeurIPS},
year = {2023},
}