Jumpat / SegmentAnythingin3D

Segment Anything in 3D with NeRFs (NeurIPS 2023)
Apache License 2.0

get the raw result #47

Closed Zhouyuan-Chen closed 4 months ago

Zhouyuan-Chen commented 6 months ago

Hi, thanks for your work!

I have one question: how can I get the volumetric segmentation result (R, G, B, sigma) from the program? Do you have any pretrained results for that? My problem is very similar to issue #24.

Jumpat commented 6 months ago

Hello! Do you mean the 3D binary segmentation masks? SA3D obtains a 3D binary mask after segmentation. You can use this mask together with the original NeRF to get the final segmentation result (that is, remove everything else in the scene).
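As a rough illustration of the reply above, here is a minimal NumPy sketch of combining a 3D binary mask with a NeRF's density. It assumes the radiance field is stored as a dense voxel grid of post-activation sigma values and that the mask lives on the same grid; the function name `apply_mask_to_density` and the 0.5 threshold are hypothetical and are not taken from the SA3D code.

```python
import numpy as np

def apply_mask_to_density(sigma, mask, threshold=0.5):
    """Return a copy of `sigma` with everything outside the mask set to zero.

    sigma: (X, Y, Z) grid of non-negative densities (assumed post-activation).
    mask:  (X, Y, Z) grid, values above `threshold` mark the segmented object.
    """
    sigma = np.asarray(sigma, dtype=np.float32)
    mask = np.asarray(mask)
    if sigma.shape != mask.shape:
        raise ValueError("sigma and mask must have the same grid shape")
    # Zero density means the voxel contributes nothing during volume rendering,
    # so the RGB values can be left untouched.
    return np.where(mask > threshold, sigma, 0.0).astype(np.float32)

# Tiny example: keep a single voxel of a 2x2x2 grid.
sigma = np.ones((2, 2, 2), dtype=np.float32)
mask = np.zeros((2, 2, 2), dtype=np.float32)
mask[0, 0, 0] = 1.0
masked_sigma = apply_mask_to_density(sigma, mask)
```

If the grid stores raw (pre-activation) density values instead, zeroing them does not give zero sigma, so the mask would have to be applied after the activation.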

Jumpat commented 6 months ago

How to obtain instance segmentation results with the code?

Hello, what do you mean by 'instance segmentation results'? The 3D segmentation results are stored in /logs (i.e., the .tar files). The rendered 2D segmentation results are stored in the logs as well.

Jumpat commented 6 months ago

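-- I trained the SA3D model, but in the end I only got a rendered video, or a depth video, of one segmented object. Can SA3D export a point cloud at the end?

-- Another question: my scene contains many objects and I want to run instance segmentation on the whole scene, but I found that this code can only segment one object in the scene at a time.

-- I would like to replace the SAM part with Mask R-CNN, so that the masks do not come from SAM. Do you think this is feasible?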

The SA3D code currently does not provide point cloud generation, but this can be implemented by back-projecting the depth estimated by the NeRF.
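The depth-based mapping mentioned above can be sketched as follows. This is only an illustration under stated assumptions: a pinhole camera with intrinsics `fx, fy, cx, cy`, an OpenCV-style camera frame (x right, y down, z forward), depth given as z-depth rather than distance along the ray, and a 4x4 camera-to-world matrix. The function name `depth_to_points` is hypothetical and does not exist in SA3D.

```python
import numpy as np

def depth_to_points(depth, mask, fx, fy, cx, cy, cam_to_world):
    """Back-project masked depth pixels into world-space 3D points.

    depth:        (H, W) z-depth map rendered by the NeRF (assumed convention).
    mask:         (H, W) boolean 2D segmentation mask for the same view.
    cam_to_world: (4, 4) camera-to-world transform.
    Returns an (N, 3) array of points, one per masked pixel.
    """
    v, u = np.nonzero(mask)              # pixel rows (v) and columns (u)
    z = depth[v, u].astype(np.float64)
    x = (u - cx) / fx * z                # pinhole model, camera frame
    y = (v - cy) / fy * z
    pts_cam = np.stack([x, y, z, np.ones_like(z)], axis=1)   # homogeneous (N, 4)
    pts_world = pts_cam @ np.asarray(cam_to_world).T
    return pts_world[:, :3]

# Tiny example: one masked pixel, camera shifted by 1 along world x.
depth = np.full((2, 2), 2.0)
mask = np.array([[True, False], [False, False]])
pose = np.eye(4)
pose[0, 3] = 1.0
points = depth_to_points(depth, mask, 1.0, 1.0, 0.0, 0.0, pose)
```

Running this over all training views and concatenating the results gives a point cloud of the segmented object. Many NeRF codebases use an OpenGL-style camera frame or ray-distance depth, in which case the axes or the depth conversion need to be adjusted.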

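By design, SA3D can indeed only segment one object at a time. Segmenting multiple objects simultaneously is essentially just single-object segmentation run in parallel; there is no fundamental difference.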

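Replacing SAM with Mask R-CNN is feasible, but the implementation also depends on the goal of the task. Because Mask R-CNN is not a prompt-driven segmentation model, the difficulty is how to associate the segmentation results across views (this replaces the original self-prompting process and requires designing or adopting a new algorithm), and then how to map the associated multi-view segmentation results back into 3D space (i.e., mask inverse rendering, which can be understood as training a mask field on top of the existing geometry).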

Jumpat commented 6 months ago

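-- "Segmenting multiple objects simultaneously is essentially just single-object segmentation run in parallel": does this mean that I segment one object per SA3D run and, after several iterations, merge all the segmented objects together? I ask because, when prompts are given manually, SAM produces only one mask per original image.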

Yes.
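The merge step confirmed above could look roughly like this. It is a sketch under assumptions: each SA3D run yields one binary 3D mask on a shared voxel grid, and where two masks overlap the earlier one wins (a simple policy chosen here for illustration, not something SA3D prescribes). The function name `merge_instance_masks` is hypothetical.

```python
import numpy as np

def merge_instance_masks(masks, threshold=0.5):
    """Combine per-object binary masks into one integer label volume.

    masks: list of arrays with identical shape, one per segmented object.
    Returns an int32 array where 0 is background and i is the i-th object
    (1-based, in list order). Voxels claimed earlier are not overwritten.
    """
    if not masks:
        raise ValueError("need at least one mask")
    labels = np.zeros(np.asarray(masks[0]).shape, dtype=np.int32)
    for idx, m in enumerate(masks, start=1):
        m = np.asarray(m)
        if m.shape != labels.shape:
            raise ValueError("all masks must share the same grid shape")
        free = labels == 0                      # voxels not yet assigned
        labels[(m > threshold) & free] = idx
    return labels

# Tiny 1D example with one overlapping voxel.
labels = merge_instance_masks([np.array([1, 1, 0, 0]), np.array([0, 1, 1, 0])])
```

A confidence-based policy for overlaps (for example, keeping the object with the higher mask score) would be a natural refinement if the per-run mask values are not strictly binary.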

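-- Since I have little coding experience, I would like to build on SA3D and get instance segmentation with minimal changes and without providing prompts by hand. Could I use an object detection network (such as YOLO) to get bounding boxes, feed them to SAM, and then continue with the rest of the SA3D pipeline? Would that be easier to implement?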

Yes. In SA3D, text-prompt-based segmentation uses exactly the bounding boxes produced by Grounded-DINO.
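For the detector-to-SAM hand-off discussed above, one small piece of glue is the box format. SAM's box prompt expects pixel coordinates in XYXY order, while YOLO-style outputs are often given as normalized center/width/height. The helper below is a hypothetical sketch of that conversion, assuming the detector reports boxes in the normalized `(cx, cy, w, h)` form; it is not part of SA3D or of any detector library.

```python
def yolo_to_xyxy(cx, cy, w, h, img_w, img_h):
    """Convert a normalized (cx, cy, w, h) box to pixel (x0, y0, x1, y1).

    All inputs cx, cy, w, h are fractions of the image size in [0, 1].
    The result is clipped to the image bounds.
    """
    x0 = (cx - w / 2.0) * img_w
    y0 = (cy - h / 2.0) * img_h
    x1 = (cx + w / 2.0) * img_w
    y1 = (cy + h / 2.0) * img_h

    def clip(value, upper):
        return min(max(value, 0.0), float(upper))

    return (clip(x0, img_w), clip(y0, img_h), clip(x1, img_w), clip(y1, img_h))

# A box covering the central quarter of a 100x200 (width x height) image.
box = yolo_to_xyxy(0.5, 0.5, 0.5, 0.5, 100, 200)
```

The resulting box can then be passed as the box prompt to SAM for the reference view, once per detected object, before running the rest of the SA3D pipeline for that object.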