Closed. VolkaJ closed this issue 2 months ago.
Hi,
is there a way to execute this command globally?
Sure, you can run `pip install -e .`
under the SegmentAnything3D directory, and then you can execute `ns-train sa3d ~`
globally, as mentioned on this page.
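The install flow above can be sketched as shell commands; the directory path is illustrative (use wherever you cloned the repo):

```shell
# Run from the root of your SegmentAnything3D checkout (illustrative path)
cd ~/SegmentAnything3D
pip install -e .      # editable install registers the sa3d method with nerfstudio
cd ~                  # the command now works from any directory
ns-train sa3d --help
```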
Are these the only outputs generated from the training process?
Currently the nerfstudio-version SA3D does not support generating 3D masks. You can try the original SA3D code which is based on DVGO.
What does the `--pipeline.network.num_prompts` option do?
It is a hyperparameter that controls the maximum number of point prompts used in the self-prompting stage. When you want to segment an object with a simple shape, setting it to 3 ~ 5 is fine; when the target object has a complex shape (like the fern), you may set it larger (10 ~ 20).
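As a hedged example of setting this flag (the data path is hypothetical, and other required flags may vary by setup):

```shell
# Complex target shape (e.g. fern): allow more self-prompting points
ns-train sa3d --pipeline.network.num_prompts 15 --data /path/to/your/data
```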
Thank you so much for the explanation. I didn't know what num_prompts was for, so I set it to 1; that's probably why I got the weird video output. I'll try again.
By the way, I'm currently trying to run the segmentation on a remote server, and I can't use the GUI because I want to automate the process.
Is there any way I can automate the process using only a fixed text prompt and still get the segmentation result? I'm trying to get point clouds of multiple segmented objects.
If it's impossible with the current version, I'd very much appreciate any kind of tips or guides. Thank you so much.
I think you can modify the code to save the point cloud after the training stage, e.g. saving the point cloud here.
To generate the point cloud, you can loop over the training set, use the depth (and mask) from the nerf model to compute the target points in the world coordinate system, and finally merge all the points to obtain the point cloud.
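The loop described above can be sketched roughly as follows. This is a minimal numpy sketch, not SA3D code: it assumes z-depth maps, OpenCV-style intrinsics `K`, and 4x4 camera-to-world matrices `c2w` (nerfstudio's camera conventions may differ, e.g. OpenGL-style axes, so adapt the signs accordingly):

```python
import numpy as np

def backproject_depth(depth, mask, K, c2w):
    """Back-project a masked depth map into world-space points.

    depth: (H, W) z-depth per pixel
    mask:  (H, W) boolean segmentation mask
    K:     (3, 3) camera intrinsics (OpenCV convention assumed)
    c2w:   (4, 4) camera-to-world transform
    """
    v, u = np.nonzero(mask)                  # pixel coordinates of masked points
    z = depth[v, u]
    # pixel -> camera coordinates via the inverse intrinsics
    x = (u - K[0, 2]) * z / K[0, 0]
    y = (v - K[1, 2]) * z / K[1, 1]
    pts_cam = np.stack([x, y, z, np.ones_like(z)], axis=-1)  # (N, 4) homogeneous
    # camera -> world
    return (c2w @ pts_cam.T).T[:, :3]

def merge_views(views):
    """views: iterable of (depth, mask, K, c2w); returns the merged (N, 3) cloud."""
    return np.concatenate([backproject_depth(*v) for v in views], axis=0)
```

The merged array can then be written out with any point-cloud library (e.g. as a PLY file) after training finishes.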
Comment out lines 275 ~ 278 here to make the code end immediately without Ctrl + C.
Thanks for the advice. I'll work on it.
Probably one last question on this thread.
I fine-tuned the pre-trained GroundingDino model, and when I tested it, it worked fine. But when I run `ns-train sa3d ~` with the fine-tuned GroundingDino model, GroundingDino gets zero bounding boxes (Get box from GroundingDino: []).
I guess the input for GroundingDino to make the initial mask was an image generated by Nerf, right? Does this mean the Nerf wasn't trained very well? I'm having trouble finding out what the problem might be here.
> the input for GroundingDino to make initial mask was generated image by Nerf right?
Yes, we use the image rendered from Nerf; check this. You can save this image to see what is wrong.
> You can save this image to see what is wrong.
Yeah, it's definitely wrong. The image (rendered from Nerf) seems irrelevant.
Can you let me know where 'batch' comes from? I wonder how the nerf model rendered that image... hmm
> Can you let me know where 'batch' is from?
This is the 'batch'.
You might check whether your model correctly loads the pretrained nerf checkpoint.
Hmm, the nerf checkpoint was loaded correctly. And I just found out that the image saved before `get_outputs_for_camera_ray_bundle`

```python
imageio.imwrite(f"batch_image_{self.image_count}.png", batch["image"].squeeze().cpu().numpy())
```

looks totally fine, but the image saved after `get_outputs_for_camera_ray_bundle`

```python
imageio.imwrite(f"model_outputs_{self.image_count}.png", model_outputs["rgb"].squeeze().cpu().numpy())
```

looks all green.
Might there be something wrong with that function?
`get_outputs_for_camera_ray_bundle()` receives the camera pose as input and outputs the rgb image, depth, mask, etc. `batch["image"]` is the ground-truth image, and `model_outputs["rgb"]` is the image generated by Nerf.
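For debugging, a small helper like the following (a sketch, not part of the repo) can put the ground-truth and rendered images side by side; pass in `batch["image"]` and `model_outputs["rgb"]` after `.squeeze().cpu().numpy()`:

```python
import numpy as np

def side_by_side(gt, pred):
    """Concatenate ground-truth and rendered RGB images horizontally.

    gt, pred: (H, W, 3) float arrays in [0, 1].
    Returns a uint8 image ready for imageio.imwrite.
    """
    both = np.concatenate([np.asarray(gt), np.asarray(pred)], axis=1)
    return (np.clip(both, 0, 1) * 255).astype(np.uint8)
```

Usage, for example: `imageio.imwrite(f"compare_{i}.png", side_by_side(gt, pred))`, which makes a mismatch between the two images obvious at a glance.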
I have already tested this nerfstudio-version code and it seems to be OK. I suggest the following:
- Use `nerfstudio==0.2.0` (though I have also tested under 1.0.2);
- Use `ns-viewer` (like this script) to check if your pretrained nerf model is fine.

Thank you for your patience. I've been helped a lot.
Very strange......
Could you provide more details, like the scripts you have conducted (to train nerf and to segment), and samples of your dataset?
Oh! I just realized that I installed sa3d on the nerfstudio==0.2.0 version but trained the nerf model on the nerfstudio==1.0.0 docker image. I'll come back after finishing training on 0.2.0. Hopefully this solves the issue.
Turns out it was a nerfstudio version mismatch problem. Thanks a lot for helping me track it down.
Still having a problem with my own dataset, but I think I'm almost there.
Probably an obvious and self-explanatory question, but is GroundingDino supposed to find the bounding box in all input images? If so, there shouldn't be any images that don't contain the object I want to segment at the nerf-training stage, right?
We only use GroundingDino to find the mask in the first input image. After that, the SAM model will be used to automatically find the target object in the training set images and complete the segmentation. Therefore, you do not need to worry when there are images without the object you want.
Oh, then that could be a serious problem, because it seems like the input camera pose is decided by a certain logic of the camera optimizer. So if the selected view of my training dataset doesn't contain the target object, the program will die, right? Do you think I understand it correctly?
I tested two different datasets, but the very first model_outputs['rgb'] always looks sparse. Except for the first one, the rest of them look fine. Can you guess why that is? [Edited] The first 8-10 images look bad and then it gets better... I'll have to look into what that's about.
Hi, thank you for sharing your excellent work. I've successfully installed everything as required (using the nerfstudio-version branch and my own dataset), but I have a few questions:
Any assistance you can provide would be greatly appreciated. Thank you.