cuteyyt / or-nerf

OR-NeRF: Object Removing from 3D Scenes Guided by Multiview Segmentation with Neural Radiance Fields
https://ornerf.github.io

how to run on my own data #4

Closed D6561 closed 1 year ago

D6561 commented 1 year ago

Thanks for your great work! But I have a problem with how to run it on my own data. Could you show me what the "spinnerf_dataset" folder should look like, or how I should arrange my own data?

cuteyyt commented 1 year ago

Thanks for your interest in our project!

Could you provide more details about your goal? Do you have any camera parameters already? Would you like to use points or text prompts to generate the mask?

Roughly speaking, if you have sparse views/2D images, you can follow the steps in README.md one by one and add the necessary information along the way. Mainly, you need to add some configurations to the JSON files in the configs folder.

cuteyyt commented 1 year ago

I am not sure about the details of your "not match." As for the spinnerf dataset, you can download it and unzip it anywhere you like. The original folder name is "spinnerf-dataset". You can see the dict structure in the config file sam_text.json below:

[screenshot: sam_text.json config structure]

"spinnerf_dataset" infers the dataset name, so you need to rename the original folder name from "spinnerf-dataset" to "spinnerf_dataset". "1" indicates the scene name, "text" is your prompt, and "factor" is the downsizing factor you want to apply to the original image. The other two config files in the configs/prepare_data folder share the same dict structure as dataset_name --- scene_name --- specific parameters. You may add related information using the same structure to use your custom data.

This briefly covers the mask generation stage; config files control the 3D reconstruction stage in the same way. I hope this information helps.

D6561 commented 1 year ago

Thanks! My question now is that I want to use a text prompt, but after I run "sh scripts/data/gen_sparse.sh spinnerf_dataset 2 data data", I only get the masks and bboxes. And it seems that I still need "num_points", "points", and "mode". So how can I get the information for sam_points.json from the text prompt and the mask?

[screenshot: gen_sparse.sh output, 2023-07-28]

Also, when I try other datasets, I find that if there are multiple masks, it does not output any mask, but I can still get the bbox. And if I set "multimask_output" to true, other errors occur.

[screenshots: GroundingDINO detections and error messages]
cuteyyt commented 1 year ago

For the first question, it is by design that you get only the masks and boxes folders for text prompts, since they do not involve point prompts. I am not sure what you mean by "still need num_points" — could you please provide more details?

For the second question, our method has trouble generating masks for 360° scenes with point prompts: projection errors may occur because objects' locations change substantially across views. For the toydesk dataset, text prompts are also hard to use because of the bottleneck of GroundingDINO's detection ability. As you can see from your figure, GroundingDINO produced some ambiguous results, which makes mask generation difficult. I am currently trying to solve these issues in a new project. To be honest, the LaMa inpainting results for the toydesk dataset are also poor, which means the subsequent reconstruction performance could be unsatisfactory as well.

By the way, the multimask_output parameter here means SAM will predict multiple candidate masks for a single input prompt, not multiple masks for different objects in one image. If it is set to True, the downstream code may break.
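
For reference, here is a minimal sketch of how SAM's multimask_output flag behaves, using the public segment-anything API (the checkpoint path, image path, and point coordinates are placeholders; this is not the repository's own code):

```python
import cv2
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

# Load a SAM checkpoint (model type and path are placeholders).
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)

# Read one frame (filename is a placeholder) and set it for prediction.
image = cv2.cvtColor(cv2.imread("images/frame_000.png"), cv2.COLOR_BGR2RGB)
predictor.set_image(image)

point = np.array([[500, 375]])  # one (x, y) point prompt, placeholder coords
label = np.array([1])           # 1 marks the point as foreground

# multimask_output=True: SAM returns three candidate masks for the SAME
# prompt, ranked by predicted IoU -- not one mask per object in the image.
masks, scores, _ = predictor.predict(
    point_coords=point, point_labels=label, multimask_output=True
)
print(masks.shape)  # (3, H, W)

# multimask_output=False: a single best mask, shape (1, H, W), which is
# what a pipeline expecting one mask per prompt can consume directly.
mask, score, _ = predictor.predict(
    point_coords=point, point_labels=label, multimask_output=False
)
print(mask.shape)  # (1, H, W)
```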

D6561 commented 1 year ago

Thank you for answering my questions!