Closed: jakob-ropers-snkeos closed this issue 1 year ago.
Can you share the results here (+ input images, renderings of the CAD model)? In the custom inference script, I set confidence_score=0.5 by default. It may be helpful to use a lower threshold to visualize more detections.
Given what you described ("close-ups of the objects, the single object only, a white background, and no occlusion"), it is strange that there is still no segmentation on the objects, since SAM or FastSAM should segment everything. You can reduce confidence_threshold first to see all detections and make sure that there are at least some masks on the objects.
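To illustrate why a high confidence threshold can hide every detection, here is a minimal, hypothetical sketch of score-based filtering; the `proposals` structure and field names are assumptions for illustration, not CNOS's actual internals:

```python
# Hypothetical sketch: filtering mask proposals by confidence score.
# Thin or tiny objects tend to receive low scores, so a default
# threshold of 0.5 can discard every mask on them.

def filter_by_confidence(proposals, confidence_threshold):
    """Keep only proposals whose score clears the threshold."""
    return [p for p in proposals if p["score"] >= confidence_threshold]

proposals = [
    {"mask_id": 0, "score": 0.41},  # partial mask on the tool
    {"mask_id": 1, "score": 0.33},  # background mask
]

# At the default threshold nothing survives; lowering it reveals
# that the segmenter did produce masks on the object after all.
visible_default = filter_by_confidence(proposals, 0.5)
visible_lowered = filter_by_confidence(proposals, 0.25)
print(len(visible_default), len(visible_lowered))
```

This is why checking a lower threshold first tells you whether the problem is "no masks at all" or just "masks filtered out".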
Thank you for your very fast reply!
So here is an example where it worked best for me. As you can see, it still does not segment the entire object:
When I do it for another tool, I only get the background for some reason:
I am confused because, as soon as I use a CAD model and RGB image from one of the example datasets, it segments the object absolutely perfectly despite the object being far away from the camera in an extremely cluttered scene. Lowering the confidence threshold did not help, by the way. I appreciate your help!
CNOS should work on this example. Can you share the RGB image + CAD model so that I can try it?
Sure! Here are two example images and the CAD model of the needle holder:
https://drive.google.com/drive/folders/1pn7letIZNACC7D1u-pwCyPvzlMBdfHJ3?usp=drive_link
I added a stability_score_thresh parameter to the inference_custom.py script so that the segmentation model can also output less stable masks (this usually happens for very tiny or thin objects like yours). Here is the result after fine-tuning this SAM parameter:
Here are the commands to reproduce the results:
export CAD_PATH=./media/demo2/NeedleHolder.ply
export RGB_PATH=./media/demo2/ThreeToolTest.png
export OUTPUT_DIR=./tmp/custom_dataset
python -m src.scripts.inference_custom --template_dir $OUTPUT_DIR --rgb_path $RGB_PATH --stability_score_thresh 0.5
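For completeness, here is a hypothetical sketch of how such a --stability_score_thresh flag can be parsed and forwarded; the argument names mirror the command above, but the actual wiring inside inference_custom.py is an assumption:

```python
import argparse

def build_parser():
    # Flags mirroring the command-line call above; defaults are
    # illustrative, not necessarily the repo's actual defaults.
    parser = argparse.ArgumentParser()
    parser.add_argument("--template_dir", required=True)
    parser.add_argument("--rgb_path", required=True)
    parser.add_argument("--stability_score_thresh", type=float, default=0.97)
    return parser

# Simulate the invocation from the commands above.
args = build_parser().parse_args(
    ["--template_dir", "./tmp/custom_dataset",
     "--rgb_path", "./media/demo2/ThreeToolTest.png",
     "--stability_score_thresh", "0.5"]
)
print(args.stability_score_thresh)  # value handed on to the mask generator
```

The parsed value would then be passed to SAM's automatic mask generator, which accepts a stability_score_thresh argument.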
I shared your example in this repo so that new users can be aware of it. Please let me know if you don't want to share this example. Thanks!
Thank you so much for the very quick response and great solution! Yes, I am okay with this being used as an example!
I can confirm that this change makes it work perfectly on all my tools, not just the needle holder!
I tried this code and it works beautifully on the provided datasets. Considering the shots are cluttered with all kinds of objects, there are occlusions, etc., I am extremely impressed by the performance.

However, as soon as I move to a custom dataset, this performance is not repeatable whatsoever. I am trying this out on surgical tools, for which I have an accurate CAD model. The images I tried it on are close-ups of the objects, the single object only, a white background, and no occlusion; in other words, as simple as it gets and technically a perfect template match. However, the model either predicts only a tiny part of the object or something completely wrong, like the entire background (everything but the object).

Could you maybe comment on the types of objects this works well on (the objects in your datasets seem a bit more bulky, while the surgical tools are more skinny), or whether there are any tricks to improving this performance?