Closed: jakob-ropers-snkeos closed this issue 1 year ago.
Can you share the results here (+ input images, renderings of the CAD model)? In the custom inference script, I set confidence_score=0.5 by default. It may be helpful to use a lower threshold to visualize more detections.
Given what you described ("close-ups of the objects, the single object only, a white background, and no occlusion"), it is strange that there is still no segmentation on the objects, since SAM or FastSAM should segment everything. You can reduce confidence_threshold first to see all detections and make sure that there are at least some masks on the objects.
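To illustrate why a high confidence threshold can hide every detection, here is a minimal, hypothetical sketch of score-based filtering; the `proposals` structure and field names are assumptions for illustration, not CNOS's actual internals:

```python
# Hypothetical sketch: filtering mask proposals by confidence score.
# Thin or tiny objects tend to receive low scores, so a default
# threshold of 0.5 can discard every mask on them.

def filter_by_confidence(proposals, confidence_threshold):
    """Keep only proposals whose score clears the threshold."""
    return [p for p in proposals if p["score"] >= confidence_threshold]

proposals = [
    {"mask_id": 0, "score": 0.41},  # partial mask on the tool
    {"mask_id": 1, "score": 0.33},  # background mask
]

# At the default threshold nothing survives; lowering it reveals
# that the segmenter did produce masks on the object after all.
visible_default = filter_by_confidence(proposals, 0.5)
visible_lowered = filter_by_confidence(proposals, 0.25)
print(len(visible_default), len(visible_lowered))
```

This is why checking a lower threshold first tells you whether the problem is "no masks at all" or just "masks filtered out".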
Thank you for your very fast reply!
So here is an example where it worked best for me. As you can see, it still does not segment the entire object:
When I do it for another tool, I only get the background for some reason:
I am confused because, as soon as I use a CAD model and RGB image from one of the example datasets, it segments the object absolutely perfectly despite the object being far away from the camera in an extremely cluttered scene. Lowering the confidence threshold did not help, by the way. I appreciate your help!
CNOS should work on this example. Can you share the RGB image + CAD model so that I can try it?
Sure! Here are two example images and the CAD model of the needle holder:
https://drive.google.com/drive/folders/1pn7letIZNACC7D1u-pwCyPvzlMBdfHJ3?usp=drive_link
I added a stability_score_thresh parameter to the inference_custom.py script so that the segmentation model can also output less stable masks (this usually happens for very tiny or thin objects like yours). Here is the result after fine-tuning this SAM parameter:
Here are the commands to reproduce the results:
export CAD_PATH=./media/demo2/NeedleHolder.ply
export RGB_PATH=./media/demo2/ThreeToolTest.png
export OUTPUT_DIR=./tmp/custom_dataset
python -m src.scripts.inference_custom --template_dir $OUTPUT_DIR --rgb_path $RGB_PATH --stability_score_thresh 0.5
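For completeness, here is a hypothetical sketch of how such a --stability_score_thresh flag can be parsed and forwarded; the argument names mirror the command above, but the actual wiring inside inference_custom.py is an assumption:

```python
import argparse

def build_parser():
    # Flags mirroring the command-line call above; defaults are
    # illustrative, not necessarily the repo's actual defaults.
    parser = argparse.ArgumentParser()
    parser.add_argument("--template_dir", required=True)
    parser.add_argument("--rgb_path", required=True)
    parser.add_argument("--stability_score_thresh", type=float, default=0.97)
    return parser

# Simulate the invocation from the commands above.
args = build_parser().parse_args(
    ["--template_dir", "./tmp/custom_dataset",
     "--rgb_path", "./media/demo2/ThreeToolTest.png",
     "--stability_score_thresh", "0.5"]
)
print(args.stability_score_thresh)  # value handed on to the mask generator
```

The parsed value would then be passed to SAM's automatic mask generator, which accepts a stability_score_thresh argument.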
I shared your example in this repo so that new users can be aware of it. Please let me know if you don't want to share this example. Thanks!
Thank you so much for the very quick response and great solution! Yes, I am okay with this being used as an example!
I can confirm that this change makes it work perfectly on all my tools, not just the needle holder!
I tried this code and it works beautifully on the provided datasets. Considering the shots are cluttered with all kinds of objects, there are occlusions, etc., I am extremely impressed by the performance.

However, as soon as I move to a custom dataset, this performance is not repeatable whatsoever. I am trying this out on surgical tools, for which I have an accurate CAD model. The images I tried it on are close-ups of the objects, the single object only, a white background, and no occlusion; in other words, as simple as it gets and technically a perfect template match. However, the model either predicts only a tiny part of the object or something completely wrong, like the entire background (everything but the object).

Could you maybe comment on the types of objects this works well on (the objects in your datasets seem a bit more bulky, while the surgical tools are more skinny), or whether there are any tricks to improving this performance?