Open luoshuiyue opened 1 week ago
One thing to try if you haven't already is using the different models (i.e. large vs. base), since they behave differently and one might work better than the other in some cases (i.e. large isn't always the best). It's also worth checking the different mask outputs (from multimask), since sometimes there can be one good mask even if the rest aren't great.
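Picking among the multimask outputs can also be scripted; a minimal numpy sketch (the masks/scores arrays stand in for the predictor's multimask output, the values are made up):

```python
import numpy as np

# Stand-ins for the (3, H, W) masks and (3,) IoU scores that a
# SAM predictor returns when multimask_output=True
masks = np.zeros((3, 4, 4), dtype=bool)
masks[1, 1:3, 1:3] = True  # pretend the 2nd mask is the good one
scores = np.array([0.55, 0.91, 0.62])

# Keep the mask the model itself rates highest, though it's still
# worth eyeballing all three: the top-scored one isn't always best
best_idx = int(np.argmax(scores))
best_mask = masks[best_idx]
print(best_idx, best_mask.sum())  # -> 1 4
```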
I'd also recommend trying to use as few prompts as possible. From what I've seen, the quality of the output really starts to drop once there are lots of prompts. In the worst case, where the masking isn't getting everything needed, you could try masking different pieces separately (using just 1 or 2 prompts) and combining the masks afterwards if that works for your use case (though it is inconvenient...).
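Combining the per-piece masks afterwards is just a logical OR; a small numpy sketch (the toy masks stand in for the outputs of separate 1-2 prompt runs):

```python
import numpy as np

# Two boolean masks from separate few-prompt runs (toy 4x4 examples)
mask_legs = np.zeros((4, 4), dtype=bool)
mask_legs[2:, :] = True
mask_torso = np.zeros((4, 4), dtype=bool)
mask_torso[:2, 1:3] = True

# Union of the pieces gives the full-object mask
combined = np.logical_or(mask_legs, mask_torso)
print(combined.sum())  # -> 12
```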
And lastly, if you haven't already tried it, box prompts sometimes work well for objects that have lots of distinct areas like the person in the picture (i.e. legs + shorts + shirt + arms etc.). For example, one box prompt (using the large model) does fairly well on the last picture at least:
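For reference, a box prompt is just an (x1, y1, x2, y2) pixel-coordinate array covering the whole object; a minimal sketch (the coordinates are made up, and the commented call assumes a SamPredictor-style API already set up on the image):

```python
import numpy as np

# One box around the whole person (x1, y1, x2, y2), covering
# legs + shorts + shirt + arms in a single prompt
person_box = np.array([120, 40, 310, 620])

# With a predictor already set_image'd, the call would look roughly
# like this (not run here, since it needs model weights):
# masks, scores, _ = predictor.predict(box=person_box, multimask_output=True)

box_w = person_box[2] - person_box[0]
box_h = person_box[3] - person_box[1]
print(box_w, box_h)  # -> 190 580
```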
Thanks. I changed to the base plus model and the results didn't get better. I used the bbox setting just by copying the code from the Jupyter notebook and putting it in a for loop, and the improvement is very small. So, I want to ask:
How to use the result of automatic_mask_generator_example.ipynb? I want to get the mask of the person in the middle:
How can I get the result you showed in the GIF in your previous reply?
That GIF is a screen capture of using this script.
How to use the result of automatic_mask_generator_example.ipynb? I want to get the mask of the person in the middle
I think it would be tricky to do with the auto mask generator alone. The default point grid covers the whole image and is going to pick up loads of stuff in the background that will make it hard to deal with, so you could try using a custom point_grid that is limited to the center of the image. You could also try adjusting the min_mask_region_area setting, to see if that can help to filter out 'small' masks.
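A custom point grid limited to the image center can be built directly; a rough sketch, assuming the generator accepts normalized (x, y) points in [0, 1] via its point_grids argument (as in the original SamAutomaticMaskGenerator):

```python
import numpy as np

def center_point_grid(n_per_side=8, center_frac=0.5):
    """Regular grid of normalized (x, y) points covering only the
    central `center_frac` portion of the image."""
    lo = 0.5 - center_frac / 2
    hi = 0.5 + center_frac / 2
    coords = np.linspace(lo, hi, n_per_side)
    xs, ys = np.meshgrid(coords, coords)
    return np.stack([xs.ravel(), ys.ravel()], axis=1)  # (N, 2)

grid = center_point_grid(8, 0.5)
# It would be passed along the lines of (illustrative, needs weights):
# SamAutomaticMaskGenerator(sam, point_grids=[grid], min_mask_region_area=500)
print(grid.shape, grid.min(), grid.max())  # -> (64, 2) 0.25 0.75
```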
If you don't mind bringing in other models, you could also try using an object (person) detector to at least get a bounding box around the person and use that to ignore all the masks outside. Or similarly, you could maybe use a depth prediction model to ignore any masks that come from parts of the image that are 'too far away' to be the person. Otherwise I think it's difficult to target specific objects with the auto mask generator, since the SAM models alone don't have a way to classify the segmentation results.
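Filtering the auto-generator output with a detector box could look like the sketch below; it assumes the generator's usual list-of-dicts output, where each entry carries an XYWH 'bbox' (the person box and toy entries here are made-up stand-ins for real detector/generator results):

```python
def mask_inside_box(mask_entry, person_box_xyxy, min_overlap=0.8):
    """Keep a mask if most of its bbox falls inside the detector box.
    mask_entry['bbox'] is XYWH, person_box_xyxy is (x1, y1, x2, y2)."""
    mx, my, mw, mh = mask_entry["bbox"]
    px1, py1, px2, py2 = person_box_xyxy
    # Intersection area of the two boxes
    ix = max(0, min(mx + mw, px2) - max(mx, px1))
    iy = max(0, min(my + mh, py2) - max(my, py1))
    return ix * iy >= min_overlap * (mw * mh)

# Toy auto-generator-style entries: one on the person, one background
results = [
    {"bbox": [100, 100, 50, 80]},   # inside the person box
    {"bbox": [400, 10, 60, 60]},    # background clutter
]
person_box = (80, 60, 200, 300)     # pretend detector output (x1, y1, x2, y2)
kept = [r for r in results if mask_inside_box(r, person_box)]
print(len(kept))  # -> 1
```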
The following is the result I predicted. May I ask if there is any way to improve it? I have adjusted mask_threshold to -1.0, -0.5, and -0.2, and max_hole_area to 1 and 20. None of these worked.
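For context on what adjusting that setting does: mask_threshold acts as a cutoff on the raw mask logits, so lowering it can only grow the mask; a toy numpy sketch (the logit values are made up):

```python
import numpy as np

# Raw mask logits from the model (toy values); the binary mask is
# logits > mask_threshold, so lowering the threshold adds pixels
logits = np.array([[-0.9, -0.4, 0.2],
                   [-0.6,  0.1, 1.3],
                   [-1.2, -0.3, 0.8]])
for thresh in (0.0, -0.5, -1.0):
    print(thresh, (logits > thresh).sum())  # mask grows: 4, 6, 8 pixels
```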