Open andrewn3 opened 2 months ago
I have the same issue
Text input (prompt) is only supported for 'referring_expression_segmentation', 'caption_to_phrase_grounding', and 'docvqa'
I found the answer in this video (https://www.youtube.com/watch?v=BRST8-yPD5A), at about the 5 minute mark.
Choose the "referring_expression_segmentation" option to segment anything. In other words, the fix is to pick the right task option.
If I use "caption_to_phrase_grounding" with multiple inputs in the text prompt, e.g. bike, red car, it highlights the segments in the image and I can get the separate output_mask_select to work. Sometimes you need to be specific, e.g. car won't select the car but red car will.
However, if I use the same text input for "referring_expression_segmentation", I can't get it to identify multiple segments; it only selects and highlights one of them. I've tried all different forms of prompts (e.g. "bike(and)red car" or "locate bike, red car in the image with mask"). It does work when highlighting segments separately, one item at a time.
I think this is a bug.
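Since "referring_expression_segmentation" does seem to work one item at a time, a possible workaround is to run it once per phrase and merge the resulting masks yourself. Below is a rough Python sketch of that idea only. It is not the node's actual code: `segment_one` is a hypothetical stand-in for whatever runs a single-phrase segmentation and returns a boolean mask, and `split_prompt`, `union_masks`, and `segment_phrases` are names made up for illustration.

```python
from typing import Callable, List, Sequence

# A mask is a height x width grid of booleans; True means the pixel is in the segment.
Mask = List[List[bool]]


def split_prompt(prompt: str) -> List[str]:
    """Split a comma-separated prompt such as 'bike, red car' into phrases."""
    return [part.strip() for part in prompt.split(",") if part.strip()]


def union_masks(masks: Sequence[Mask]) -> Mask:
    """Combine equally sized masks with a pixel-wise OR."""
    if not masks:
        raise ValueError("need at least one mask")
    height, width = len(masks[0]), len(masks[0][0])
    for mask in masks:
        if len(mask) != height or any(len(row) != width for row in mask):
            raise ValueError("all masks must have the same size")
    return [
        [any(mask[y][x] for mask in masks) for x in range(width)]
        for y in range(height)
    ]


def segment_phrases(prompt: str, segment_one: Callable[[str], Mask]) -> Mask:
    """Run the (hypothetical) single-phrase segmenter per phrase and merge the results."""
    phrases = split_prompt(prompt)
    return union_masks([segment_one(phrase) for phrase in phrases])


# Toy stand-in for the real model call, returning canned 2x2 masks.
def fake_segment_one(phrase: str) -> Mask:
    canned = {
        "bike": [[True, False], [False, False]],
        "red car": [[False, False], [False, True]],
    }
    return canned[phrase]


combined = segment_phrases("bike, red car", fake_segment_one)
print(combined)
```

In a real workflow, `fake_segment_one` would be replaced by one segmentation run per phrase, with the masks combined afterwards. Keeping the masks separate instead of merging them would also work if you need them individually.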