kijai / ComfyUI-Florence2

Inference Microsoft Florence2 VLM
MIT License

referring_expression_segmentation and multiple segments #63

Open andrewn3 opened 2 months ago

andrewn3 commented 2 months ago

If I use "caption_to_phrase_grounding" with multiple inputs in the text prompt (e.g. bike, red car), it highlights the segments in the image, and I can get the separate output_mask_select to work. Sometimes you need to be specific: for example, "car" won't select the car, but "red car" will.

However, if I use the same text input for "referring_expression_segmentation", I can't get it to identify multiple segments; it only selects and highlights one of them. I've tried different forms of prompt (e.g. "bike(and)red car" or "locate bike, red car in the image with mask"). It does work when highlighting segments separately, one item at a time.

I think this is a bug.
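
Since the task does work one item at a time, one possible workaround is to run it once per phrase and merge the resulting masks. The snippet below is only a sketch of that idea, not the node's code: `segment_one` is a hypothetical stand-in for whatever runs "referring_expression_segmentation" on a single phrase, and the mask format (rows of booleans) is an assumption made for illustration.

```python
from typing import Callable, List, Sequence

# Assumed mask format for this sketch: a list of rows, each a list of booleans.
Mask = List[List[bool]]


def split_prompt(prompt: str) -> List[str]:
    """Split a comma-separated prompt such as 'bike, red car' into phrases."""
    return [part.strip() for part in prompt.split(",") if part.strip()]


def union_masks(masks: Sequence[Mask]) -> Mask:
    """Pixel-wise OR of masks that all share the same height and width."""
    if not masks:
        return []
    height, width = len(masks[0]), len(masks[0][0])
    return [
        [any(mask[y][x] for mask in masks) for x in range(width)]
        for y in range(height)
    ]


def segment_phrases(prompt: str, segment_one: Callable[[str], Mask]) -> Mask:
    """Run the single-phrase segmenter once per phrase and merge the results."""
    return union_masks([segment_one(phrase) for phrase in split_prompt(prompt)])


def fake_segment_one(phrase: str) -> Mask:
    """Hypothetical stub standing in for one segmentation run on one phrase."""
    canned = {
        "bike": [[True, False], [False, False]],
        "red car": [[False, False], [False, True]],
    }
    return canned[phrase]


combined = segment_phrases("bike, red car", fake_segment_one)
```

With the stub above, `combined` contains the pixels of both the "bike" and the "red car" masks. In a real workflow the stub would be replaced by one segmentation run per phrase.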

Lilien86 commented 2 months ago

I have the same issue

Text input (prompt) is only supported for 'referring_expression_segmentation', 'caption_to_phrase_grounding', and 'docvqa'
Lilien86 commented 2 months ago

I found the answer in this video (https://www.youtube.com/watch?v=BRST8-yPD5A), at the 5-minute mark.

allenby168 commented 1 day ago

Choose the "referring_expression_segmentation" option to segment anything. It just means choosing the right option.