facebookresearch / segment-anything

The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Apache License 2.0
46.99k stars 5.56k forks

SOTA Model for Text Prompt Segmentation #575

Open xiaobanni opened 1 year ago

xiaobanni commented 1 year ago

I am looking for a state-of-the-art (SOTA) model for text prompt segmentation. Currently, I am aware of two choices: Grounded-Segment-Anything and SEEM. However, both of these models fail to meet my requirements.

Consider the following example: I want the model to segment the lane lines on the road, but the results from the aforementioned methods are as follows:

Grounded-Segment-Anything:

image

SEEM Model:

image

Unfortunately, neither of them can solve this problem effectively. I would greatly appreciate any recommendations you may have.

Any information regarding the timeline for the release of SAM text-prompt capabilities would be welcome.

emi-dm commented 1 year ago

I recommend this: https://github.com/luca-medeiros/lang-segment-anything

xiaobanni commented 1 year ago

Thank you for the recommendation. However, I have tried it and found that it is just an easier-to-read version of Grounded-Segment-Anything. It uses the same method: GroundingDINO translates the text prompt into a box prompt, which is then passed to SAM, so the results are similar to those of Grounded-Segment-Anything shown earlier. I believe that a segmentation model driven directly by the text prompt (rather than a two-stage pipeline) is necessary to address this issue and to support broader downstream applications.
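For readers unfamiliar with the two-stage pattern described here, the following is a minimal sketch of its control flow only. The names `two_stage_segment`, `toy_detect`, and `toy_segment` are hypothetical stand-ins invented for illustration; they are not the API of GroundingDINO, SAM, or lang-segment-anything. In the real pipelines the detector role is played by GroundingDINO and the segmenter role by SAM with a box prompt.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Image = List[List[int]]            # toy grayscale image: rows of pixel values
Box = Tuple[int, int, int, int]    # (x0, y0, x1, y1), with x1 and y1 exclusive
Mask = List[List[bool]]


@dataclass
class Detection:
    box: Box
    score: float


def two_stage_segment(
    image: Image,
    text_prompt: str,
    detect: Callable[[Image, str], List[Detection]],
    segment: Callable[[Image, Box], Mask],
    score_threshold: float = 0.3,  # assumed value, chosen only for this toy example
) -> List[Tuple[Detection, Mask]]:
    # Stage 1: the text prompt is consumed here and reduced to scored boxes.
    detections = [d for d in detect(image, text_prompt) if d.score >= score_threshold]
    # Stage 2: the segmenter receives only the image and a box, never the text.
    return [(d, segment(image, d.box)) for d in detections]


# Toy stand-ins (hypothetical) so the sketch runs without any model weights.
def toy_detect(image: Image, text_prompt: str) -> List[Detection]:
    height, width = len(image), len(image[0])
    # Pretend the detector found one confident region and one weak one.
    return [
        Detection((0, height // 2, width, height), 0.8),
        Detection((0, 0, 1, 1), 0.1),
    ]


def toy_segment(image: Image, box: Box) -> Mask:
    x0, y0, x1, y1 = box
    # Mark every pixel inside the box; a real segmenter would refine this region.
    return [
        [x0 <= x < x1 and y0 <= y < y1 for x in range(len(row))]
        for y, row in enumerate(image)
    ]


image = [[0] * 4 for _ in range(4)]
results = two_stage_segment(image, "lane line", toy_detect, toy_segment)
```

The sketch is meant to show that the text is reduced to boxes before segmentation begins, so the final mask can only be as specific as the box that stage 1 produces. That may be one reason thin, elongated targets such as lane lines come out poorly with this design.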

TerryYiDa commented 12 months ago

Do you have any good solutions? I'm facing the same problem now.

xiaobanni commented 12 months ago

@TerryYiDa No. I hope this issue can be used to track progress on advanced text-prompt segmentation models.

iacopo97 commented 10 months ago

I have the same problem. Did you find a solution?

YuetianW commented 9 months ago

Lol, I really wish the ability to use text prompts were opened up. A two-stage approach like Grounded-Segment-Anything is neither useful nor elegant. 😣

muhammadsr commented 6 months ago

Has anyone made progress on this issue?