facebookresearch / segment-anything

The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Apache License 2.0
46.76k stars 5.54k forks

How to get the classes of predicted masks? #27

Open lucasjinreal opened 1 year ago

lucasjinreal commented 1 year ago

Does every mask have a semantic class index or name?

zaiquanyang commented 1 year ago

It seems to be a kind of pixel clustering.

lucasjinreal commented 1 year ago

@zaiquanyang Clustering doesn't tell you which part belongs to which class. For instance, if I use a text prompt to segment a chair and a cup, how can I know which part belongs to the cup?

zaiquanyang commented 1 year ago

SAM's output is class-agnostic, so you may need an extra point-like prompt for a specific object.

alexw994 commented 1 year ago

So that means I need to put the class names in the prompts. If I want to segment 80 classes, do I need to run SAM inference 80 times?

zaiquanyang commented 1 year ago

I guess you only need to run inference once. According to the paper, when given text prompts, it works like previous zero-shot segmentation methods, and you can obtain the mask corresponding to each class name.

broullon commented 1 year ago

Hi, does anyone know if it's possible to get the label/class name for each mask? They say the dataset has 11 million images and 1.1 billion masks.

Jiaqi-Chen-00 commented 1 year ago

We have developed a project that provides an automated data annotation engine for the SA-1B dataset, which offers basic categories from COCO and ADE20K, as well as open-vocabulary category labeling based on image captions. However, we only provide the code and users must run it themselves to complete the annotation process.

Project: Semantic Segment Anything
Repo: https://github.com/fudan-zvg/Semantic-Segment-Anything
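
For readers wondering how category labels can be attached to class-agnostic masks at all, here is a minimal sketch of one common idea: run a separate semantic segmenter once, then let the pixels inside each mask vote for a class. This is an illustration only and is not taken from the linked repository. The function name, the nested-list mask format, and the toy class list are all assumptions made for this example.

```python
from collections import Counter


def label_masks_by_majority_vote(masks, semantic_map, class_names):
    """Give each class-agnostic mask the class that covers most of its pixels.

    masks:        list of 2D lists of booleans (hypothetical stand-in for SAM masks)
    semantic_map: 2D list of per-pixel class ids from a separate semantic segmenter
    class_names:  list mapping class id -> class name
    """
    labels = []
    for mask in masks:
        # Count the class id of every pixel that lies inside the mask.
        votes = Counter(
            semantic_map[r][c]
            for r, row in enumerate(mask)
            for c, inside in enumerate(row)
            if inside
        )
        if not votes:
            labels.append(None)  # empty mask: nothing to vote on
            continue
        class_id, _ = votes.most_common(1)[0]
        labels.append(class_names[class_id])
    return labels


# Toy 3x4 image: class ids 0 = background, 1 = chair, 2 = cup (made-up data).
class_names = ["background", "chair", "cup"]
semantic_map = [
    [1, 1, 0, 2],
    [1, 1, 0, 2],
    [1, 0, 0, 2],
]
mask_a = [
    [True, True, False, False],
    [True, True, False, False],
    [True, True, False, False],
]
mask_b = [
    [False, False, False, True],
    [False, False, False, True],
    [False, False, False, True],
]

labels = label_masks_by_majority_vote([mask_a, mask_b], semantic_map, class_names)
print(labels)
```

With this scheme the masks are generated once and labeled afterwards, so the number of classes does not multiply the number of segmentation passes.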

jvpassarelli commented 1 year ago

I wrote a simple example notebook that lets you generate overlays for the classes you provide. The mask labels come from CLIP. Maybe this is helpful!

Repo: https://github.com/jvpassarelli/sam-clip-segmentation
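
As a rough idea of what CLIP-based labeling looks like, here is a minimal sketch: embed the image crop for each mask, embed each class name as text, and pick the class whose text embedding is most similar. It is not the notebook's code. The embeddings below are made-up 3-dimensional vectors standing in for real CLIP outputs, and `assign_labels` is a hypothetical helper name.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def assign_labels(mask_embeddings, text_embeddings, class_names):
    """Return (class name, similarity) for each mask embedding.

    mask_embeddings: one vector per mask crop (assumed to come from an image encoder)
    text_embeddings: one vector per class name (assumed to come from a text encoder)
    """
    results = []
    for emb in mask_embeddings:
        scores = [cosine_similarity(emb, t) for t in text_embeddings]
        best = max(range(len(scores)), key=scores.__getitem__)
        results.append((class_names[best], scores[best]))
    return results


# Made-up embeddings for illustration; real CLIP vectors have hundreds of dimensions.
class_names = ["chair", "cup"]
text_embeddings = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
mask_embeddings = [[0.9, 0.1, 0.0], [0.2, 0.8, 0.1]]

results = assign_labels(mask_embeddings, text_embeddings, class_names)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

In practice you would probably also want a similarity threshold so that masks matching none of the provided classes can be left unlabeled.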