segment anything model for text-guided image editing

huggingface / diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.

https://huggingface.co/docs/diffusers

Apache License 2.0


segment anything model for text-guided image editing #3019

Closed feizc closed 1 year ago

feizc commented 1 year ago

Model/Pipeline/Scheduler description

Use the recently released Segment Anything Model (SAM) to support image editing. Specifically, SAM accepts a click or a text prompt to segment the target region, and the resulting segmentation is used to create the mask image for image editing. Compared with plot-based (manually drawn) masks, this interactive form is more intelligent and convenient.

Code reference: https://github.com/feizc/IEA https://github.com/IDEA-Research/Grounded-Segment-Anything https://github.com/facebookresearch/segment-anything
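For illustration, the click-to-mask-to-edit flow described above could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the implementation from any of the referenced repositories: `to_inpaint_mask` and `edit_clicked_region` are hypothetical helper names, the `grow` margin is an optional extra that the issue does not mention, and the SAM checkpoint path and inpainting model id are left as parameters for the caller to supply. The heavy dependencies (`numpy`, `Pillow`, `segment_anything`, `diffusers`) are imported lazily inside the second function.

```python
from typing import List, Sequence


def to_inpaint_mask(mask: Sequence[Sequence[bool]], grow: int = 0) -> List[List[int]]:
    """Convert a boolean segmentation mask into 0/255 rows (255 = repaint).

    `grow` optionally expands the region by that many pixels (square
    neighbourhood). This is an illustrative assumption, not part of SAM.
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if not mask[y][x]:
                continue
            # Mark the pixel and its neighbourhood, clipped to the image bounds.
            for ny in range(max(0, y - grow), min(height, y + grow + 1)):
                for nx in range(max(0, x - grow), min(width, x + grow + 1)):
                    out[ny][nx] = 255
    return out


def edit_clicked_region(image, click_xy, prompt, sam_checkpoint,
                        inpaint_model_id, sam_model_type="vit_h"):
    """Segment the object under `click_xy` with SAM, then inpaint it from `prompt`.

    `image` is a PIL image; `sam_checkpoint` and `inpaint_model_id` are
    supplied by the caller (hypothetical parameters for this sketch).
    """
    import numpy as np
    from PIL import Image
    from segment_anything import SamPredictor, sam_model_registry
    from diffusers import StableDiffusionInpaintPipeline

    sam = sam_model_registry[sam_model_type](checkpoint=sam_checkpoint)
    predictor = SamPredictor(sam)
    predictor.set_image(np.array(image.convert("RGB")))

    # A single foreground click (label 1); SAM returns several candidate masks.
    masks, scores, _ = predictor.predict(
        point_coords=np.array([click_xy]),
        point_labels=np.array([1]),
        multimask_output=True,
    )
    best = masks[int(scores.argmax())]  # boolean H x W array

    # The pure-Python helper is slow on large images; fine for a sketch.
    mask_pixels = np.array(to_inpaint_mask(best.tolist(), grow=4), dtype=np.uint8)
    mask_image = Image.fromarray(mask_pixels)

    pipe = StableDiffusionInpaintPipeline.from_pretrained(inpaint_model_id)
    return pipe(prompt=prompt, image=image, mask_image=mask_image).images[0]
```

The white (255) pixels of the mask mark the region the inpainting pipeline repaints from the prompt; everything else in the image is kept.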

Open source status

[X] The model implementation is available
[X] The model weights are available (Only relevant if addition is not a scheduler).

Provide useful links for the implementation

No response

0xbitches commented 1 year ago

This is a cool idea. It looks like the first repo is using CLIP+SAM, and according to some anecdotal evidence, this architecture does not appear to work too well. I found https://github.com/IDEA-Research/Grounded-Segment-Anything (GroundingDINO+SAM), which claims the inpainting results are better.
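In a GroundingDINO+SAM setup, the text prompt first yields a bounding box, which is then handed to SAM as a box prompt. One small glue step is converting the detector's box format. The sketch below assumes the detector returns boxes as normalized `(cx, cy, w, h)`, which is my understanding of GroundingDINO's inference helper; `cxcywh_to_xyxy_pixels` is a hypothetical helper name, and the clamping to the image bounds is an added safety choice.

```python
from typing import Sequence, Tuple


def cxcywh_to_xyxy_pixels(
    box: Sequence[float], image_width: int, image_height: int
) -> Tuple[float, float, float, float]:
    """Convert a normalized (cx, cy, w, h) box to absolute (x0, y0, x1, y1) pixels.

    The result is clamped to the image bounds.
    """
    cx, cy, w, h = box
    x0 = (cx - w / 2) * image_width
    y0 = (cy - h / 2) * image_height
    x1 = (cx + w / 2) * image_width
    y1 = (cy + h / 2) * image_height
    return (
        max(0.0, x0),
        max(0.0, y0),
        min(float(image_width), x1),
        min(float(image_height), y1),
    )
```

The resulting pixel box could then be passed to SAM, for example as `predictor.predict(box=np.array(xyxy), multimask_output=False)` on a `SamPredictor`, and the returned mask used as the inpainting mask.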

patrickvonplaten commented 1 year ago

Hey @feizc,

The SAM model is really cool and it's great to see that https://github.com/IDEA-Research/Grounded-Segment-Anything uses it with diffusers. We could maybe add a community pipeline to showcase this pipeline. It might however be a bit too high-level to implement as a "core" pipeline. cc @sayakpaul

whbzju commented 1 year ago

Glad to hear it. Is there any new progress?

github-actions[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

patrickvonplaten commented 1 year ago

cc @sayakpaul would be nice to have such a community pipeline / hf agent tool

sayakpaul commented 1 year ago

I am not aware of a community pipeline yet. But it should be possible to extend this notebook to an HF agent tool.

andysingal commented 1 year ago

@patrickvonplaten @sayakpaul Any updates on this pipeline? I am trying to use GroundingDINO with Stable Diffusion. While trying it on a custom dataset, I got an error, reported in the issue below.

Issue: https://github.com/IDEA-Research/GroundingDINO/issues/184
Colab: https://colab.research.google.com/drive/1uObAs_PI6caKHhwZRN3mQSzkc3zxrkeR?usp=sharing
Dataset: https://www.kaggle.com/datasets/abbymorgan/animals-toy-dataset

patrickvonplaten commented 1 year ago

Hey @andysingal,

I really appreciate your motivation, but many of the issues that you post are far from issue reporting and sound more like requests for help with personal projects.

The issue tracker of open-source libraries should ideally only be used for issue reporting so that the library can improve. At the moment, your issues take too much attention of the core maintainers (who have to review 200+ issues / comments every day) away from issues that improve the library.

Could you please try to instead use the forum: https://discuss.huggingface.co/ or discord: https://discord.com/invite/G7tWnz98XR for such questions?