rentainhe opened 1 year ago
We can further combine Grounded-Segment-Anything with diffusion models for inpainting, which means we can label and generate high-quality new data (with box and mask annotations) with this pipeline!
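A minimal sketch of the mask-guided compositing step such an inpainting pipeline relies on: generated pixels are blended in only where the segmentation mask is set, so the box and mask annotations for untouched regions remain valid. The function name and shapes here are illustrative, not part of the repo's API.

```python
import numpy as np

def composite_inpainted(original, generated, mask):
    """Blend an inpainted result back into the original image.

    Only pixels where `mask` is nonzero are taken from `generated`;
    everything else keeps the original content, so existing
    annotations outside the mask stay consistent.
    """
    mask3 = mask.astype(bool)[..., None]  # HxW -> HxWx1 for broadcasting over RGB
    return np.where(mask3, generated, original)

# toy example: black 4x4 image, white "generated" image, 2x2 mask
orig = np.zeros((4, 4, 3), dtype=np.uint8)
gen = np.full((4, 4, 3), 255, dtype=np.uint8)
mask = np.zeros((4, 4), dtype=np.uint8)
mask[1:3, 1:3] = 1
out = composite_inpainted(orig, gen, mask)
```

In a real pipeline `generated` would come from a diffusion inpainting model conditioned on the text prompt and the SAM mask.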
excellent~
nice~
Here I am again ...
The work above via grounding boxes is excellent, while we provide a simpler solution via CLIP's explainability.
Our work achieves text-to-mask with SAM using the CLIP model only, without any fine-tuning or extra supervision to generate the boxes: https://github.com/xmed-lab/CLIP_Surgery
Besides, it enhances many open-vocabulary tasks, such as segmentation, multi-label classification, and multimodal visualization.
This is the jupyter demo: https://github.com/xmed-lab/CLIP_Surgery/blob/master/demo.ipynb
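The idea above can be sketched as follows: CLIP's explainability yields a text-image similarity map, and high-similarity locations become foreground point prompts for SAM. This is a simplified illustration with assumed names (`similarity_to_points`, `threshold`, `top_k`); see the linked demo notebook for the actual CLIP_Surgery API.

```python
import numpy as np

def similarity_to_points(sim_map, threshold=0.8, top_k=3):
    """Pick point prompts for SAM from a CLIP similarity map.

    sim_map: HxW array of text-image similarities in [0, 1].
    Returns (points, labels): points are (x, y) pixel coordinates,
    labels are all 1 (foreground), the format SAM's point prompts use.
    """
    ys, xs = np.where(sim_map >= threshold)
    if len(xs) == 0:
        return np.empty((0, 2), dtype=int), np.empty((0,), dtype=int)
    # keep the top_k highest-similarity locations
    order = np.argsort(sim_map[ys, xs])[::-1][:top_k]
    points = np.stack([xs[order], ys[order]], axis=1)
    labels = np.ones(len(points), dtype=int)
    return points, labels

# toy example: a single high-similarity peak at (x=3, y=2)
sim = np.zeros((5, 5))
sim[2, 3] = 0.9
points, labels = similarity_to_points(sim)
```

The resulting `points` and `labels` arrays would be passed to SAM's predictor as point prompts in place of boxes.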
Hi! Thanks for releasing such impressive work! We found an interesting extension of this great work by combining a SoTA zero-shot detector with Segment-Anything, which can generate high-quality box and mask annotations from text inputs! The new project is here; we simply named it Grounded-Segment-Anything: https://github.com/IDEA-Research/Grounded-Segment-Anything
We take Grounding-DINO as the zero-shot detector to generate box prompts for segment-anything, and our visualization results are as follows:
We hope to maintain this project as a sub-project of segment-anything. We're also exploring combining Grounded-SAM with diffusion models for controllable image editing~
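A minimal sketch of the glue step this pipeline needs: Grounding-DINO outputs boxes as normalized (cx, cy, w, h), while SAM's predictor takes absolute (x0, y0, x1, y1) pixel boxes as prompts. The function name here is hypothetical; the conversion itself is the standard cxcywh-to-xyxy transform.

```python
import numpy as np

def dino_boxes_to_sam(boxes_cxcywh, img_w, img_h):
    """Convert normalized (cx, cy, w, h) boxes into absolute
    (x0, y0, x1, y1) pixel boxes usable as SAM box prompts."""
    boxes = np.asarray(boxes_cxcywh, dtype=float)
    cx, cy, w, h = boxes.T
    x0 = (cx - w / 2) * img_w
    y0 = (cy - h / 2) * img_h
    x1 = (cx + w / 2) * img_w
    y1 = (cy + h / 2) * img_h
    return np.stack([x0, y0, x1, y1], axis=1)

# toy example: one box centered in a 100x200 image, half its size
box = dino_boxes_to_sam([[0.5, 0.5, 0.5, 0.5]], img_w=100, img_h=200)
```

The converted boxes would then be fed to SAM (e.g. as box prompts to its predictor) to obtain the masks shown in the visualizations.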
More Examples