Eli-YiLi opened this issue 1 year ago
Your work is really awesome! For your information, there is another solution that requires only CLIP, without any training or extra supervision.
Our work can achieve text-to-mask with SAM: https://github.com/xmed-lab/CLIP_Surgery This work approaches the problem from the perspective of CLIP's explainability. It is able to guide SAM to achieve text-to-mask without manual points. Besides, it enhances many open-vocabulary tasks, such as segmentation, multi-label classification, and multimodal visualization.
This is the jupyter demo: https://github.com/xmed-lab/CLIP_Surgery/blob/master/demo.ipynb
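Conceptually, the pipeline turns CLIP Surgery's text-to-image similarity heatmap into point prompts for SAM, replacing manual clicks. Below is a minimal NumPy-only sketch of that point-extraction step; the heatmap here is a toy stand-in for CLIP Surgery's actual output, and the SAM call at the end is shown only as a hypothetical comment (see the demo notebook above for the real API):

```python
import numpy as np

def heatmap_to_points(heatmap, num_points=3, min_dist=8):
    """Pick up to num_points peak locations from a similarity heatmap,
    suppressing further peaks within min_dist pixels of a chosen one."""
    h = heatmap.astype(float).copy()
    points = []
    for _ in range(num_points):
        y, x = np.unravel_index(np.argmax(h), h.shape)
        if h[y, x] <= 0:          # no positive activation left
            break
        points.append((x, y))     # SAM expects (x, y) coordinate order
        # zero out a disc around the chosen peak (non-max suppression)
        ys, xs = np.ogrid[:h.shape[0], :h.shape[1]]
        h[(ys - y) ** 2 + (xs - x) ** 2 < min_dist ** 2] = 0
    return np.array(points)

# Toy heatmap with two bright regions, standing in for CLIP Surgery output
hm = np.zeros((32, 32))
hm[5, 6] = 1.0
hm[20, 25] = 0.8
pts = heatmap_to_points(hm, num_points=2)
# pts could then prompt SAM, e.g. (hypothetical predictor object):
# masks, _, _ = predictor.predict(point_coords=pts,
#                                 point_labels=np.ones(len(pts)))
```

The non-max suppression keeps the prompts spread across distinct activated regions rather than clustering on a single peak.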
These are our segmentation results: ![image](https://user-images.githubusercontent.com/49093246/232215782-3caac15b-e646-4bde-8b1d-ffedb96f20a1.png)
This is our heatmap: ![image](https://user-images.githubusercontent.com/49093246/232216182-8a77793a-1090-4821-9722-57f077b7a182.png)
Excellent Work! We will highlight it in our README!