Open Yuqifan1117 opened 1 year ago
Hello @Yuqifan1117,
Congratulations on creating a project that combines SAM and Grounded DINO for an automatic labeling pipeline using only a single label word! It's great to see the creative application of these cutting-edge models in new ways.
By utilizing Grounded DINO and SAM, your pipeline generates multiple annotations automatically and disambiguates them using CLIP scores to obtain high-quality labels for various categories. This approach can significantly streamline the labeling process, especially in situations where manual annotation is time-consuming or challenging.
If you're looking for feedback or discussions, you can consider posting your project to relevant forums or subreddits where the machine learning community can engage with your work. Some examples include:
- r/MachineLearning
- r/learnmachinelearning
- r/computervision
- AI Stack Exchange

Sharing your project on these platforms can help you get valuable insights, suggestions, and feedback from experts and enthusiasts in the field. Don't forget to provide a clear explanation of your project, its goals, and any challenges you faced or would like to discuss.
Good luck with your project, and I hope you receive valuable feedback and discussions!
This is amazing work, @Yuqifan1117! I just read that SAM does not provide textual labels for its masks, and I was thinking of creating an automated pipeline that fixes bad-quality images from Stable Diffusion by using SAM and labels to detect too many hands/fingers or eyes/faces, the lack of desired objects at the desired angles, etc. I then searched this repo for the keyword "label", hoping to find a discussion on when SAM will support labels, and found this gem. Amazing!
Annotate anything now with only a label word set!
Hi! I implemented a project that combines SAM and Grounded DINO into an automatic labeling pipeline that needs only a single label word (https://github.com/Yuqifan1117/Annotation-anything-pipeline). Any discussion is welcome~ Given a target label set with arbitrary categories, we utilize Grounded DINO and SAM to obtain multiple annotations automatically and disambiguate them by CLIP scores to obtain high-quality labels.
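To make the disambiguation step a bit more concrete, here is a minimal sketch of how candidate regions might be resolved to a single label from CLIP scores. It is not taken from the linked repository: the `Candidate` class, the `disambiguate` function, and the `min_prob` threshold are all hypothetical, and the per-label CLIP similarities are assumed to have been computed already for each region proposed by Grounded DINO and segmented by SAM.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1), assumed pixel coordinates


@dataclass
class Candidate:
    """One candidate region (hypothetical container, not from the repo)."""
    box: Box
    # label -> CLIP image-text similarity for the cropped region,
    # assumed to be computed elsewhere.
    clip_scores: Dict[str, float]


def softmax(scores: Dict[str, float]) -> Dict[str, float]:
    """Turn raw similarities into probabilities over the label set."""
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {label: math.exp(s - top) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


def disambiguate(
    candidates: List[Candidate], min_prob: float = 0.6
) -> List[Tuple[Box, str, float]]:
    """Keep each region's best label only if it is confident enough.

    min_prob is an assumed threshold chosen for illustration.
    """
    kept = []
    for cand in candidates:
        probs = softmax(cand.clip_scores)
        best = max(probs, key=probs.get)
        if probs[best] >= min_prob:
            kept.append((cand.box, best, probs[best]))
    return kept


candidates = [
    Candidate(box=(10, 10, 50, 50), clip_scores={"cat": 5.0, "dog": 1.0}),
    Candidate(box=(60, 60, 90, 90), clip_scores={"cat": 2.0, "dog": 2.0}),
]
labels = disambiguate(candidates)
for box, label, prob in labels:
    print(box, label, round(prob, 3))
```

In this sketch the second region is dropped because its two labels score equally, so neither clears the threshold. The actual project may well resolve ambiguous regions differently.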