IDEA-Research / Grounded-Segment-Anything

Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
https://arxiv.org/abs/2401.14159
Apache License 2.0

demo code for None classes #216

Open sivaji123256 opened 1 year ago

sivaji123256 commented 1 year ago

Hi @nomorewzx @fengxiuyaun, thanks for the great work. I was trying to run the latest demo script in Colab. Since the objects in our data are custom, the class is returning a None value in the detections output. Because of that, it breaks before the detection outputs are passed to SAM. How can I convert None to another label? Any suggestions would be highly useful.

rentainhe commented 1 year ago

> Hi @nomorewzx @fengxiuyaun, thanks for the great work. I was trying to run the latest demo script in Colab. Since the objects in our data are custom, the class is returning a None value in the detections output. Because of that, it breaks before the detection outputs are passed to SAM. How can I convert None to another label? Any suggestions would be highly useful.

Thanks for reporting this issue, I've met the same bug as well. It happens because GroundingDINO refines the output phrase (e.g., All clouds -> clouds), which then raises an index error in the following function in groundingdino.util.inference:

    @staticmethod
    def phrases2classes(phrases: List[str], classes: List[str]) -> np.ndarray:
        class_ids = []
        for phrase in phrases:
            try:
                class_ids.append(classes.index(phrase))
            except ValueError:
                # the phrase does not exactly match any of the classes
                class_ids.append(None)
        return np.array(class_ids)
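
For illustration, here is a minimal reproduction of the mismatch (the class list and phrase values below are made up, not taken from the demo):

    classes = ["All clouds", "sky"]   # prompt classes passed to GroundingDINO
    phrases = ["clouds", "sky"]       # refined phrases returned by the model

    # "clouds" is not an exact member of classes, so list.index raises ValueError
    # and phrases2classes stores None, which later breaks SAM / the box annotator
    class_ids = []
    for phrase in phrases:
        try:
            class_ids.append(classes.index(phrase))
        except ValueError:
            class_ids.append(None)
    print(class_ids)  # [None, 1]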

I will try to solve this issue by updating this func. If you have any ideas about solving it, you can also create a new PR to fix this! Thanks a lot!

We will also try to solve this in the coming days~ stay tuned~

rentainhe commented 1 year ago

We've already solved it, in a not-so-elegant way, by fuzzy matching the class id and the output phrase in groundingdino.util.inference:

    @staticmethod
    def phrases2classes(phrases: List[str], classes: List[str]) -> np.ndarray:
        class_ids = []
        for phrase in phrases:
            try:
                # class_ids.append(classes.index(phrase))
                # fuzzy match: the refined phrase only has to be a substring of a class name
                class_ids.append(Model.find_index(phrase, classes))
            except ValueError:
                class_ids.append(None)
        return np.array(class_ids)

    @staticmethod
    def find_index(string, lst):
        # return the index of the first class name that contains the phrase, or -1 if none does
        for i, s in enumerate(lst):
            if string.lower() in s.lower():
                return i
        return -1

If you have some other way to solve this bug~ you can open a new PR for it @sivaji123256
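
As one possible alternative (not part of the repo, just a sketch), the matching could also fall back to difflib similarity so that reworded phrases still map to a class; the function name and the 0.6 cutoff below are my own assumptions:

    import difflib
    from typing import List, Optional

    def phrase_to_class_id(phrase: str, classes: List[str]) -> Optional[int]:
        lowered = [c.lower() for c in classes]
        # exact (case-insensitive) match first
        if phrase.lower() in lowered:
            return lowered.index(phrase.lower())
        # otherwise pick the most similar class name, if any is close enough
        match = difflib.get_close_matches(phrase.lower(), lowered, n=1, cutoff=0.6)
        return lowered.index(match[0]) if match else None

    # e.g. phrase_to_class_id("clouds", ["All clouds", "sky"]) returns 0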

sivaji123256 commented 1 year ago

Thanks for your response, I will try this and see. Do we have any scripts/code to convert the bounding boxes into COCO or Pascal VOC format, i.e. XML or JSON files, so that we can use them for training, check the performance, and thus compare with the manually annotated data?

rentainhe commented 1 year ago

> Thanks for your response, I will try this and see. Do we have any scripts/code to convert the bounding boxes into COCO or Pascal VOC format, i.e. XML or JSON files, so that we can use them for training, check the performance, and thus compare with the manually annotated data?

We will try to add a tutorial about how to dump the inference results to COCO format in the future. Meanwhile, you may find such code in the Roboflow tutorial: https://colab.research.google.com/github/roboflow-ai/notebooks/blob/main/notebooks/automated-dataset-annotation-and-evaluation-with-grounding-dino-and-sam.ipynb
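
Until such a tutorial is available, a rough sketch of dumping xyxy boxes to a COCO-style JSON could look like the snippet below; the function name and the way the boxes, class ids, and image metadata are passed in are assumptions, not the repo's API:

    import json

    def boxes_to_coco(xyxy, class_ids, classes, image_name, width, height, out_path):
        # COCO stores boxes as [x, y, width, height] and uses 1-based category ids
        images = [{"id": 1, "file_name": image_name, "width": width, "height": height}]
        categories = [{"id": i + 1, "name": name} for i, name in enumerate(classes)]
        annotations = []
        for ann_id, (box, cls) in enumerate(zip(xyxy, class_ids), start=1):
            x1, y1, x2, y2 = [float(v) for v in box]
            annotations.append({
                "id": ann_id,
                "image_id": 1,
                "category_id": int(cls) + 1,
                "bbox": [x1, y1, x2 - x1, y2 - y1],
                "area": (x2 - x1) * (y2 - y1),
                "iscrowd": 0,
            })
        with open(out_path, "w") as f:
            json.dump({"images": images, "annotations": annotations,
                       "categories": categories}, f)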

ichrnkv commented 8 months ago

Hello,

Faced the same problem with box_annotator, and solved it like this:

    import numpy as np

    # keep only the detections whose class_id was actually matched (drop the None entries)
    non_none_mask = np.where(detections.class_id == None, False, True)
    detections.xyxy = detections.xyxy[non_none_mask]
    detections.confidence = detections.confidence[non_none_mask]
    detections.class_id = detections.class_id[non_none_mask]
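
If you would rather keep those boxes instead of dropping them, a small helper along these lines (just a sketch; the function name, the fallback_id parameter, and the CLASSES name are illustrative) could remap None to a dedicated "unknown" class before passing the detections to SAM:

    import numpy as np

    def fill_none_class_ids(class_ids, fallback_id):
        # replace None entries with a fallback class id, e.g. an extra "unknown" class
        return np.array([fallback_id if c is None else int(c) for c in class_ids])

    # e.g. detections.class_id = fill_none_class_ids(detections.class_id, fallback_id=len(CLASSES))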