Open senlin-ali opened 1 year ago
You can see this: https://github.com/IDEA-Research/Grounded-Segment-Anything
Prior to proposing our new method, we experimented with Grounding DINO on the FSC147 dataset and found that it frequently detects all objects of the same class with a single large bounding box, especially when the objects are small. It would be great if you could revisit this approach and share your results here.
In an experiment using Grounding DINO and Grounded-SAM on a dataset containing objects of 69 x 47 px, the results are consistent with @shizenglin's findings. To clarify, instead of identifying each individual animal (the "object"), both Grounding DINO and Grounded-SAM enclosed the whole herd in a single large bounding box. This suggests a need for new methods that tackle these challenging scenarios.
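For anyone who wants to check their own detections for this failure mode, here is a rough sketch of one way to flag boxes that are far larger than the expected object size. It is not part of Grounding DINO or Grounded-SAM: the function name `flag_oversized_boxes`, the `(x1, y1, x2, y2)` pixel box format, and the `max_area_ratio` threshold are all assumptions for illustration, and the example boxes are made up.

```python
def flag_oversized_boxes(boxes, expected_w=69, expected_h=47, max_area_ratio=4.0):
    """Return indices of boxes whose area exceeds max_area_ratio times
    the expected single-object area.

    boxes: iterable of (x1, y1, x2, y2) in pixels (assumed format).
    expected_w, expected_h: typical object size in pixels (69 x 47 here).
    max_area_ratio: hypothetical threshold; tune it for your data.
    """
    expected_area = expected_w * expected_h
    flagged = []
    for i, (x1, y1, x2, y2) in enumerate(boxes):
        # Clamp to zero so malformed boxes do not produce negative areas.
        area = max(0, x2 - x1) * max(0, y2 - y1)
        if area > max_area_ratio * expected_area:
            flagged.append(i)
    return flagged


# Made-up example: one box of object size, one box covering a whole group.
example_boxes = [(10, 10, 79, 57), (0, 0, 600, 400)]
print(flag_oversized_boxes(example_boxes))
```

A flagged box is only a hint that several objects may have been merged into one detection. It says nothing about how many objects the box actually contains.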
What is grounding-sam, and what do you mean by "prob not better than grounding-sam"? Sorry, I cannot fully understand your question. Could you explain more?