IDEA-Research / Grounded-Segment-Anything

Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect , Segment and Generate Anything
https://arxiv.org/abs/2401.14159
Apache License 2.0

Support EdgeSAM in Grounded-SAM #418

Closed rentainhe closed 9 months ago

rentainhe commented 9 months ago

In this PR, we add support for the newly proposed efficient segment anything model EdgeSAM to Grounded-SAM, forming Grounded-Edge-SAM.

Note that we applied a small hack in rep_vit.py so that it returns more than just the image features, adapting it to our hacked predictor.py (line 91).

SAM-HQ needs the image encoder to return intermediate features to enhance its mask decoder, so the predictor expects more outputs than the plain SAM model:

self.features, self.interm_features = self.model.image_encoder(input_image)

so we modify rep_vit.py as follows:

x = self.neck(x)
# hack this place: we modified the predictor of SAM for HQ-SAM in
# segment_anything/segment_anything/predictor.py (line 91) to return interm features of the backbone
# self.features, self.interm_features = self.model.image_encoder(input_image)
return x, None

This adds a placeholder None so the two-value unpacking in the predictor does not fail.
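To illustrate why the placeholder is needed, here is a minimal self-contained sketch (the `SingleOutputEncoder`, `PlaceholderEncoder`, and `set_image` names are hypothetical stand-ins, not the repo's actual classes). The HQ-SAM-style predictor unpacks two values from the image encoder, so a backbone that returns a single tensor raises a `ValueError` at that line, while returning `(features, None)` works:

```python
class SingleOutputEncoder:
    """Hypothetical backbone that returns only image features (original EdgeSAM style)."""
    def __call__(self, image):
        return f"features({image})"

class PlaceholderEncoder:
    """Hypothetical backbone patched to return (features, None), as in the modified rep_vit.py."""
    def __call__(self, image):
        # None stands in for interm_features, which EdgeSAM does not produce
        return f"features({image})", None

def set_image(encoder, image):
    # Mirrors the modified predictor.py line 91, which expects two outputs:
    #     self.features, self.interm_features = self.model.image_encoder(input_image)
    features, interm_features = encoder(image)
    return features, interm_features

# The patched encoder satisfies the two-output predictor:
feats, interm = set_image(PlaceholderEncoder(), "img")
assert interm is None

# The unpatched encoder fails, because its single return value
# cannot be unpacked into the two names the predictor expects:
try:
    set_image(SingleOutputEncoder(), "img")
except ValueError:
    print("single-output encoder breaks the two-output predictor")
```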

We will try our best to make the code cleaner and easier to use.

TODO List