ziqi-jin / finetune-anything

Fine-tune SAM (Segment Anything Model) for computer vision tasks such as semantic segmentation, matting, and detection in specific scenarios.
MIT License

Further details #51

Open · 25benjaminli opened this issue 11 months ago

25benjaminli commented 11 months ago

Hi, will there be a paper or other documentation describing all the changes made to the original SAM architecture?

25benjaminli commented 11 months ago

@ziqi-jin Also, I looked into the code and found the following in extend_sam/extend_sam.py.

def forward(self, img):
    x = self.img_adapter(img)
    points = None
    boxes = None
    masks = None

    sparse_embeddings, dense_embeddings = self.prompt_adapter(
        points=points,
        boxes=boxes,
        masks=masks,
    )

Does this mean that finetune-anything does not use any prompts by default? If so, would it be viable to implement automatic box generation for each class, and would that improve performance? A rough sketch of what I mean is below.
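
To make the idea concrete, here is a minimal sketch of the kind of per-class box generation I have in mind. It assumes ground-truth label maps are available during fine-tuning; boxes_from_label_map is an illustrative helper, not something that exists in this repo.

import torch

def boxes_from_label_map(label, ignore_index=255):
    """For an (H, W) integer label map, return one bounding box per class present."""
    boxes = {}
    for cls in torch.unique(label).tolist():
        if cls in (0, ignore_index):  # skip background / ignore label
            continue
        ys, xs = torch.nonzero(label == cls, as_tuple=True)
        # (x1, y1, x2, y2) box tightly enclosing every pixel of this class
        boxes[cls] = torch.tensor(
            [xs.min().item(), ys.min().item(), xs.max().item(), ys.max().item()],
            dtype=torch.float32,
        )
    return boxes

The resulting boxes could then be stacked into an (N, 4) tensor and passed as the boxes argument instead of None, assuming the prompt adapter keeps the original SAM prompt encoder's (x1, y1, x2, y2) pixel-coordinate box interface.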