Closed hnyll closed 3 months ago
I am not a maintainer of this work. However, based on my understanding of SAM models, you typically need an image and a prompt (such as points, bounding boxes, etc.) which gets encoded by the prompt encoder. This is a core aspect of how they function. Specifically, in the case of MedSAM, I believe they require bounding box prompts.
Just wanted to share my observations. The project authors can probably confirm this.
Hi @hnyll ,
The direct answer is yes.
Thank @BalasubramanyamEvani for your great explanation.
i have a custom 2d dataset of images and gt have trained a model. do i absolutely need bbox for inference?