mdimtiazh opened 1 year ago
Thanks for your great tool, salt. We'd like to share our experience implementing MobileSAM in our image annotation tool, RectLabel.
MobileSAM was trained on the auto-generated annotations in the Segment Anything 1 Billion (SA-1B) dataset, so the mask area tends to be large when you click a single foreground point. However, if you click multiple foreground/background points, the accuracy is almost the same as ViT-Base SAM.
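As a minimal sketch of what "multiple foreground/background points" means at the prompt level, here is how the click prompts can be assembled for SAM-style ONNX decoders. The layout (label 1 for foreground, 0 for background, and a padding point labeled -1 when no box prompt is given) is assumed from the segment-anything ONNX export example; the helper name is ours:

```python
import numpy as np

def build_point_prompts(foreground, background):
    """Assemble click prompts in the layout the SAM ONNX decoder expects
    (assumed from the segment-anything ONNX export example):
    label 1 = foreground click, label 0 = background click, and a
    padding point [0, 0] labeled -1 appended when no box is supplied."""
    coords = np.array(foreground + background + [[0.0, 0.0]], dtype=np.float32)
    labels = np.array(
        [1] * len(foreground) + [0] * len(background) + [-1],
        dtype=np.float32,
    )
    # The decoder takes a batch dimension: coords (1, N, 2), labels (1, N).
    return coords[None, :, :], labels[None, :]

# Two foreground clicks plus one background click:
coords, labels = build_point_prompts(
    foreground=[[100.0, 200.0], [150.0, 220.0]],
    background=[[50.0, 50.0]],
)
```

Adding background clicks this way is what shrinks the over-large single-click mask back to the intended object.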
Because the MobileSAM model is smaller, inference is faster. We used the quantized ViT-Base SAM model for comparison.
- MobileSAM: 28.1 MB mobile_sam_preprocess.onnx, 16.5 MB mobile_sam.onnx
- ViT-Base SAM: 108.9 MB sam_vit_b_01ec64_preprocess.onnx, 8.8 MB sam_vit_b_01ec64.onnx
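For reference, combining the two ONNX files per model from the sizes listed above gives roughly 44.6 MB for MobileSAM versus 117.7 MB for the quantized ViT-Base SAM, i.e. about 2.6× smaller on disk:

```python
# Combined on-disk footprint from the sizes listed above
# (preprocess/encoder ONNX + decoder ONNX, in MB).
mobile_total = 28.1 + 16.5    # MobileSAM
vit_b_total = 108.9 + 8.8     # quantized ViT-Base SAM
print(f"MobileSAM: {mobile_total:.1f} MB, "
      f"ViT-B: {vit_b_total:.1f} MB, "
      f"ratio: {vit_b_total / mobile_total:.1f}x")
```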
When clicking a foreground point using MobileSAM:
When clicking a foreground point using ViT-Base SAM:
Please let us know your opinion.
Reference: https://github.com/ChaoningZhang/MobileSAM
Our project performs on par with the original SAM and keeps exactly the same pipeline as the original SAM, except for a change in the image encoder; therefore, it is easy to integrate into any project.
MobileSAM is around 60 times smaller and around 50 times faster than the original SAM, and it is around 7 times smaller and around 5 times faster than the concurrent FastSAM. The comparison of the whole pipeline is summarized as follows:
Best Wishes,
Qiao