gaomingqi / Track-Anything

Track-Anything is a flexible and interactive tool for video object tracking and segmentation, based on Segment Anything, XMem, and E2FGVI.
MIT License
6.41k stars, 472 forks

Integrate MobileSAM into the pipeline for lightweight and faster inference #97

Open qiaoyu1002 opened 1 year ago

qiaoyu1002 commented 1 year ago

Reference: https://github.com/ChaoningZhang/MobileSAM

Our project performs on par with the original SAM and keeps exactly the same pipeline, except for a change to the image encoder. It is therefore easy to integrate into any project.

MobileSAM is around 60 times smaller and around 50 times faster than the original SAM, and it is around 7 times smaller and around 5 times faster than the concurrent FastSAM. The comparison of the whole pipeline is summarized as follows:

image

image

moritzbrantner commented 1 year ago

Yes, this would be awesome!

qiaoyu1002 commented 1 year ago

Hello,

Thank you for your effort.

Besides replacing the SAM_checkpoint, you still need to add some code to the official SAM code.

Could you please refer to this pull request? https://github.com/ChaoningZhang/MobileSAM/pull/26/files

On Tue, Jul 4, 2023 at 2:53 AM K-Maehashi @.***> wrote:

Hi @qiaoyu1002, I tried to use MobileSAM by replacing the SAM_checkpoint in app.py (line 377), but I got the following error. Do you have any idea how to make it work?

Error:

size mismatch for image_encoder.neck.0.weight: copying a param with shape torch.Size([256, 320, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 1280, 1, 1]).
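For anyone hitting the same error: the mismatch suggests that the checkpoint and the model being built use different image encoders. The neck's first convolution takes 1280 input channels in the current model (the ViT-H width) but only 320 in the checkpoint. Below is a minimal sketch of a check that could be run before loading weights. The helper names `guess_model_type` and `shape_mismatches` are hypothetical and not part of SAM or Track-Anything. The `"vit_t"` label is an assumption about how the MobileSAM code names its encoder, and the 320 value is taken only from the error message above.

```python
# Input channels of image_encoder.neck.0.weight for each encoder variant.
# 768 / 1024 / 1280 are the ViT-B / ViT-L / ViT-H embedding widths used by SAM.
# 320 comes from the error message in this thread (MobileSAM checkpoint);
# the "vit_t" label is an assumption about MobileSAM's naming.
NECK_IN_CHANNELS_TO_MODEL_TYPE = {
    768: "vit_b",
    1024: "vit_l",
    1280: "vit_h",
    320: "vit_t",
}

NECK_KEY = "image_encoder.neck.0.weight"


def guess_model_type(shapes):
    """Guess the encoder variant from a {parameter name: shape} mapping.

    Returns None when the neck weight is missing or has an unknown width.
    """
    shape = shapes.get(NECK_KEY)
    if shape is None or len(shape) != 4:
        return None
    return NECK_IN_CHANNELS_TO_MODEL_TYPE.get(shape[1])


def shape_mismatches(checkpoint_shapes, model_shapes):
    """Return {name: (checkpoint shape, model shape)} for shared keys that differ."""
    return {
        name: (tuple(shape), tuple(model_shapes[name]))
        for name, shape in checkpoint_shapes.items()
        if name in model_shapes and tuple(shape) != tuple(model_shapes[name])
    }


# Shapes copied from the error message above.
checkpoint_shapes = {NECK_KEY: (256, 320, 1, 1)}
model_shapes = {NECK_KEY: (256, 1280, 1, 1)}

print(guess_model_type(checkpoint_shapes))
print(guess_model_type(model_shapes))
print(shape_mismatches(checkpoint_shapes, model_shapes))
```

With PyTorch, the shape mappings could be built from a loaded state dict, for example `{k: tuple(v.shape) for k, v in state_dict.items()}`. If the guessed type of the checkpoint differs from the type of the model being constructed, swapping the checkpoint path alone will not be enough, which matches the reply above.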
