z-x-yang / Segment-and-Track-Anything

An open-source project for tracking and segmenting any object in videos, either automatically or interactively. Its primary algorithms are the Segment Anything Model (SAM) for key-frame segmentation and Associating Objects with Transformers (AOT) for efficient tracking and propagation.
GNU Affero General Public License v3.0
2.75k stars, 332 forks

Choice of SAM checkpoint #126

Closed (smandava98 closed 8 months ago)

smandava98 commented 8 months ago

Hi.

Why was ViT-B chosen instead of ViT-L or ViT-H for SAM?

Also, while testing this model, I found that the ViT-H checkpoint segments worse here than it normally does: it colors background regions over the foreground objects. Is there a place in the codebase where this can be fixed?
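
For reference, swapping SAM backbones generally means changing two things together: the model-type key and the checkpoint file, since a mismatch between them fails at load time. Below is a minimal sketch of keeping the pair consistent. The checkpoint file names are the ones published with SAM; the helper `build_sam_args`, the dictionary keys `model_type` and `sam_checkpoint`, and the `ckpt` directory are assumptions made for illustration and may not match how this repository organizes its settings.

```python
from pathlib import Path

# Published SAM checkpoint file names, keyed by the model-type strings
# that segment_anything's sam_model_registry accepts.
SAM_CHECKPOINTS = {
    "vit_b": "sam_vit_b_01ec64.pth",
    "vit_l": "sam_vit_l_0b3195.pth",
    "vit_h": "sam_vit_h_4b8939.pth",
}


def build_sam_args(model_type: str, ckpt_dir: str = "ckpt") -> dict:
    """Hypothetical helper: return a matching model type and checkpoint path.

    The key names and the default directory are illustrative assumptions,
    not taken from this repository.
    """
    if model_type not in SAM_CHECKPOINTS:
        valid = ", ".join(sorted(SAM_CHECKPOINTS))
        raise ValueError(f"unknown model_type {model_type!r}; expected one of: {valid}")
    return {
        "model_type": model_type,
        "sam_checkpoint": str(Path(ckpt_dir) / SAM_CHECKPOINTS[model_type]),
    }


if __name__ == "__main__":
    # Show the pair that would be used for the ViT-H backbone.
    print(build_sam_args("vit_h"))
```

With the `segment_anything` package installed, the resulting pair could then be passed as `sam_model_registry[args["model_type"]](checkpoint=args["sam_checkpoint"])`. This only keeps the backbone and weights in sync; it does not by itself address the background-over-foreground coloring described above.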