z-x-yang / Segment-and-Track-Anything

An open-source project for tracking and segmenting any object in videos, either automatically or interactively. The primary algorithms are the Segment Anything Model (SAM) for key-frame segmentation and Associating Objects with Transformers (AOT) for efficient tracking and propagation.
GNU Affero General Public License v3.0
2.75k stars 332 forks

Add audio-grounding feature using the AST model (fine-tuned on AudioSet) #143

Closed by little612pea 4 months ago

little612pea commented 4 months ago

hjy added an audio-grounding feature to SAM-Track using the AST model (fine-tuned on AudioSet).
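The comment does not show how the feature is wired up, so the following is only a rough sketch of one way audio grounding could work: turn per-label scores from an audio tagger into a short text prompt that a text-grounded tracker could consume. The function name `labels_to_prompt`, the `top_k` and `min_prob` parameters, and the ". " separator are all assumptions for illustration and are not taken from this PR. Because AudioSet is a multi-label dataset, the sketch scores each label independently with a sigmoid rather than a softmax.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def labels_to_prompt(logits, labels, top_k=2, min_prob=0.5):
    """Hypothetical helper: build a text prompt from audio-tagging logits.

    logits: one raw score per label (e.g. from an audio classifier).
    labels: label names, same order as logits.
    Keeps at most top_k labels whose sigmoid probability is >= min_prob
    and joins them with ". " (an assumed prompt format).
    """
    if len(logits) != len(labels):
        raise ValueError("logits and labels must have the same length")
    scored = [(name, sigmoid(score)) for name, score in zip(labels, logits)]
    # Highest-probability labels first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    kept = [name for name, prob in scored[:top_k] if prob >= min_prob]
    return ". ".join(kept)


if __name__ == "__main__":
    # Toy scores, not real model output.
    print(labels_to_prompt([-2.0, 3.0, 0.5], ["Speech", "Dog", "Music"]))
```

An empty string is returned when no label clears the threshold, which lets the caller skip grounding for silent or ambiguous clips instead of tracking a low-confidence guess.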