facebookresearch / segment-anything-2

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Apache License 2.0

[Project] Grounded SAM 2 Release #130

Open rentainhe opened 1 month ago

rentainhe commented 1 month ago

We combined Grounding DINO, Grounding DINO 1.5, and SAM 2 to track any object in an input video, and we've open-sourced our code here: Grounded SAM 2

In this repo, we support:

We will add more demos in future releases to make the code easier to use.

A simple tracking video demo is as follows:

https://github.com/user-attachments/assets/8ebfa5de-3eac-43c5-b8e2-49160c9df786

ronghanghu commented 1 month ago

Hi @rentainhe, thanks for the great work!

Regarding the box prompts:

"We've noticed that the SAM 2 video predictor does not support box prompts yet, so we've implemented a simple method that uniformly samples positive point prompts, based on the SAM 2 image predictor, to support box prompts in the video tracking demo. Refer to our code for more details."
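The point-sampling workaround described above can be sketched roughly as follows. This is a minimal illustration, not the Grounded SAM 2 code: it assumes a binary mask has already been obtained for the box (for example from the SAM 2 image predictor), and the helper name `sample_positive_points` and its parameters are hypothetical. It draws pixel coordinates uniformly from the mask foreground and labels each one as a positive prompt (label 1).

```python
import numpy as np


def sample_positive_points(mask, num_points=10, seed=0):
    """Hypothetical helper: uniformly sample foreground pixels of a binary mask.

    Returns (points, labels), where points has shape (k, 2) in (x, y) order
    and labels is all ones (1 = positive point prompt).
    """
    # Row (y) and column (x) indices of every foreground pixel.
    ys, xs = np.nonzero(mask)
    if len(xs) == 0:
        # Empty mask: nothing to sample.
        return np.empty((0, 2), dtype=np.float32), np.empty((0,), dtype=np.int32)

    rng = np.random.default_rng(seed)
    k = min(num_points, len(xs))
    # Sample without replacement so each foreground pixel is equally likely.
    idx = rng.choice(len(xs), size=k, replace=False)

    points = np.stack([xs[idx], ys[idx]], axis=1).astype(np.float32)
    labels = np.ones(k, dtype=np.int32)
    return points, labels


# Toy mask standing in for the image predictor's output for one box.
mask = np.zeros((8, 8), dtype=bool)
mask[2:5, 3:7] = True
points, labels = sample_positive_points(mask, num_points=5)
```

The resulting `points` and `labels` arrays could then be passed as point prompts for the first video frame. Sampling from the mask rather than from the whole box keeps the points on the object instead of on background pixels inside the box.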

We just added a box prompt example to the video predictor notebook in https://github.com/facebookresearch/segment-anything-2/pull/174. Maybe you could use the box prompt directly in this case?

rentainhe commented 1 month ago

Hi ronghang! We've already updated SAM 2 to the latest version, and we now support box/point/mask prompts in the video object tracking demo!

ronghanghu commented 1 month ago

@rentainhe Great, thanks for the quick update!