facebookresearch / sam2

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Apache License 2.0

Recommended way to add new points to track later in the video? #224

Open Caspeerrr opened 3 months ago

Caspeerrr commented 3 months ago

In my use case I don't know all the objects I want to track in the first frame; they may appear at any frame in the video. So I need to be able to flexibly add new points/bboxes to track. As far as I can see there are currently two obvious approaches:

Does anyone have any additional insights?

Thanks!

ronghanghu commented 3 months ago

Hi @Caspeerrr, I would recommend using the 2nd approach of "Create a separate inference state for each object I want to track" as you mentioned above.

Currently the codebase doesn't support adding new objects after tracking has started, primarily because it performs inference by batching multiple objects together. New objects added later don't have memory or other previous states, so they cannot be directly batched with the existing ones. Tracking them with separate inference states could be a workaround for this issue.
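A minimal sketch of that workaround, for illustration only: one predictor is shared, and each object gets its own inference state, created at the frame where it first appears. `PerObjectTracker` and `FakePredictor` are hypothetical names that are not part of the repository. The method names `init_state`, `add_new_points_or_box`, and `propagate_in_video` (with `start_frame_idx`) are assumed to match the video predictor as used in the example notebooks, so please check them against your installed version. The stub only stands in for the real predictor so that the bookkeeping can be run without a checkpoint.

```python
from typing import Any, Dict


class FakePredictor:
    """Stand-in for the real video predictor (hypothetical, for illustration).

    It mimics the call shapes assumed above and returns strings as "masks".
    """

    def __init__(self, num_frames: int) -> None:
        self.num_frames = num_frames

    def init_state(self, video_path: str) -> Dict[str, Any]:
        return {"video_path": video_path, "obj_ids": []}

    def add_new_points_or_box(self, inference_state, frame_idx, obj_id,
                              points=None, labels=None, box=None):
        inference_state["obj_ids"].append(obj_id)
        return frame_idx, inference_state["obj_ids"], [f"mask-{obj_id}-{frame_idx}"]

    def propagate_in_video(self, inference_state, start_frame_idx=None):
        start = start_frame_idx or 0
        for f in range(start, self.num_frames):
            obj_ids = inference_state["obj_ids"]
            yield f, obj_ids, [f"mask-{oid}-{f}" for oid in obj_ids]


class PerObjectTracker:
    """Keeps one inference state per tracked object (hypothetical helper)."""

    def __init__(self, predictor: Any, video_path: str) -> None:
        self.predictor = predictor
        self.video_path = video_path
        self.states: Dict[int, Any] = {}        # obj_id -> its own inference state
        self.start_frames: Dict[int, int] = {}  # obj_id -> frame where it was prompted

    def add_object(self, obj_id: int, frame_idx: int,
                   points=None, labels=None, box=None) -> None:
        if obj_id in self.states:
            raise ValueError(f"object {obj_id} is already tracked")
        # A fresh state per object, so objects never need to be batched together.
        state = self.predictor.init_state(video_path=self.video_path)
        self.predictor.add_new_points_or_box(
            inference_state=state, frame_idx=frame_idx, obj_id=obj_id,
            points=points, labels=labels, box=box,
        )
        self.states[obj_id] = state
        self.start_frames[obj_id] = frame_idx

    def propagate(self) -> Dict[int, Dict[int, Any]]:
        """Track every object forward and merge results as {frame: {obj_id: mask}}."""
        merged: Dict[int, Dict[int, Any]] = {}
        for obj_id, state in self.states.items():
            frames = self.predictor.propagate_in_video(
                state, start_frame_idx=self.start_frames[obj_id]
            )
            for frame_idx, _obj_ids, masks in frames:
                # Each state holds exactly one object, so its mask is at index 0.
                merged.setdefault(frame_idx, {})[obj_id] = masks[0]
        return merged


tracker = PerObjectTracker(FakePredictor(num_frames=4), "video_dir")
tracker.add_object(obj_id=1, frame_idx=0, box=[10, 10, 50, 50])
tracker.add_object(obj_id=2, frame_idx=2, points=[[5, 5]], labels=[1])  # appears later
merged = tracker.propagate()
```

With the real predictor, each `init_state` call would presumably load the video frames again, so memory and compute are likely to grow with the number of objects. That is the cost of avoiding the batching limitation described above.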

melodyhappy commented 2 months ago

Can we avoid initializing a new predictor and instead add new objects directly during tracking, for example by padding the information of newly appeared objects along the temporal dimension so that it aligns with the existing ones?