gaomingqi / Track-Anything

Track-Anything is a flexible and interactive tool for video object tracking and segmentation, based on Segment Anything, XMem, and E2FGVI.
MIT License

X-mem annotation question. #54

Open mikearney opened 1 year ago

mikearney commented 1 year ago

In the Steph Curry example, how many manual annotations had to be made before processing? I am trying to process a similar video, but I cannot keep a person consistent across different camera cuts.

memoryunreal commented 1 year ago

For the different camera cuts, you could try the "Image Selection" and "Track end frames" sliders. For cut #1, assuming it covers frames 1-100, add a mask in the first frame, set "Track end frames" to 100, then click the "Tracking" button. For cut #2, move the "Image Selection" slider to frame #101 so that it becomes the initial tracking frame, and set "Track end frames" to the last frame of cut #2. On frame #101, add a new mask if you are tracking different targets in different camera cuts. If you are tracking the same target, as in the Steph Curry example, remove the existing mask first and then add a new one, so that the mask_id stays unchanged. Click the "Tracking" button and the target in cut #2 will be tracked. The masks from the previous 100 frames are kept. Repeat the same operation for each subsequent camera cut.
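The per-cut slider settings described above can be worked out ahead of time. The following is only a minimal sketch, not part of Track-Anything: `plan_tracking_passes` is a hypothetical helper, and it assumes you already know the first frame of each camera cut and that frames are numbered from 1, as in the 1-100 / #101 example. It returns, for each cut, the frame to choose with "Image Selection" and the value to set for "Track end frames".

```python
def plan_tracking_passes(cut_starts, total_frames):
    """Return one (start_frame, end_frame) pair per camera cut.

    Hypothetical helper, not a Track-Anything API.
    cut_starts: first frame number of every cut (1-indexed, assumed known).
    total_frames: number of frames in the whole video.
    """
    starts = sorted(set(cut_starts))
    if not starts or starts[0] != 1:
        raise ValueError("the first cut must start at frame 1")
    if starts[-1] > total_frames:
        raise ValueError("a cut starts after the last frame of the video")

    passes = []
    for i, start in enumerate(starts):
        # A cut ends one frame before the next cut begins;
        # the last cut runs to the end of the video.
        end = starts[i + 1] - 1 if i + 1 < len(starts) else total_frames
        passes.append((start, end))
    return passes


# Example: a 400-frame video with cuts starting at frames 1, 101 and 251.
for start, end in plan_tracking_passes([1, 101, 251], 400):
    print(f'"Image Selection" = {start}, "Track end frames" = {end}')
```

For each printed pair you would still add (or remove and re-add) the mask on the start frame by hand in the interface before clicking "Tracking".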

mikearney commented 1 year ago

Understood. So XMem does not track the same subject across camera cuts, and a new selection must be set manually for each cut.