hkchengrex / Tracking-Anything-with-DEVA

[ICCV 2023] Tracking Anything with Decoupled Video Segmentation
https://hkchengrex.com/Tracking-Anything-with-DEVA/

Just want to confirm! #82

Closed JawadTawhidi closed 5 months ago

JawadTawhidi commented 5 months ago

Hi, I am citing DEVA in my paper. Could you please confirm or correct my explanation of the approach DEVA uses for DAVIS-2017?

I want to explain it like this:

For DAVIS-2017 (multi-object), EntitySeg is used as the image segmentation model and the semi-online protocol is followed. The semi-online protocol combines in-clip consensus with temporal propagation: in-clip consensus is performed every 5 frames with a clip size of n=3. Specifically, the process starts by performing the initial in-clip consensus on 3 frames at the beginning of the video. The segmentation mask with the highest confidence, generated by the image segmentation model, is chosen. This mask is then propagated through the first 5 frames. Next, another 3 frames are selected, and in-clip consensus is performed again. The result of this in-clip consensus is merged with the temporal propagation from the previous frames, and the merged result is propagated through the next 5 frames. These steps are repeated throughout the video.
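
To make the frame schedule in my description concrete, here is a minimal sketch of how I understand it. This is only my reading, not DEVA's actual code: the function name `semi_online_schedule`, its parameters, and the step labels are hypothetical, and it only lists which frames each step would touch, without running any model.

```python
from typing import List, Tuple

Step = Tuple[str, List[int]]


def semi_online_schedule(num_frames: int, interval: int = 5, clip_size: int = 3) -> List[Step]:
    """List the steps of the schedule described above (hypothetical helper).

    interval=5 and clip_size=3 are the values quoted in the description;
    everything else is an assumption made for illustration.
    """
    steps: List[Step] = []
    for start in range(0, num_frames, interval):
        # In-clip consensus over up to `clip_size` frames beginning at `start`.
        clip = list(range(start, min(start + clip_size, num_frames)))
        steps.append(("consensus", clip))
        # From the second round on, the consensus is merged with the
        # result propagated from the previous frames.
        if start > 0:
            steps.append(("merge", [start]))
        # The (merged) result is propagated through the frames of this
        # interval, up to the next consensus round.
        segment = list(range(start, min(start + interval, num_frames)))
        steps.append(("propagate", segment))
    return steps


if __name__ == "__main__":
    for name, frames in semi_online_schedule(12):
        print(name, frames)
```

If this ordering of consensus, merge, and propagation does not match what DEVA does, that is exactly the part I would like corrected.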

hkchengrex commented 5 months ago

Duplicate of #81