hkchengrex / XMem

[ECCV 2022] XMem: Long-Term Video Object Segmentation with an Atkinson-Shiffrin Memory Model
https://hkchengrex.com/XMem/
MIT License

Inference strategy #118

Closed by Amshaker 1 year ago

Amshaker commented 1 year ago

Hello,

Thank you for your awesome work. I have a question regarding the inference strategy.

You mentioned in the paper that the training is based on a sequence of 8 frames.

What is the procedure during inference? Do you segment every 8 consecutive frames as a group? Please clarify the inference strategy.

hkchengrex commented 1 year ago

The model is online and always processes videos frame by frame. The 8-frame limit during training exists only because of resource and implementation considerations. During inference, we process the entire video frame by frame (there is no reason not to).
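
To illustrate the point above, here is a minimal sketch of an online, frame-by-frame inference loop. It is not XMem's actual code or API: `ToyOnlineSegmenter`, `step`, and `segment_video` are hypothetical names, frames and masks are plain strings, and the memory is an unbounded list kept only for simplicity (the real model manages its memory in a more involved way). It also assumes the usual semi-supervised setup in which a mask is given for the first frame.

```python
from typing import List, Optional, Sequence


class ToyOnlineSegmenter:
    """Hypothetical stand-in for an online segmentation model (not XMem's API)."""

    def __init__(self) -> None:
        # Frames seen so far. Assumption: an unbounded list, for illustration only.
        self.memory: List[str] = []

    def step(self, frame: str, mask: Optional[str] = None) -> str:
        """Process one frame; use the given mask if provided, else predict one."""
        if mask is None:
            if not self.memory:
                raise ValueError("the first frame needs an initial mask")
            # Placeholder for "read from memory, then decode a mask".
            mask = f"mask({frame}|mem={len(self.memory)})"
        # Remember the frame so that later frames can refer back to it.
        self.memory.append(frame)
        return mask


def segment_video(frames: Sequence[str], first_mask: str) -> List[str]:
    """Run over the whole video one frame at a time, with no 8-frame windows."""
    model = ToyOnlineSegmenter()
    outputs = []
    for i, frame in enumerate(frames):
        outputs.append(model.step(frame, first_mask if i == 0 else None))
    return outputs


frames = [f"f{i}" for i in range(20)]  # a 20-frame video, longer than 8 frames
masks = segment_video(frames, first_mask="gt_mask_f0")
print(len(masks))   # 20
print(masks[-1])    # mask(f19|mem=19)
```

The loop never groups frames into clips of 8. The clip length matters only when building training batches, while at test time each new frame is segmented using whatever has accumulated in memory so far.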

Amshaker commented 1 year ago

Thanks for your reply.