Question on Cutie's Arxiv paper equation S2

hkchengrex / Cutie

[CVPR 2024 Highlight] Putting the Object Back Into Video Object Segmentation

https://hkchengrex.com/Cutie/

MIT License

691 stars, 69 forks

Closed: zzzc18 closed this issue 3 months ago

zzzc18 commented 3 months ago

I'm confused about equation S2: it doesn't seem to be equivalent to S1. I also noticed a comment in object_transformer.py that says:

# during inference, T=1 as we already did streaming average in memory_manager

Would it be better to remove the T in S2, or to specify that it refers to the current timestep? (If I'm understanding correctly.)

hkchengrex commented 3 months ago

They are both computing averages. T=1 in (S2).
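For readers following along: equations S1 and S2 are not reproduced in this thread, so the snippet below is only a rough illustration of the general point, not Cutie's actual code. It assumes made-up per-frame values and two hypothetical helpers, `batch_mean` and `streaming_mean`, and checks that a running average updated one step at a time ends up equal to the average taken over all T values at once.

```python
def batch_mean(values):
    """Average over all T values at once."""
    return sum(values) / len(values)


def streaming_mean(values):
    """Running average, updated one value at a time."""
    mean = 0.0
    for t, v in enumerate(values, start=1):
        # Incremental update: move the mean toward v by 1/t of the gap.
        mean += (v - mean) / t
    return mean


# Hypothetical per-frame values, for illustration only.
frames = [0.2, 0.5, 0.9, 0.4]

# Both routes give the same average, up to floating-point error.
assert abs(batch_mean(frames) - streaming_mean(frames)) < 1e-12
```

If a running average like this is already maintained elsewhere, as the code comment suggests for memory_manager, then the later step only sees one already-averaged value, which is consistent with T=1.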

zzzc18 commented 3 months ago

Thanks for explaining!