How to implement the temporal-first scan and spatial-temporal scan v1 & v2?

Hi, thanks for your brilliant work! I saw that Fig. 4 of the paper shows four scan methods, but as far as I can tell only the spatial-first bidirectional scan is implemented in mamba_simple.py. I am interested in the other three. Could you share some guidance on how to implement them, namely the temporal-first scan and the spatial-temporal scan v1 and v2? Any pointers would be very valuable. Thank you for considering my request.
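To make the question concrete, here is a minimal sketch of how I imagine the temporal-first ordering could be obtained by permuting tokens before the existing bidirectional scan. This is my own guess, not the authors' implementation. It assumes the sequence is flattened frame by frame (spatial-first), that there is no class token in the sequence, and the helper names `temporal_first_permutation`, `inverse_permutation`, and `reorder` are hypothetical. How v1 and v2 combine the two orderings is not covered here.

```python
def temporal_first_permutation(num_frames, tokens_per_frame):
    """Indices that reorder a spatial-first sequence into temporal-first order.

    Assumption: in the spatial-first layout, the token at frame t and
    spatial location s sits at position t * tokens_per_frame + s.
    Temporal-first order visits every frame for location 0, then every
    frame for location 1, and so on.
    """
    return [
        t * tokens_per_frame + s
        for s in range(tokens_per_frame)
        for t in range(num_frames)
    ]


def inverse_permutation(perm):
    """Indices that undo `perm`, to restore the original order after scanning."""
    inv = [0] * len(perm)
    for new_pos, old_pos in enumerate(perm):
        inv[old_pos] = new_pos
    return inv


def reorder(seq, perm):
    """Apply a permutation to a plain Python sequence."""
    return [seq[i] for i in perm]


# Toy example: 2 frames, 3 spatial tokens per frame, spatial-first layout.
tokens = ["t0s0", "t0s1", "t0s2", "t1s0", "t1s1", "t1s2"]
perm = temporal_first_permutation(num_frames=2, tokens_per_frame=3)
temporal_first = reorder(tokens, perm)
restored = reorder(temporal_first, inverse_permutation(perm))
```

With a PyTorch tensor `x` of shape `(batch, num_frames * tokens_per_frame, channels)`, the same index lists could presumably be applied as `x[:, perm]` before the scan and `y[:, inverse_permutation(perm)]` afterwards. Is that roughly the right idea?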

OpenGVLab / VideoMamba, issue #80