Open Xia-zx opened 3 months ago
SS2Dv0 is the vanilla VMamba from the arXiv paper: it simply applies cross scan and cross merge to Mamba and adds a Norm and the hierarchical architecture.
SS2Dv2 is VMamba, which adds a number of tricks to accelerate vanilla VMamba while keeping its advantages.
There is no SS2Dv1 because I deleted it.
SS2Dm0 adds support for Mamba2, but no model has been trained with this code yet.
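For readers who have not seen cross scan and cross merge before, here is a minimal pure-Python sketch of the idea, not the code in vmamba.py. It assumes the four scan directions (row-major, column-major, and their reverses) and leaves out the per-direction selective scan that Mamba would run between the two steps. The function names `cross_scan` and `cross_merge` are used here only for illustration and work on a plain 2D list rather than a tensor.

```python
def cross_scan(grid):
    """Flatten an H x W grid into four 1D sequences (assumed scan orders)."""
    h, w = len(grid), len(grid[0])
    row_major = [grid[i][j] for i in range(h) for j in range(w)]
    col_major = [grid[i][j] for j in range(w) for i in range(h)]
    # The last two sequences are the first two read backwards.
    return [row_major, col_major, row_major[::-1], col_major[::-1]]


def cross_merge(seqs, h, w):
    """Undo each scan order and sum the four sequences back into an H x W grid."""
    row_major, col_major, row_rev, col_rev = seqs
    row_back = row_rev[::-1]  # restore row-major order
    col_back = col_rev[::-1]  # restore column-major order
    merged = [[0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            merged[i][j] = (
                row_major[i * w + j]
                + row_back[i * w + j]
                + col_major[j * h + i]
                + col_back[j * h + i]
            )
    return merged


grid = [[1, 2, 3], [4, 5, 6]]
sequences = cross_scan(grid)
# With no processing between scan and merge, each cell is simply summed four times.
merged = cross_merge(sequences, 2, 3)
```

In a real model, each of the four sequences would pass through a selective scan before being merged, so every position can gather context from all four directions.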
What about SS2Dv3?
SS2Dv3 (forward_type=xv1a...) is a simpler version of SS2Dv2 and is faster, but training seems unstable when scaled to the base model.
Thank you.
May I ask where to switch the version of SS2D?
By passing a different forward_type, you can select different settings for the SS2D forward pass, including which SS2D class is used (e.g., SS2Dv0, SS2Dv2, SS2Dv3).
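As a rough illustration of how a forward_type string could select a variant, here is a hypothetical sketch. Only the pairing of an `xv...` value with SS2Dv3 comes from this thread. The helper name `pick_ss2d_variant`, the `v0` and `m` prefixes, and the fallback to SS2Dv2 are assumptions made for the example, so check vmamba.py for the actual rules.

```python
def pick_ss2d_variant(forward_type: str) -> str:
    """Hypothetical helper: map a forward_type string to an SS2D class name."""
    if forward_type.startswith("xv"):
        # Mentioned in this thread: forward_type=xv1a... corresponds to SS2Dv3.
        return "SS2Dv3"
    if forward_type.startswith("v0"):
        # Assumed prefix for the vanilla variant.
        return "SS2Dv0"
    if forward_type.startswith("m"):
        # Assumed prefix for the Mamba2 variant.
        return "SS2Dm0"
    # Assumed fallback for everything else.
    return "SS2Dv2"


variant = pick_ss2d_variant("xv1a")
```

In practice the forward_type value would usually be set in the model config or constructor arguments rather than chosen through a separate helper like this.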
What are the differences between SS2Dv0, SS2Dv1, SS2Dv2, and SS2Dm0 in vmamba.py?