MzeroMiko / VMamba

VMamba: Visual State Space Models; code is based on Mamba

Three questions about VMamba #122

Open · MDD-0928 opened 3 months ago

MDD-0928 commented 3 months ago

Dear authors:

  Thanks for your continued work! I would like to ask you three questions.

  First, I am glad to see that you have updated VMamba (VM) for higher **throughput** using "v4" & "LN2D". I tested the latest version of VM against ViT-B/16 (ViT) on **throughput** and found that VM-Tiny is much faster, reaching about 1.5~1.7x the throughput of ViT. Unfortunately, VM-Small is slightly slower than ViT, and VM-Base is slower than ViT by nearly 35%~40%. Is there any further way to speed up VMamba and improve its throughput, making it a truly high-accuracy & high-throughput foundation model for CV tasks? That would be awesome 👍
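  For reference, here is the kind of minimal throughput harness I used (the batch size, warmup, and iteration counts are illustrative; a VMamba model built from this repo's configs would be timed the same way):

```python
import time
import torch
import timm

@torch.no_grad()
def throughput(model, batch_size=128, img_size=224, warmup=10, iters=50):
    """Measure images/second on random data (CUDA assumed)."""
    model = model.cuda().eval()
    x = torch.randn(batch_size, 3, img_size, img_size, device="cuda")
    for _ in range(warmup):      # warm up kernels / autotuning
        model(x)
    torch.cuda.synchronize()
    t0 = time.time()
    for _ in range(iters):
        model(x)
    torch.cuda.synchronize()     # wait for all queued kernels to finish
    return iters * batch_size / (time.time() - t0)

# ViT-B/16 baseline from timm; pass a VMamba instance through the same function.
vit = timm.create_model("vit_base_patch16_224")
print(f"ViT-B/16: {throughput(vit):.1f} img/s")
```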


  Second, would you consider pretraining on ImageNet-21k, testing the model's accuracy on ImageNet-1k, and releasing the checkpoint? :)

  Third, since Mamba is better suited to long sequences, I wonder whether increasing EMBED_DIM would boost performance while incurring only a minor impact on throughput, e.g. increasing VM-Tiny's EMBED_DIM from 96 to 128? A rough sketch of the cost is below.
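  As a back-of-the-envelope estimate, most width-dependent parameters sit in linear layers, so parameter count and per-token compute grow roughly quadratically with EMBED_DIM (this ignores depthwise convs, norms, and the classifier head):

```python
# Rough estimate only: linear-layer parameters and per-token FLOPs scale
# with the square of the width, which also hints at the throughput cost.
base_dim = 96
for d in (96, 128):
    print(f"EMBED_DIM={d}: ~{(d / base_dim) ** 2:.2f}x VM-Tiny's linear-layer params")
# EMBED_DIM=96:  ~1.00x
# EMBED_DIM=128: ~1.78x
```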
MzeroMiko commented 3 months ago
  1. The hierarchical structure is certainly slower than ViT, but we do achieve throughput comparable to Swin under torch 2. We are still working on ways to make the model faster.

  2. We have no plans to pretrain on ImageNet-21k at present due to limited resources; we may do it in the future.

  3. To be honest, it is hard to say. In this implementation of the selective scan, raising the dimension is nearly equivalent to raising the batch size, and it is unrelated to the sequence length. Changing the embed dim may still improve performance, but it seems tricky. See the sketch below for why the dimension behaves like the batch size here.
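  For intuition, here is a simplified reference form of the selective scan (shapes follow the Mamba reference code; the real CUDA kernel fuses all of this, and the delta-softplus and D skip terms are omitted). The only sequential loop is over seqlen, while batch and dim are flat parallel dimensions, which is why raising dim costs about the same as raising the batch size:

```python
import torch

def selective_scan_ref(u, delta, A, B, C):
    # u, delta: (batch, dim, seqlen)   A: (dim, state)   B, C: (batch, state, seqlen)
    batch, dim, seqlen = u.shape
    state = A.shape[1]
    x = torch.zeros(batch, dim, state, device=u.device, dtype=u.dtype)
    ys = []
    for t in range(seqlen):                        # the only sequential dependency
        dA = torch.exp(delta[:, :, t, None] * A)   # discretized A: (batch, dim, state)
        dBu = delta[:, :, t, None] * B[:, None, :, t] * u[:, :, t, None]
        x = dA * x + dBu                           # recurrent state update
        ys.append((x * C[:, None, :, t]).sum(-1))  # readout: (batch, dim)
    return torch.stack(ys, dim=-1)                 # (batch, dim, seqlen)
```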