Closed EricPaul03 closed 7 months ago
As shown in Fig. 5 of our paper, when the resolution is set to 128×128 (equating to a sequence length of 16384), the run-time performance of our Mamba block is comparable to that of self-attention. Regarding your question, you could switch to the default Mamba block by changing "v3" to "v1" to check whether our modifications are the cause of this issue.
Thank you so much, I will try it.
Hello, I've tried your Mamba module in my project, but I found it is considerably slower than self-attention. Could this be due to my sequence length (about 400)?
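A short length like 400 could well be the reason. Self-attention costs grow roughly quadratically with sequence length while a Mamba-style scan grows linearly, so the asymptotic advantage only kicks in at long sequences; at L≈400 the quadratic term is still small and attention's heavily optimized kernels plus the scan's constant overheads can dominate. A back-of-envelope sketch (not the repo's code; `d_model=512` and `d_state=16` are assumed illustrative values, and constant factors are ignored):

```python
# Rough FLOP comparison between self-attention and a linear-time scan.
# These formulas are simplified estimates, not measurements of the actual kernels.

def attention_flops(seq_len: int, d_model: int) -> int:
    """Attention scores plus weighted sum: scales as L^2 * d."""
    return 2 * seq_len * seq_len * d_model

def scan_flops(seq_len: int, d_model: int, d_state: int = 16) -> int:
    """Selective-scan-style recurrence: scales as L * d * d_state, linear in L."""
    return 2 * seq_len * d_model * d_state

for L in (400, 16384):
    ratio = attention_flops(L, 512) / scan_flops(L, 512)
    print(f"L={L:>6}: attention/scan FLOP ratio ~ {ratio:.0f}x")
```

With these assumptions the ratio is only ~25x at L=400 but ~1024x at L=16384, which is why wall-clock parity in Fig. 5 at 16384 does not imply a speedup at much shorter sequences, where fixed overheads matter more than asymptotics.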