hkproj / mamba-notes

Notes on the Mamba and the S4 model (Mamba: Linear-Time Sequence Modeling with Selective State Spaces)
https://youtu.be/8Q_tqwpTpVU

Confusion in discretization process #1

Closed TokisakiKurumi2001 closed 4 months ago

TokisakiKurumi2001 commented 9 months ago

Hi there. These notes really helped me understand Mamba. Thank you for the amazing job you've done.

However, when looking at the notes, I ran into a problem trying to follow the discretized formula.

From your notes, using the Euler method, we arrive at the final formula below.

$h_t = \mathbf{\bar{A}}h_{t-1} + \mathbf{\bar{B}}x_{t-1}$.

From that equation, I interpret that the new hidden state depends on the previous hidden state and the previous input x.
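For reference, here is a sketch of how a forward Euler step leads to that form. It assumes the continuous system $h'(t) = \mathbf{A}h(t) + \mathbf{B}x(t)$ with step size $\Delta$, and it identifies $\mathbf{\bar{A}} = \mathbf{I} + \Delta\mathbf{A}$ and $\mathbf{\bar{B}} = \Delta\mathbf{B}$; that identification is my own reading and may not match the notes symbol for symbol.

$$
\begin{aligned}
h(t + \Delta) &\approx h(t) + \Delta\, h'(t) \\
&= h(t) + \Delta\left(\mathbf{A}h(t) + \mathbf{B}x(t)\right) \\
&= (\mathbf{I} + \Delta\mathbf{A})\,h(t) + \Delta\mathbf{B}\,x(t)
\end{aligned}
$$

Because the derivative is evaluated at the start of the step, the input on the right-hand side carries the same time index as the old state, which is where $x_{t-1}$ comes from after relabeling $h(t+\Delta)$ as $h_t$ and $h(t)$ as $h_{t-1}$.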

However, in the Mamba paper, I notice that the equation is slightly different.

$h_t = \mathbf{\bar{A}}h_{t-1} + \mathbf{\bar{B}}x_t$.

The only term that changes is $x_{t-1}$, which becomes $x_t$.

Does this imply that using a different discretization method (ZOH vs. Euler) results in a different final equation? In addition, I found this Wikipedia link, which states an equation that differs from the one in the paper.

I'm a newbie here, so if I've made any mistake, please let me know. Thank you.

TokisakiKurumi2001 commented 4 months ago

From this blog, it seems that for small $\Delta$, $x_t \approx x_{t+1}$. I think the authors make this substitution here.
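
A quick numerical check of this, as a minimal sketch rather than anything taken from the notes or the paper: it assumes a scalar system $h'(t) = a\,h(t) + b\,x(t)$ with $a = -1$, $b = 1$ and input $x(t) = \sin t$, and runs the same recurrence once with $x_{t-1}$ and once with $x_t$. The function names (`discretize_zoh`, `discretize_euler`, `run`, `max_gap`) are made up for illustration.

```python
import math


def discretize_zoh(a, b, delta):
    """Scalar zero-order hold: a_bar = exp(delta*a), b_bar = (a_bar - 1)/a * b."""
    a_bar = math.exp(delta * a)
    b_bar = (a_bar - 1.0) / a * b
    return a_bar, b_bar


def discretize_euler(a, b, delta):
    """Scalar forward Euler: a_bar = 1 + delta*a, b_bar = delta*b."""
    return 1.0 + delta * a, delta * b


def run(a_bar, b_bar, xs, use_current_input):
    """Iterate h_t = a_bar*h_{t-1} + b_bar*x, with x = x_t or x_{t-1}; h_0 = 0."""
    h = 0.0
    hs = []
    for t in range(1, len(xs)):
        x = xs[t] if use_current_input else xs[t - 1]
        h = a_bar * h + b_bar * x
        hs.append(h)
    return hs


def max_gap(delta, discretize=discretize_zoh, a=-1.0, b=1.0, t_end=5.0):
    """Largest difference between the x_{t-1} and x_t recurrences on [0, t_end]."""
    n = int(round(t_end / delta))
    xs = [math.sin(k * delta) for k in range(n + 1)]  # assumed smooth input
    a_bar, b_bar = discretize(a, b, delta)
    prev = run(a_bar, b_bar, xs, use_current_input=False)
    curr = run(a_bar, b_bar, xs, use_current_input=True)
    return max(abs(p - c) for p, c in zip(prev, curr))


if __name__ == "__main__":
    for delta in (0.1, 0.01, 0.001):
        print(delta, max_gap(delta), max_gap(delta, discretize_euler))
```

In this setup the gap between the two versions shrinks roughly in proportion to $\Delta$ for both ZOH and Euler, so here the $x_{t-1}$ vs. $x_t$ difference behaves like an indexing convention rather than something specific to one discretization method.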