Closed jeromeku closed 9 months ago
Hi, I added a comment here: https://github.com/johnma2006/mamba-minimal/blob/master/model.py#L307

$B$ uses a simplified Euler discretization instead of zero-order hold (ZOH). According to the authors, "performance doesn't change much with the simplification on B" (from a discussion I had with Albert).
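A brief sketch of why the two rules agree for small step sizes. This is the standard Taylor expansion of the matrix exponential, not something quoted from the paper or the repo:

$$(\Delta A)^{-1} (\exp(\Delta A) - I) = I + \tfrac{1}{2} \Delta A + \tfrac{1}{6} (\Delta A)^2 + \cdots$$

so the ZOH expression becomes

$$\overline{B} = \left(I + \tfrac{1}{2} \Delta A + \cdots\right) \Delta B \approx \Delta B$$

Keeping only the leading term $I$ gives the Euler rule $\overline{B} = \Delta B$. The dropped terms are of order $\Delta A$, so the approximation is close whenever $\Delta A$ is small.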
@johnma2006 Thanks for the clarification!
Thanks for the clear implementation!
Can you explain the discretization of $B$ in `selective_scan`? Equation 4 in Section 2 of the paper states $$\overline{B} = (\Delta A)^{-1} (\exp(\Delta A) - I) \cdot \Delta B$$
In your implementation, the input is mapped into the hidden state by the following:
which, if I understand correctly, implies that $\overline{B} = \Delta B$?
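To get a feel for how far apart the two rules are, here is a minimal NumPy sketch comparing ZOH and Euler for a diagonal $A$. The function names (`discretize_B_zoh`, `discretize_B_euler`, `max_rel_gap`) and the values of `A`, `B`, and `delta` are illustrative assumptions and are not taken from the repo:

```python
import numpy as np

def discretize_B_zoh(delta, A, B):
    """ZOH rule for a diagonal A (stored as a 1-D array of nonzero entries).

    Elementwise form of (delta*A)^{-1} (exp(delta*A) - 1) * delta*B.
    """
    dA = delta * A
    # np.expm1(x) computes exp(x) - 1 accurately when x is close to zero.
    return np.expm1(dA) / dA * (delta * B)

def discretize_B_euler(delta, B):
    """Simplified Euler rule: B_bar = delta * B."""
    return delta * B

def max_rel_gap(delta, A, B):
    """Largest relative difference between the Euler and ZOH results."""
    zoh = discretize_B_zoh(delta, A, B)
    euler = discretize_B_euler(delta, B)
    return np.max(np.abs(euler - zoh) / np.abs(zoh))

# Illustrative values only: a negative diagonal A and a constant B.
A = -np.arange(1.0, 5.0)
B = np.ones(4)

for delta in (0.001, 0.01, 0.1):
    print(f"delta={delta}: max relative gap = {max_rel_gap(delta, A, B):.4f}")
```

With these assumed values the gap shrinks as `delta` shrinks, which is consistent with the Euler rule being the first-order approximation of ZOH.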