johnma2006 / mamba-minimal

Simple, minimal implementation of the Mamba SSM in one file of PyTorch.
Apache License 2.0
2.62k stars, 191 forks

Parameter Initialization #17

Closed (dashstander closed this issue 7 months ago)

dashstander commented 9 months ago

Hey! Thanks so much for putting this together. You mention that the parameter initialization isn't correct. Can you point me towards what it should be? I've tried looking at the official repo and didn't see anything out of the ordinary.

johnma2006 commented 9 months ago

So sorry for the late reply! I didn’t look carefully into the official repo’s initialization scheme, so I just assumed that it differed from my repo. But if you noticed they are in fact equal, then I’ll take a closer look in a bit and remove my disclaimer. Thank you!