TTT layer in Mamba backbone

test-time-training / ttt-lm-pytorch

Official PyTorch implementation of Learning to (Learn at Test Time): RNNs with Expressive Hidden States

MIT License

1.01k stars, 56 forks

Closed: YongLD closed this issue 7 hours ago

YongLD commented 1 week ago

Is there any code for using the TTT layer inside the Mamba backbone? If not, what should be considered when mapping Mamba's A, B, and C parameters onto the Q, K, and V projections in TTT?

karan-dalal commented 7 hours ago

You can take a look at our JAX codebase to see how our Mamba backbone is implemented. https://github.com/test-time-training/ttt-lm-jax
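
For readers comparing the two notations, the roles of Q, K, and V in a TTT-Linear layer can be sketched in plain Python as below. This is a simplified illustration, not code from either repository: it is a naive per-token loop with a fixed learning rate, and it leaves out the LayerNorm, residual connection, learned learning rate, and mini-batch updates described in the paper. The names `matvec`, `ttt_linear_scan`, and `lr` are hypothetical. Under these assumptions, expanding the update gives W_t = W_{t-1}(I - 2*lr*k k^T) + 2*lr*v k^T, so K writes into the state (loosely like B in an SSM), Q reads from it (loosely like C), and the decay factor is built from K rather than being a separate A parameter.

```python
# Naive per-token TTT-Linear sketch (assumption: no LayerNorm, no residual,
# no mini-batching, fixed scalar learning rate). Not the official code.

def matvec(M, x):
    """Multiply matrix M (list of rows) by vector x."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]


def ttt_linear_scan(xs, theta_q, theta_k, theta_v, lr=0.1):
    """Run a linear inner model over a sequence, updating it at every token.

    xs       : list of input vectors
    theta_*  : projection matrices producing the Q, K, V views of each token
    lr       : inner-loop learning rate (hypothetical fixed value)
    """
    d = len(theta_k)                      # dimension of the projected views
    W = [[0.0] * d for _ in range(d)]     # hidden state = inner model weights
    outputs = []
    for x in xs:
        q = matvec(theta_q, x)            # test view: reads from the state
        k = matvec(theta_k, x)            # training view: input to inner model
        v = matvec(theta_v, x)            # label view: reconstruction target

        # Inner loss ||W k - v||^2 has gradient 2 (W k - v) k^T w.r.t. W.
        err = [p - t for p, t in zip(matvec(W, k), v)]
        W = [[W[i][j] - lr * 2.0 * err[i] * k[j] for j in range(d)]
             for i in range(d)]

        # Output uses the state *after* the update on the current token.
        outputs.append(matvec(W, q))
    return outputs


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    seq = [[1.0, 0.0], [0.0, 1.0]]
    print(ttt_linear_scan(seq, identity, identity, identity, lr=0.5))
```

With identity projections and unit-norm inputs, `lr=0.5` makes each gradient step fit the current token exactly, which is a convenient sanity check. How the Mamba backbone actually wires its projections, convolutions, and gating is specific to the JAX codebase linked above and is not reproduced here.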