Closed by alstonlo 1 month ago
We have worked on it internally, but we are not releasing the code because we haven't tested it extensively. We'd love the community's contributions! As a side note, the conv1d in Hydra is not causal but bidirectional, which is different from Mamba :)
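To make the causal versus bidirectional distinction concrete, here is a minimal pure-Python sketch of a single-channel 1D convolution. It is only an illustration, not the Hydra or Mamba code: the helper name `depthwise_conv1d` is hypothetical, and the non-causal branch assumes symmetric zero padding, which may differ from the padding the actual layers use.

```python
def depthwise_conv1d(x, w, causal):
    """Slide kernel w over one channel x (hypothetical helper, illustration only).

    The non-causal branch assumes an odd-length kernel so padding is symmetric.
    """
    k = len(w)
    # Causal: pad k-1 zeros on the left only, so y[t] depends on x[t-k+1..t].
    # Non-causal: pad evenly on both sides, so y[t] also depends on later inputs.
    left = k - 1 if causal else (k - 1) // 2
    right = 0 if causal else k - 1 - left
    padded = [0.0] * left + list(x) + [0.0] * right
    return [sum(w[j] * padded[t + j] for j in range(k)) for t in range(len(x))]


x = [0.0, 0.0, 1.0, 0.0, 0.0]  # unit impulse at position 2
w = [1.0, 2.0, 3.0]

causal_out = depthwise_conv1d(x, w, causal=True)
bidir_out = depthwise_conv1d(x, w, causal=False)

print(causal_out)  # [0.0, 0.0, 3.0, 2.0, 1.0]
print(bidir_out)   # [0.0, 3.0, 2.0, 1.0, 0.0]
```

In the causal output the impulse at position 2 only affects positions 2 and later. In the non-causal output it also reaches position 1, so a fused kernel that assumes causal padding cannot be reused for the bidirectional case without changes.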
Thanks, I have created a PR!
Awesome, thank you for the PR!
Thanks for merging!
Thank you! Sorry for taking a while to merge.
Hi authors,
Thanks for all the amazing work!
I was wondering if there were any plans to support a memory-efficient implementation of Hydra, similar to the `mem_eff` path in Mamba2. As a workaround, I have written a preliminary implementation by extending the `mamba_split_conv1d_scan_combined()` function used in Mamba2. I would be excited to contribute it through a draft PR, if it would be helpful!