Andron00e closed 4 months ago
Hi! We have results for Jamba (a hybrid base model combining Mamba and Transformer layers) in our paper. We also evaluated this Mamba model, https://huggingface.co/state-spaces/mamba-2.8b-slimpj, which was not trained with long sequence lengths. If you are interested in Mamba-related models, I recommend reading this paper: https://arxiv.org/pdf/2406.07887. It reports results using RULER!
Hi! Are there any results available for state space models?