-
Hello, I have been using unsloth for my fine-tuning purposes and am really enjoying the framework so far!
I just wanted to know if you could add support for loading and training state space models l…
-
Hi, I am benchmarking inference speed on long sequences and encountering CUDA-related errors specifically with the Mamba2 models at longer sequence lengths (>200k). This issue does not occur with Mamb…
-
Hello AnFreTh,
Thank you for your work on this project. I am currently using Mambular to process tabular data, but I am experiencing very slow training speeds. On average, each epoch is taking arou…
-
I have set up the environment successfully, but when I run `lm_eval --model mamba_ssm --model_args pretrained=state-spaces/mamba-130m --tasks lambada_openai,hellaswag,piqa,arc_easy,arc_challenge,winogra…
-
### Feature Description
Mistral has just released a new 7B coding model.
- **Blog Post**: https://mistral.ai/news/codestral-mamba/
- **HF**: https://huggingface.co/mistralai/mamba-codestral-7B…
-
### Python -VV
```shell
(codestral) ➜ dev python -VV
Python 3.10.14 (main, May 6 2024, 19:42:50) [GCC 11.2.0]
```
### Pip Freeze
```shell
(codestral) ➜ dev pip freeze
absl-py==2.1.0
addict==…
-
This page is accessible via [roadmap.vllm.ai](https://roadmap.vllm.ai)
### Themes
As before, we categorized our roadmap into 6 broad themes: broad model support, wide hardware coverage, state of…
-
I'm trying to train mamba2 130m from scratch.
```python
config = Mamba2Config(
    vocab_size=len(tokenizer.vocab),
    n_positions=10,
    n_embd=768,
    …
```
-
Hi!
I've been exploring the Mamba architecture with great interest, especially its computational efficiencies compared to traditional Transformer models. The selective state space approach and the …
-
Dear Mamba Contributors,
I hope this message finds you well. I am in the process of utilising the Mamba state space architecture for a language modelling task and have been highly impressed with th…