-
There has been a completed merge of Mamba model support over at llama.cpp; would it be possible to implement this in Ollama as well?
Merged PR: https://github.com/ggerganov/llama.cpp/pull/5328
…
-
# Mamba: Selective State Space Modeling | Nathan's Notes
An introduction to Mamba models: faster and better* than transformers
[https://nathanzhao.cc/mamba](https://nathanzhao.cc/mamba)
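As background for the items below, the core of a Mamba-style model is a discretized linear state-space recurrence, h_t = Ā·h_{t-1} + B̄·x_t with readout y_t = C·h_t (Mamba makes the parameters input-dependent, i.e. "selective"). A minimal scalar sketch, with illustrative parameter values that are not taken from any trained model:

```python
# Minimal scalar sketch of the discretized SSM recurrence that
# Mamba-style selective SSMs build on:
#   h_t = A_bar * h_{t-1} + B_bar * x_t
#   y_t = C * h_t
# The values a_bar, b_bar, c below are illustrative defaults only.

def ssm_scan(xs, a_bar=0.9, b_bar=0.1, c=1.0):
    """Run the linear state-space recurrence over an input sequence."""
    h = 0.0
    ys = []
    for x in xs:
        h = a_bar * h + b_bar * x  # state update
        ys.append(c * h)           # readout
    return ys

# Impulse response decays geometrically:
print(ssm_scan([1.0, 0.0, 0.0]))  # [0.1, 0.09, 0.081]
```

Because the recurrence is linear in h, it can be evaluated as a parallel scan at training time, which is the efficiency trick the blog post above discusses.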
-
You have included Mamba in the leaderboard and survey paper, but this model is not intended for time series analysis; it is better suited for language modeling. There are several state-space models sp…
-
Hi, I would like to ask how to convert a Mamba2 model to ONNX for inference. When I try to convert, I encounter an error at `cols = tl.arange(0, BLOCK_N)`, where BLOCK_N=min(MAX_FUSED_SIZE, triton.next_power_o…
-
Hello, it always shows `Mamba.__init__() got an unexpected keyword argument 'layer_idx'`.
I tried the solution from https://github.com/state-spaces/mamba/issues/315, but it doesn't work.
The problem has b…
-
### System Info
Python 3.10.13, CUDA 12.1
GPU = NVIDIA GeForce RTX 2080 Ti. Max memory = 10.747 GB.
torch==2.2.1
torchaudio==2.1.0
torchvision==0.16.0
tokenizers==0.15.2
transformers==git+ht…
-
Have you ever encountered a situation where the loss becomes NaN after a few iterations when changing MambaV2 to the "non-causal" form?
I have also tried setting "IS_CAUSAL" to False, but this doesn't…
-
### System Info
transformers version = 4.44.2
### Reproduction
1. Run script:
```
config = AutoConfig.from_pretrained('state-spaces/mamba-130m')
model = MambaForCausalLM(config)
model.to(de…
-
Hi, when running example inference on Mamba2:
```
python benchmarks/benchmark_generation_mamba_simple.py --model-name "state-spaces/mamba2-2.7b" --prompt "My cat wrote all this CUDA code for a new …
-
When running
`python evals/lm_harness_eval.py --model mamba --model_args pretrained=state-spaces/mamba-130m --tasks lambada_openai,hellaswag,piqa,arc_easy,arc_challenge,winogrande --device cuda --…