I’m trying to use this crate for a project of mine, but it crashes as soon as a model has been loaded.
Expected behaviour
After loading the model, I can use it.
Actual behaviour
The model loads, but the program crashes immediately afterwards with the following output:
Loaded tensor 8/201
Loaded tensor 16/201
Loaded tensor 24/201
Loaded tensor 32/201
Loaded tensor 40/201
Loaded tensor 48/201
Loaded tensor 56/201
Loaded tensor 64/201
Loaded tensor 72/201
Loaded tensor 80/201
Loaded tensor 88/201
Loaded tensor 96/201
Loaded tensor 104/201
Loaded tensor 112/201
Loaded tensor 120/201
Loaded tensor 128/201
Loaded tensor 136/201
Loaded tensor 144/201
Loaded tensor 152/201
Loaded tensor 160/201
Loaded tensor 168/201
Loaded tensor 176/201
Loaded tensor 184/201
Loaded tensor 192/201
Loaded tensor 200/201
Loading of model complete
Model size = 745.81 MB / num tensors = 201
ggml_new_object: not enough space in the context's memory pool (needed 184549744, available 2097152)
zsh: segmentation fault cargo run
I tried a few different models; all showed the same behaviour.
Setup
OS: macOS 14.1.4 (M1)
Rust version: 1.76.0
Crate: I tried the main branch as well as the gguf branch with the same results.
Edit: I found the ModelParameters struct with the context_size field. Unfortunately, increasing this value doesn’t change anything about the error; even the displayed available memory stays exactly the same.
I also tried setting prefer_mmap to false, as this is suggested for resource-constrained environments. This actually gets rid of the aforementioned error, but it throws a “non-specific I/O error” instead.
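For reference, this is roughly how I’m overriding the parameters and loading the model. The ModelParameters struct with its context_size and prefer_mmap fields is from the crate; the model path, architecture, and the exact shape of the load call are placeholders from my side and may not match my real code exactly:

```rust
use std::path::Path;

fn main() {
    // Override the defaults; these are the two fields I experimented with.
    let params = llm::ModelParameters {
        context_size: 2048, // tried various values, see Edit 2 below
        prefer_mmap: true,  // also tried false
        ..Default::default()
    };

    // Placeholder path and architecture; load_dynamic shape as I understand
    // the crate's API, not verified against my exact branch.
    let _model = llm::load_dynamic(
        Some(llm::ModelArchitecture::Llama),
        Path::new("path/to/model.bin"),
        llm::TokenizerSource::Embedded,
        params,
        llm::load_progress_callback_stdout,
    )
    .unwrap_or_else(|e| panic!("failed to load model: {e}"));
}
```

The crash happens right after this load call returns, as soon as inference starts.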
Edit 2: Actually, reducing the context_size decreases the “needed” memory. I got it to almost reach the “available” memory value by setting it very low, but then it went up again:
context_size: 24 leads to needed 2163056, available 2097152,
context_size: 20 leads to needed 3605248, available 2097152,
context_size: 16 leads to needed 2884352, available 2097152,
context_size: 12 leads to needed 2163456, available 2097152,
context_size below 12 leads to the error Failed to ingest initial prompt.: ContextFull.