$ ./build/bin/llama-bench -m ~/Downloads/mamba-2.8b-q4_0.gguf -ngl 0
| model | size | params | backend | threads | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | ------: | ------------: | -------------------: |
/Users/molly/llama.cpp/ggml/src/ggml-backend.cpp:745: pre-allocated tensor in a backend that cannot run the operation
[1] 13345 abort ./build/bin/llama-bench -m ~/Downloads/mamba-2.8b-q4_0.gguf -ngl 0
$ ./build/bin/llama-bench -m /Volumes/grouped/Models/rwkv/v6-Finch-7B-HF/v6-Finch-7B-HF-Q4_0.gguf -ngl 0
| model | size | params | backend | threads | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | ------: | ------------: | -------------------: |
/Users/molly/llama.cpp/ggml/src/ggml-backend.cpp:745: pre-allocated tensor in a backend that cannot run the operation
[1] 16003 abort ./build/bin/llama-bench -m -ngl 0
Tracing the error with lldb shows that it fails in ggml_backend_sched_backend_id_from_cur, at this check:
    if (tensor->buffer || (tensor->view_src && tensor->view_src->buffer)) {
        // since the tensor is pre-allocated, it cannot be moved to another backend
        GGML_ABORT("pre-allocated tensor in a backend that cannot run the operation");
    }
The tensor that triggered the abort was a view of cache_k_l0. This makes sense, as both mamba and rwkv do a GGML_VIEW/GGML_RESHAPE on the k cache when building the graph.
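To illustrate why a view of cache_k_l0 trips the check quoted above, here is a minimal standalone sketch of just that condition. The mock_buffer and mock_tensor structs and the is_preallocated helper are hypothetical stand-ins written for this report, not the real ggml types or API; they model only the two fields the condition reads (buffer and view_src). Judging by the abort message, the real scheduler only aborts when the backend that owns the buffer also cannot run the operation, which this sketch does not model.

```cpp
#include <cassert>

// Hypothetical stand-ins for the ggml types: only the fields the check reads.
struct mock_buffer {
    const char * name;
};

struct mock_tensor {
    const char *  name;
    mock_buffer * buffer;    // non-null once the tensor has backing memory
    mock_tensor * view_src;  // non-null when this tensor is a view of another tensor
};

// Same condition as the quoted check: a tensor counts as pre-allocated if it,
// or the tensor it is a view of, already owns a buffer.
static bool is_preallocated(const mock_tensor & t) {
    return t.buffer != nullptr ||
           (t.view_src != nullptr && t.view_src->buffer != nullptr);
}

// Small walk-through of the three cases relevant to this report.
static void example() {
    mock_buffer kv_buf     = { "kv-cache buffer" };
    mock_tensor cache_k_l0 = { "cache_k_l0", &kv_buf, nullptr };
    mock_tensor k_view     = { "view of cache_k_l0", nullptr, &cache_k_l0 };
    mock_tensor scratch    = { "graph intermediate", nullptr, nullptr };

    assert(is_preallocated(cache_k_l0));  // owns a buffer directly
    assert(is_preallocated(k_view));      // no buffer of its own, but its source has one
    assert(!is_preallocated(scratch));    // free to be assigned to any backend
}
```

The view itself has no buffer, yet it still counts as pre-allocated through view_src, which matches what lldb showed for the tensor that triggered the abort.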
CC @compilade
Name and Version
Non-working version:
$ ./build/bin/llama-cli -v
build: 4098 (772703c8) with Apple clang version 16.0.0 (clang-1600.0.26.4) for arm64-apple-darwin24.1.0
Known working version:
$ ./build/bin/llama-cli -v
build: 4079 (4a8ccb37) with Apple clang version 15.0.0 (clang-1500.3.9.4) for arm64-apple-darwin23.6.0
What operating system are you seeing the problem on?
Linux, Mac
Relevant log output
No response