I am using a MacBook Pro M2, and when running

./llama-llava-cli -m ../MobileVLM-1.7B/ggml-model-q4_k.gguf \
    --mmproj ../MobileVLM-1.7B/mmproj-model-f16.gguf \
    --image ../paella.jpg \
    -p "A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. USER: \nWho is the author of this book? Answer the question using a single word or phrase. ASSISTANT:"

I get this error:
ggml_metal_init: recommendedMaxWorkingSetSize = 11453.25 MB
llama_kv_cache_init: Metal KV buffer size = 384.00 MiB
llama_new_context_with_model: KV self size = 384.00 MiB, K (f16): 192.00 MiB, V (f16): 192.00 MiB
llama_new_context_with_model: CPU output buffer size = 0.12 MiB
llama_new_context_with_model: Metal compute buffer size = 84.00 MiB
llama_new_context_with_model: CPU compute buffer size = 8.01 MiB
llama_new_context_with_model: graph nodes = 774
llama_new_context_with_model: graph splits = 2
ggml_metal_graph_compute_block_invoke: error: unsupported op 'HARDSWISH'
GGML_ASSERT: ggml/src/ggml-metal.m:934: !"unsupported op"
zsh: abort ./llama-llava-cli -m ../MobileVLM-1.7B/ggml-model-q4_k.gguf --mmproj --image