GGML_ASSERT: /build/go-llama/llama.cpp/ggml-cuda.cu:6670: src0->type == GGML_TYPE_F16

My operating system is Centos7 I download and use quay.io/go-skynet/local-ai:master-cublas-cuda12 So, I installed CDUA on the operating system that matches this image I user ggml-model-q4_0.gguf（Llama2-13B-chat） yaml file in the model folder I use Debug mode to run the image of LocalAI An error occurred when I used Postman to send inference The error message for LocalAI is as follows

I can reason normally by running llama.cpp separately in the container But there was an error running the test under go lama ...... @deadprogram @mauromorales @jrc2139 @soleblaze

go-skynet / go-llama.cpp

GGML_ASSERT: /build/go-llama/llama.cpp/ggml-cuda.cu:6670: src0->type == GGML_TYPE_F16 #261