Closed Links17 closed 2 months ago
When I run LLaVA on an NVIDIA Jetson AGX Orin (CUDA 8.7), it is very slow. It looks like it is using the CPU rather than the GPU!
It takes about 20 s.
wasmedge --dir .:. --env mmproj=mmproj-model-f16.gguf --env image=monalisa.jpg --env n-gpu-layers=35 --env threads=40 --nn-preload default:GGML:AUTO:ggml-model-q5_k.gguf wasmedge-ggml-llava.wasm default
Hi @Links17, you should use n_gpu_layers (with underscores) instead of n-gpu-layers.
See: https://github.com/second-state/WasmEdge-WASINN-examples/blob/master/wasmedge-ggml/llava/src/main.rs#L50
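For anyone hitting the same symptom, here is a minimal sketch of why the hyphenated name fails silently. It assumes the example looks up the environment variable under the exact name n_gpu_layers and falls back to 0 offloaded layers when it is absent. The helper gpu_layers_from is hypothetical and is not copied from the linked main.rs. With that assumption, passing --env n_gpu_layers=35 in the command above is what enables offloading.

```rust
use std::env;

// Hypothetical helper, not taken from the linked main.rs: turn the raw value
// of the environment variable into a layer count, falling back to 0
// (assumed here to mean "no layers offloaded to the GPU").
fn gpu_layers_from(raw: Option<String>) -> u64 {
    raw.and_then(|v| v.trim().parse::<u64>().ok()).unwrap_or(0)
}

fn main() {
    // Only the underscore spelling is looked up, so a value passed as
    // `--env n-gpu-layers=35` is never read and the fallback applies.
    let layers = gpu_layers_from(env::var("n_gpu_layers").ok());
    println!("n_gpu_layers = {layers}");
}
```

Under this sketch, a misspelled variable name produces no error at all: inference simply runs with the fallback value, which matches the slow, CPU-bound behaviour reported above.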
Thanks a lot for the answer, it was my mistake.