Previously, the --ngl flag never took effect for llama-bench.
This change fixes this behavior, allowing the GPU to be used.
This change does not yet fix the outstanding bug mentioned in #577, in which not all of the GPU layers are offloaded. That bug is fixed by #534.