Closed: leighklotz closed this issue 2 months ago
https://github.com/Mozilla-Ocho/llamafile/blob/c39d30c5306432eedebf58bdee6424152a613674/llamafile/cuda.c#L912 https://github.com/Mozilla-Ocho/llamafile/blob/c39d30c5306432eedebf58bdee6424152a613674/llamafile/cuda.c#L913
$ echo 2+3= | /path/to/llamafile -m /path/to/Mistral-7B-Instruct-v0.3-Q6_K.gguf --cli --gpu nvidia -ngl 33 -c 4096 --repeat-penalty 1 -t 10 -f /dev/stdin --silent-prompt --no-display-prompt --log-disable --seed -1
FLAG_nocompile 0
FLAG_recompile 0
5. ...
This was fixed before the 0.8.13 release went out. Thanks for the report.