giladgd closed this pull request 6 months ago
:tada: This PR is included in version 3.0.0-beta.15 :tada:
The release is available on: v3.0.0-beta.15
Your semantic-release bot :package::rocket:
:tada: This PR is included in version 3.0.0 :tada:
The release is available on:
Your semantic-release bot :package::rocket:
Description of change
* `inspect gguf` command to inspect `gguf` files
* `inspect measure` command
* `readGgufFileInfo` function
* `LlamaModel`: automatic default values for `gpuLayers` and `contextSize`. No manual configuration of those options is needed anymore to maximize performance
* `JinjaTemplateChatWrapper`
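The new `readGgufFileInfo` function and the `inspect gguf` command read metadata out of a `gguf` file. As a rough, self-contained illustration of what such parsing involves, the sketch below decodes just the fixed GGUF header fields (magic, version, tensor count, metadata key-value count) from an in-memory buffer. All names here are illustrative, not this PR's API, and the real function also parses the full metadata table:

```typescript
// Hedged sketch - not the PR's implementation.
// Decodes the fixed-size GGUF header fields from a buffer.
function readGgufHeader(buf: Buffer) {
    const magic = buf.toString("ascii", 0, 4); // should be "GGUF"
    const version = buf.readUInt32LE(4);       // gguf format version
    // tensor count and metadata KV count are uint64 in the format;
    // for simplicity this sketch reads only the low 32 bits of each
    const tensorCount = buf.readUInt32LE(8);
    const metadataKvCount = buf.readUInt32LE(16);
    return {magic, version, tensorCount, metadataKvCount};
}

// Build a fake header in memory for demonstration
const header = Buffer.alloc(24);
header.write("GGUF", 0, "ascii");
header.writeUInt32LE(3, 4);   // version 3
header.writeUInt32LE(291, 8); // tensor count (low 32 bits)
header.writeUInt32LE(24, 16); // metadata key-value count (low 32 bits)

console.log(readGgufHeader(header));
```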
* read the `tokenizer.chat_template` header from the `gguf` file when available - use it to find a better specialized chat wrapper or use `JinjaTemplateChatWrapper` with it as a fallback
* `resolveChatWrapper` function
* `chat`, `complete` and `infill` commands
* fix: `llama.cpp` CUDA flag

Fixes #133
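The chat-wrapper fallback described above can be sketched as follows. This is a hedged illustration of the selection logic only, assuming a hypothetical `pickChatWrapper` helper and a made-up known-template table - it is not the actual `resolveChatWrapper` implementation:

```typescript
// Illustrative names only - KNOWN_TEMPLATES and pickChatWrapper are not the PR's API.
const KNOWN_TEMPLATES = new Map<string, string>([
    // a recognized chat template text would map to a specialized wrapper
    ["<llama2 template text>", "Llama2ChatWrapper"]
]);

function pickChatWrapper(chatTemplate: string | undefined): string {
    if (chatTemplate == null)
        return "GeneralChatWrapper"; // no tokenizer.chat_template header in the gguf file

    const specialized = KNOWN_TEMPLATES.get(chatTemplate);
    if (specialized != null)
        return specialized; // recognized template -> better specialized chat wrapper

    return "JinjaTemplateChatWrapper"; // unknown template -> render it as a Jinja template
}

console.log(pickChatWrapper(undefined));                // no template available
console.log(pickChatWrapper("<llama2 template text>")); // recognized template
console.log(pickChatWrapper("{% custom template %}"));  // fallback case
```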
Pull-Request Checklist
* Code is up to date with the `master` branch
* `npm run format` to apply eslint formatting
* `npm run test` passes with this change
* This pull request links relevant issues as `Fixes #0000`