ngxson / wllama
WebAssembly binding for llama.cpp - Enabling on-browser LLM inference
https://huggingface.co/spaces/ngxson/wllama
MIT License
444 stars · 23 forks
implement KV cache reuse for completion #101
Closed: ngxson closed this issue 3 months ago
ngxson commented 3 months ago
Equivalent to the `prompt_cache` option on the llama.cpp server.
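The core idea behind prompt/KV-cache reuse is: when a new completion request shares a token prefix with the previous one, the KV cache entries for that prefix are still valid, so only the differing suffix needs to be decoded again. The sketch below illustrates this in TypeScript under assumed, hypothetical names (`commonPrefixLength`, `planCacheReuse`); it is not wllama's actual implementation.

```typescript
// Sketch of KV-cache reuse planning, assuming tokens are plain numbers.
// Hypothetical helpers for illustration only, not wllama's real API.

type Token = number;

/** Length of the shared token prefix between the cached prompt and the new one. */
function commonPrefixLength(cached: Token[], prompt: Token[]): number {
  const max = Math.min(cached.length, prompt.length);
  let n = 0;
  while (n < max && cached[n] === prompt[n]) n++;
  return n;
}

/**
 * Decide how many KV entries can be kept and which tokens still
 * need to be evaluated for the new prompt.
 */
function planCacheReuse(cached: Token[], prompt: Token[]) {
  const keep = commonPrefixLength(cached, prompt);
  return {
    nKeep: keep,                    // KV cache stays valid up to this position
    toEvaluate: prompt.slice(keep), // only these tokens must be decoded again
  };
}

// Example: a chat follow-up shares a long prefix with the previous request.
const previous = [1, 15, 22, 90, 7, 3];
const next = [1, 15, 22, 90, 8, 4, 5];
const plan = planCacheReuse(previous, next);
console.log(plan.nKeep);      // 4
console.log(plan.toEvaluate); // [ 8, 4, 5 ]
```

In a real binding, `nKeep` would translate into trimming the llama.cpp KV cache past that position and feeding only `toEvaluate` to the decoder, which is what makes repeated chat completions with a shared system prompt cheap.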