Support for T5 and Flan-T5 models has just been merged into llama.cpp 🎉
I tried updating llama.cpp in a local copy of wllama, compiling it, and loading https://huggingface.co/Felladrin/gguf-LaMini-Flan-T5-248M/resolve/main/LaMini-Flan-T5-248M.Q6_K.gguf. The model loads fine, but when I click to run it, it throws an error.
I also tried editing example/index.html directly so that it loads the model from the HF URL, but it hit the same problem.
Sharing it here in case someone has any ideas.
Currently wllama only uses llama_decode, but T5 introduces a new API, llama_encode, which we haven't implemented yet (because T5 is an encoder-decoder architecture). This will be added in the next version.
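For anyone curious how the two calls fit together: below is a minimal C++ sketch of encoder-decoder generation against llama.cpp, based on the API around the time T5 support was merged (llama_encode, llama_model_decoder_start_token, and the four-argument llama_batch_get_one). It's an illustration under those assumptions, not wllama code, and the exact batch helpers may differ between llama.cpp versions.

```cpp
#include "llama.h"
#include <vector>

// Sketch only: encoder-decoder generation with llama.cpp's T5-era API.
// Assumes an already-loaded model/context; sampling is plain greedy.
static void generate_t5(llama_context * ctx, llama_model * model,
                        std::vector<llama_token> & prompt, int max_new_tokens) {
    // 1. Encoder pass: feed the whole prompt through llama_encode once.
    //    (This is the call wllama does not make yet.)
    llama_batch enc = llama_batch_get_one(prompt.data(), (int32_t) prompt.size(), 0, 0);
    if (llama_encode(ctx, enc) != 0) {
        return; // encoder failed
    }

    // 2. Seed the decoder with the model's decoder-start token,
    //    falling back to BOS if the model does not define one.
    llama_token tok = llama_model_decoder_start_token(model);
    if (tok == -1) {
        tok = llama_token_bos(model);
    }

    // 3. Regular autoregressive loop via llama_decode, same as decoder-only models.
    for (int pos = 0; pos < max_new_tokens; ++pos) {
        llama_batch dec = llama_batch_get_one(&tok, 1, pos, 0);
        if (llama_decode(ctx, dec) != 0) {
            return; // decoder failed
        }

        // Greedy pick of the next token from the last logits.
        const float * logits = llama_get_logits(ctx);
        const int n_vocab = llama_n_vocab(model);
        llama_token best = 0;
        for (int i = 1; i < n_vocab; ++i) {
            if (logits[i] > logits[best]) {
                best = i;
            }
        }
        if (best == llama_token_eos(model)) {
            break; // end of generation
        }
        tok = best;
        // ... detokenize and emit `tok` here ...
    }
}
```

The key difference from decoder-only models is step 1: without that llama_encode pass, the decoder has no encoder output to cross-attend to, which is why pushing a T5 GGUF through the current decode-only path fails.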