-
Any tips on running this with JavaScript inside a web browser?
As an experiment I tried running the GGUF version with [Wllama](https://github.com/ngxson/wllama), and `I like big books` resulted in …
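For context, a minimal sketch of the kind of Wllama embedding call being described here (the GGUF URL and the WASM paths are placeholders, and the exact load options may differ between wllama versions):
```
import { Wllama } from '@wllama/wllama';

// Paths to the wllama WASM binaries (placeholders; adjust to your build/version)
const CONFIG_PATHS = {
  'single-thread/wllama.wasm': './esm/single-thread/wllama.wasm',
  'multi-thread/wllama.wasm': './esm/multi-thread/wllama.wasm',
};

const wllama = new Wllama(CONFIG_PATHS);

// Placeholder URL; point it at the embedding-capable GGUF you converted
await wllama.loadModelFromUrl('https://example.com/model-q8_0.gguf');

// createEmbedding returns the pooled embedding vector for the input text
const vector = await wllama.createEmbedding('I like big books');
console.log(vector.length, vector.slice(0, 5));
```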
-
Wllama is a browser-based version of Llama.cpp with low-level capabilities, and has a built-in embedding option too.
https://github.com/ngxson/wllama
While WebLLM only runs on WebGPU-enabled bro…
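A simple way to decide between the two at runtime is to feature-detect WebGPU and fall back to the WASM path. A sketch (the two loader functions are hypothetical app-level placeholders):
```
// WebLLM needs a working WebGPU adapter; Wllama (WASM) does not
async function hasWebGPU() {
  if (!('gpu' in navigator)) return false;
  try {
    const adapter = await navigator.gpu.requestAdapter();
    return adapter !== null;
  } catch {
    return false;
  }
}

// loadWithWebLLM / loadWithWllama are hypothetical loaders in your app
const load = (await hasWebGPU()) ? loadWithWebLLM : loadWithWllama;
await load();
```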
-
@ngxson @felladrin
I just wanted to quickly say thank you to both of you for your amazing work, support, and even the amazing upstream fixes, like the Phi3 one in Llama.cpp. Because it's finally r…
-
Please create the following browser WASM demos:
1) Stable Diffusion with W8A8 quantization: this is important because the Stable Diffusion [demo](https://intel.github.io/web-ai-showcase/) which I sa…
-
Whenever I try to load it, it crashes Chrome.
This is on a Pixel 6a with 6 GB of RAM.
- Context is set to 1K.
- 16-bit WebGPU is available.
- Using the latest version of WebLLM from the CDN.
To make…
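For reference, a rough sketch of loading WebLLM from the CDN with a reduced context window (the model id is just an example, and passing `context_window_size` as a chat-options override is my assumption from the WebLLM docs):
```
import * as webllm from "https://esm.run/@mlc-ai/web-llm";

// Example model id; substitute the model that is actually crashing
const engine = await webllm.CreateMLCEngine(
  "Llama-3.2-1B-Instruct-q4f16_1-MLC",
  { initProgressCallback: (p) => console.log(p.text) },
  // Assumed override to keep the KV cache small on a 6 GB phone
  { context_window_size: 1024 }
);
```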
-
Hi,
llama.cpp's most recent approach of handling LoRA without merging the weights, which allows for hot-swappable adapters, sounds very interesting for in-browser use cases. Are there any plan…
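For anyone unfamiliar with the feature, this is roughly what the hot-swap looks like against a native llama.cpp server today (the endpoint and payload shape are my reading of the llama-server API; nothing like this is exposed in the browser yet):
```
// Server started with:  llama-server -m base.gguf --lora adapter.gguf
// GET lists the loaded adapters and their current scales
const adapters = await fetch('http://localhost:8080/lora-adapters').then(r => r.json());
console.log(adapters); // e.g. [{ id: 0, path: "adapter.gguf", scale: 1.0 }]

// POST changes the scale at runtime: 0 effectively unplugs the adapter, 1.0 re-enables it
await fetch('http://localhost:8080/lora-adapters', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify([{ id: 0, scale: 0.0 }]),
});
```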
-
Noticed this error loading the Llama 1B and 3B models.
I'm updating Wllama now; hopefully that fixes it.
-
I verify that the exit function exists, but calling it results in the error above.
```
try {
  if (window.llama_cpp_model_being_loaded) {
    if (typeof window.llama_cpp_app.unloadModel === 'functi…
```
-
Hi, I basically copied and edited (only changing the path) the examples/basic on https://wllama-basic-example.glitch.me for my own needs (to tinker with WebXR), and I'm wondering if it could facilitate discov…
-
I would like to get a list of models from what is in the cache, to implement an "available local models" feature.
I saw that there is a `list` function in the cache manager but it is not very convenie…
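A rough sketch of what such a feature could look like, assuming the instance exposes the cache manager as `wllama.cacheManager` and that each `list()` entry carries a `name` derived from the original file/URL (both are assumptions):
```
// List everything wllama has cached and derive a crude "available local models" view
const entries = await wllama.cacheManager.list();

const localModels = entries
  .map((e) => e.name)                    // assumed: cache key embeds the original GGUF name/URL
  .filter((name) => name.includes('.gguf'));

console.log(localModels);
```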