-
### Describe the bug
Much like Bing deleting its own messages, the AI deletes its own message with 0 tokens generated. Much like #2204.
### Is there an existing issue for this?
- [X] I have searched the ex…
-
I'm from yesterday's thread. I'm not a programmer.
I saw your most recent change concerning the Touch button and pulled it, but still nothing happens when I send a message in the ooba webui. There's noth…
-
### System Info
When running on Kubernetes using the latest image and attempting to run through the tutorial with bloom-560m, it looks like PyTorch is unable to detect the GPU.
### Information
- [X…
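For the Kubernetes GPU report above, a minimal diagnostic sketch can confirm whether PyTorch sees a CUDA device inside the pod. The function name and messages below are my own illustration, not from the issue; as general background (not the reporter's confirmed setup), GPU visibility in Kubernetes also typically requires the NVIDIA device plugin and a GPU resource request on the pod:

```python
def gpu_report():
    """Return a short string describing whether PyTorch can see a CUDA GPU.

    Hypothetical helper for illustration; run it inside the pod to check
    whether the container actually has GPU access.
    """
    try:
        import torch  # requires a CUDA-enabled PyTorch build in the image
    except ImportError:
        return "torch not installed"
    if torch.cuda.is_available():
        return "CUDA available: " + torch.cuda.get_device_name(0)
    return "CUDA not available -- check the pod's GPU resource request and the node's driver"

print(gpu_report())
```

If this prints "CUDA not available" inside the pod but `nvidia-smi` works on the node, the gap is usually in the container runtime or device-plugin layer rather than in PyTorch itself.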
-
I was trying to do an apples-to-apples shootout on GPTQ vs the new llama.cpp k-quants (memory usage, speed, etc.) but ran into a bump with perplexity. It looks like exllama loads a jsonl formatted versi…
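A fair perplexity shootout hinges on both loaders scoring the same tokenized text with the same definition of the metric. As a neutral reference point (a sketch of the standard formula, not either project's implementation), perplexity is just the exponential of the mean per-token negative log-likelihood:

```python
import math

def perplexity(neg_log_likelihoods):
    """Perplexity = exp(mean per-token negative log-likelihood).

    Takes a list of per-token NLLs in nats, as a model would emit
    while scoring a fixed evaluation text.
    """
    return math.exp(sum(neg_log_likelihoods) / len(neg_log_likelihoods))

# Sanity check: if every token gets probability 1/4, the NLL per token
# is log(4) and the perplexity should come out to exactly 4.
nlls = [math.log(4.0)] * 10
print(perplexity(nlls))
```

Differences in tokenization, context window, or stride between two tools change the NLLs being averaged, so numbers are only comparable when those are held fixed.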
-
### Describe the bug
This particular model indicates it should be compatible on the card, with the same version of GPTQ (the branch here instead of the main triton one - I have tried using that too…
-
I tried WebLLM the other week and was really blown away. I have an Intel macOS system with AMD 6900XT GPU and using WebLLM was the first time I'd had decent GPU inference on this system.
Now I'd lo…
-
WHY did you guys end support for older Llama models? Why isn't backwards compatibility added when you change formats? This is what pisses me off about open source, it's absolute fraken chaos, things…