ILikeAI / AlwaysReddy

AlwaysReddy is an LLM voice assistant that is always just a hotkey away.
MIT License

Add TabbyAPI support #79

Closed Jobus0 closed 5 hours ago

Jobus0 commented 1 day ago

TabbyAPI is a local LLM server that, unlike Ollama and LM Studio, uses the ExLlamaV2 inference backend instead of llama.cpp. This gives it a much faster TTFT (time to first token) and slightly higher TPS (tokens per second), at the cost of supporting GPU inference only.

It is a good choice for people with decent GPUs who want to minimize response latency.
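Since TabbyAPI exposes an OpenAI-compatible API, support could plausibly reuse the same chat-completions request shape already used for other local servers. Below is a minimal sketch of building such a request; the base URL, default port, and model name are assumptions for illustration, not AlwaysReddy's actual implementation.

```python
import json

# Assumed TabbyAPI default endpoint (OpenAI-compatible, port 5000).
TABBY_BASE_URL = "http://127.0.0.1:5000/v1"

def build_chat_request(messages, model="local-model", stream=True):
    """Return (url, json_body) for an OpenAI-style chat completion request.

    Streaming is enabled by default, since low TTFT is TabbyAPI's main
    advantage for a voice assistant.
    """
    url = f"{TABBY_BASE_URL}/chat/completions"
    payload = {
        "model": model,       # hypothetical model name; TabbyAPI loads its own
        "messages": messages,
        "stream": stream,
    }
    return url, json.dumps(payload)

url, body = build_chat_request([{"role": "user", "content": "Hello"}])
print(url)
```

In practice the request would be sent with an HTTP client (and an API key header if TabbyAPI's auth is enabled); this sketch only shows that the payload format matches what an OpenAI-compatible client already produces.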