Closed (wwulfric closed this issue 3 months ago)
@wwulfric I did not add a shared state feature (through some external storage) because I did not need it yet, but I am all for it if there is a demand.
Do you ask because you need high availability or because you exhausted all the server resources (i.e., the traffic is so high that a single Paddler instance cannot handle it)?
If you need high availability, I would approach that similarly to how you would deploy HAProxy (an additional standby Paddler instance that starts to accept traffic if a primary Paddler instance fails).
If traffic volume is the issue, then shared storage might indeed be the solution, and I will add it.
I also see another option: stacking Paddler instances. They all have a /health endpoint compatible with llama.cpp, so you can set up a Paddler agent that observes a Paddler instance's /health endpoint instead of llama.cpp's. For example:
# --external-llamacpp-*: point these to the child Paddler balancer's reverse proxy
# --local-llamacpp-*: point these to the child Paddler balancer's health endpoint
# --management-*: point these to the parent Paddler
./paddler agent \
    --external-llamacpp-host 127.0.0.1 \
    --external-llamacpp-port 8088 \
    --local-llamacpp-host 127.0.0.1 \
    --local-llamacpp-port 8088 \
    --management-host 127.0.0.1 \
    --management-port 8085
For example, that way you can have three Paddler instances (you can also combine them with HA standby instances). The first and second instances can each manage half of your llama.cpp fleet. Your third instance can manage those two child Paddler instances. That limits the number of reports each Paddler instance has to accept from its agents. It will add another hop in the infrastructure, though.
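A rough sketch of that three-instance layout, in case it helps. It only prints the agent command for each child balancer, using the same flags as the example above. The helper name child_agent_cmd, the second child's port (8089), and running everything on 127.0.0.1 are assumptions for illustration, not anything Paddler prescribes.

```shell
#!/bin/sh
# Parent Paddler management address (values taken from the example above).
PARENT_HOST="${PARENT_HOST:-127.0.0.1}"
PARENT_PORT="${PARENT_PORT:-8085}"

# Hypothetical helper: print the agent command that registers one child
# balancer (host, port) with the parent. It prints only; nothing is started.
child_agent_cmd() {
    child_host="$1"
    child_port="$2"
    printf './paddler agent --external-llamacpp-host %s --external-llamacpp-port %s --local-llamacpp-host %s --local-llamacpp-port %s --management-host %s --management-port %s\n' \
        "$child_host" "$child_port" "$child_host" "$child_port" \
        "$PARENT_HOST" "$PARENT_PORT"
}

# One agent per child balancer; both report to the same parent.
# Port 8089 for the second child is an assumed value.
child_agent_cmd 127.0.0.1 8088
child_agent_cmd 127.0.0.1 8089
```

Each child still runs its own agents against its half of the llama.cpp fleet; only these two extra agents talk to the parent.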
@mcharytoniuk Thank you. I am just concerned about the high availability issue.
No problem. In that case, I'd recommend using keepalived in front of Paddler, combined with a redundant Paddler instance on standby.
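For the failover decision itself, keepalived can run a check script and treat a non-zero exit status as a failure. Below is a minimal sketch of such a check, assuming the balancer's /health endpoint is reachable at 127.0.0.1:8088 as in the agent example earlier in this thread. The function name, the variable name, and the 2-second timeout are hypothetical choices, so adjust them to your setup.

```shell
#!/bin/sh
# Assumed address of the Paddler /health endpoint; override via the environment.
PADDLER_HEALTH_URL="${PADDLER_HEALTH_URL:-http://127.0.0.1:8088/health}"

# Returns 0 if the endpoint answers without an HTTP error, non-zero if the
# request fails, times out, or the server responds with a status >= 400.
paddler_is_healthy() {
    curl --fail --silent --max-time 2 --output /dev/null "$1"
}
```

Ending the script with paddler_is_healthy "$PADDLER_HEALTH_URL" makes that call's exit status the script's exit status, which is what a keepalived vrrp_script check looks at.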
The llama-server state is stored in the balancer's memory. It does not seem to work with multiple balancers.