Closed sfxworks closed 2 months ago
Current hack: mount the /tmp path of the llama RPC backend and adjust the container command.
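A rough sketch of what that hack looks like, in case it helps anyone else. The paths below are placeholders and are not confirmed in this thread; the idea is just to run the extracted llama.cpp rpc-server binary directly so the listen address can be set.

# Workaround sketch (paths are placeholders, adjust to your deployment):
# 1. mount a volume at /tmp so the backend assets LocalAI extracts there are reachable
# 2. override the container command to start the extracted llama.cpp rpc-server
#    directly, passing an explicit listen address and port
/path/to/extracted/rpc-server -H 0.0.0.0 -p 50052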
@sfxworks did you try passing -- in the args?
E.g. this works here:
./local-ai worker llama-cpp-rpc -- -H 1.1.1.1 -p 50052 -m 20
9:49AM INF env file found, loading environment variables from file envFile=.env
9:49AM INF Setting logging to info
create_backend: using CPU backend
Starting RPC server on 1.1.1.1:50052, backend memory: 20 MB
There is a bit of misalignment in how it works without P2P. My bad, as I neglected this piece since I usually run it with P2P. I think it would be better to make both commands consistent and then have:
./local-ai worker llama-cpp-rpc --llama-cpp-args="-H 1.1.1.1 -p 50052 -m 20"
I've opened https://github.com/mudler/LocalAI/pull/3428 to make it more consistent. Thanks for pointing it out, @sfxworks!
Gotcha, and np thank you!
LocalAI version: quay.io/go-skynet/local-ai:latest-aio-gpu-hipblas
Environment, CPU architecture, OS, and Version: k8s
Describe the bug
When running local-ai worker llama-cpp-rpc and trying to tell it to listen on all addresses, it fails.
To Reproduce
Expected behavior
Logs
It starts completely fine in the pod, but it only listens on localhost, which is useless here.
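For reference, a hedged sketch of the invocation that ends up binding to all addresses, based on the -- pass-through shown above (host, port, and memory values are just examples), plus a quick way to check the listener from inside the pod:

# Pass the llama.cpp RPC flags after -- so the worker binds to all interfaces
./local-ai worker llama-cpp-rpc -- -H 0.0.0.0 -p 50052 -m 20

# Confirm it listens on 0.0.0.0 rather than 127.0.0.1
# (ss may not be present in every image; netstat -tlnp is an alternative)
ss -tlnp | grep 50052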