j4ys0n opened 3 weeks ago
I'm using Docker Compose; here's the config: https://github.com/j4ys0n/local-ai-stack
It is related to edgevpn: it somehow sees peers and addresses but cannot connect to them.
I have tried NAT traversal using libp2p+pubsub, and I managed to get peer discovery working and establish a p2p connection via a rendezvous point.
In your case, if you know your worker's address, you can just put it into LocalAI's environment as a gRPC external backend address.
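As a rough sketch of that workaround, a compose fragment along these lines should work — the worker address `192.168.1.50:50052` is hypothetical (substitute your own), and this assumes the worker is a llama.cpp RPC server as in LocalAI's distributed-inference setup, which is configured through the `LLAMACPP_GRPC_SERVERS` environment variable:

```yaml
# Hypothetical compose fragment: point the main LocalAI instance at a
# known worker directly instead of relying on p2p discovery.
services:
  local-ai:
    image: localai/localai:latest-gpu-nvidia-cuda-12
    environment:
      # Comma-separated host:port list of llama.cpp RPC workers.
      # 192.168.1.50:50052 is an example address, not taken from this issue.
      - LLAMACPP_GRPC_SERVERS=192.168.1.50:50052
```

With an explicit address like this, the edgevpn/p2p discovery path is bypassed entirely, which can help isolate whether the bug is in discovery or in the inference path itself.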
LocalAI version:
localai/localai:latest-gpu-nvidia-cuda-12, v2.22.1 (015835dba2854572d50e167b7cade05af41ed214)
Environment, CPU architecture, OS, and Version:
Linux localai3 6.8.12-2-pve #1 SMP PREEMPT_DYNAMIC PMX 6.8.12-2 (2024-09-05T10:03Z) x86_64 GNU/Linux (Proxmox LXC, Debian). AMD EPYC 7302P (16 cores allocated) / 64 GB RAM
Describe the bug
When testing distributed inferencing, I select a model (Qwen 2.5 14B) and send a chat message. The model loads on both instances (main and worker), but then it never responds and unloads on the worker (watching with nvitop).
To Reproduce
The description above should reproduce the issue; I tried a few times.
Expected behavior
The model should not unload and the chat should complete.
Logs
worker logs
main logs
Additional context
This worked in the previous version, though I'm not sure which one that was at this point (~2 weeks ago). The model loads and works fine without the worker.