mudler / LocalAI

:robot: The free, Open Source OpenAI alternative. Self-hosted, community-driven and local-first. Drop-in replacement for OpenAI running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers and many more model architectures. It can generate text, audio, video, and images, and also has voice cloning capabilities.
https://localai.io
MIT License

[BUG] Watchdog not killing idle connections #2724

Closed by maxi1134 5 days ago

maxi1134 commented 1 week ago

LocalAI version: localai/localai:latest-aio-gpu-nvidia-cuda-12

Environment, CPU architecture, OS, and Version:

Linux machinelearning 6.8.0-35-generic #35-Ubuntu SMP PREEMPT_DYNAMIC Mon May 20 15:51:52 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

Describe the bug

LocalAI leaves Python and backend processes running after they are no longer used, often as duplicates.

| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A      2862      C   /opt/conda/bin/python                        3646MiB |
|    0   N/A  N/A      2922      C   /usr/bin/python                              2108MiB |
|    0   N/A  N/A   2445522      C   python                                       3256MiB |
|    0   N/A  N/A   2447407      C   python                                       3256MiB |
|    0   N/A  N/A   2448084      C   python                                       3256MiB |
|    0   N/A  N/A   2449253      C   .../backend-assets/grpc/llama-cpp-avx2        904MiB |
|    0   N/A  N/A   2451168      C   python                                       3218MiB |
|    0   N/A  N/A   2451598      C   .../backend-assets/grpc/llama-cpp-avx2        254MiB |
|    0   N/A  N/A   2451625      C   ...data/backend-assets/grpc/llama-ggml        266MiB |
|    0   N/A  N/A   2451647      C   ...kend-assets/grpc/llama-cpp-fallback        254MiB |
+-----------------------------------------------------------------------------------------+

Here, only the first two processes are unrelated to LocalAI, meaning 8 processes are running just for LocalAI.

Restarting the LocalAI Docker container removes them immediately.
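For anyone who wants to check the count on their own machine, here is a rough sketch that tallies the leftover processes from the table above. The function names and the `UNRELATED_PIDS` set are made up for illustration (the two PIDs are the first two rows, which are not LocalAI related), and the row pattern only assumes the column layout shown in this paste.

```python
import re
from collections import Counter

# Process rows copied from the nvidia-smi table above.
SMI_ROWS = """\
|    0   N/A  N/A      2862      C   /opt/conda/bin/python                        3646MiB |
|    0   N/A  N/A      2922      C   /usr/bin/python                              2108MiB |
|    0   N/A  N/A   2445522      C   python                                       3256MiB |
|    0   N/A  N/A   2447407      C   python                                       3256MiB |
|    0   N/A  N/A   2448084      C   python                                       3256MiB |
|    0   N/A  N/A   2449253      C   .../backend-assets/grpc/llama-cpp-avx2        904MiB |
|    0   N/A  N/A   2451168      C   python                                       3218MiB |
|    0   N/A  N/A   2451598      C   .../backend-assets/grpc/llama-cpp-avx2        254MiB |
|    0   N/A  N/A   2451625      C   ...data/backend-assets/grpc/llama-ggml        266MiB |
|    0   N/A  N/A   2451647      C   ...kend-assets/grpc/llama-cpp-fallback        254MiB |
"""

# Assumption: PIDs of the two processes that are not LocalAI related.
UNRELATED_PIDS = {2862, 2922}

# Columns: GPU, GI ID, CI ID, PID, Type, Process name, GPU memory in MiB.
ROW = re.compile(
    r"^\|\s+(\d+)\s+\S+\s+\S+\s+(\d+)\s+(\S+)\s+(\S+)\s+(\d+)MiB\s+\|$"
)


def parse_rows(text):
    """Return one dict per process row; lines that do not match are skipped."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            rows.append({
                "pid": int(match.group(2)),
                "name": match.group(4),
                "mib": int(match.group(5)),
            })
    return rows


def localai_summary(text, unrelated_pids):
    """Count remaining processes, total their GPU memory, and group by name."""
    rows = [r for r in parse_rows(text) if r["pid"] not in unrelated_pids]
    by_name = Counter(r["name"] for r in rows)
    return len(rows), sum(r["mib"] for r in rows), by_name


count, total_mib, by_name = localai_summary(SMI_ROWS, UNRELATED_PIDS)
print(count, "processes,", total_mib, "MiB")
for name, n in by_name.most_common():
    print(n, name)
```

Grouping by process name makes the duplicates easy to spot, since the same backend name shows up more than once.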

To Reproduce

Leave LocalAI running for a while and send it requests.

Expected behavior

Processes should be killed properly by the watchdog.

Logs


5:58PM INF Success ip=127.0.0.1 latency="29.921µs" method=GET status=200 url=/readyz
5:59PM DBG [WatchDog] Watchdog checks for busy connections
5:59PM DBG [WatchDog] Watchdog checks for idle connections
5:59PM DBG [WatchDog] 127.0.0.1:44591: idle connection
5:59PM DBG [WatchDog] Watchdog checks for busy connections
5:59PM DBG [WatchDog] Watchdog checks for idle connections
5:59PM DBG [WatchDog] 127.0.0.1:44591: idle connection
5:59PM INF Success ip=127.0.0.1 latency="35.641µs" method=GET status=200 url=/readyz
6:00PM DBG [WatchDog] Watchdog checks for busy connections
6:00PM DBG [WatchDog] Watchdog checks for idle connections
6:00PM DBG [WatchDog] 127.0.0.1:44591: idle connection
6:00PM DBG [WatchDog] Watchdog checks for busy connections
6:00PM DBG [WatchDog] Watchdog checks for idle connections
6:00PM DBG [WatchDog] 127.0.0.1:44591: idle connection
6:00PM INF Success ip=127.0.0.1 latency="30.91µs" method=GET status=200 url=/readyz
6:01PM DBG [WatchDog] Watchdog checks for busy connections
6:01PM DBG [WatchDog] Watchdog checks for idle connections
6:01PM DBG [WatchDog] 127.0.0.1:44591: idle connection
6:01PM DBG [WatchDog] Watchdog checks for busy connections
6:01PM DBG [WatchDog] Watchdog checks for idle connections
6:01PM DBG [WatchDog] 127.0.0.1:44591: idle connection
6:01PM INF Success ip=127.0.0.1 latency="31.711µs" method=GET status=200 url=/readyz
6:02PM DBG [WatchDog] Watchdog checks for busy connections
6:02PM DBG [WatchDog] Watchdog checks for idle connections
6:02PM DBG [WatchDog] 127.0.0.1:44591: idle connection
6:02PM DBG [WatchDog] Watchdog checks for busy connections
6:02PM DBG [WatchDog] Watchdog checks for idle connections
6:02PM DBG [WatchDog] 127.0.0.1:44591: idle connection
6:02PM INF Success ip=127.0.0.1 latency="30.87µs" method=GET status=200 url=/readyz
6:03PM DBG [WatchDog] Watchdog checks for busy connections
6:03PM DBG [WatchDog] Watchdog checks for idle connections
6:03PM DBG [WatchDog] 127.0.0.1:44591: idle connection
6:03PM DBG [WatchDog] Watchdog checks for busy connections
6:03PM DBG [WatchDog] Watchdog checks for idle connections
6:03PM DBG [WatchDog] 127.0.0.1:44591: idle connection
6:03PM INF Success ip=127.0.0.1 latency="32.571µs" method=GET status=200 url=/readyz
6:04PM DBG [WatchDog] Watchdog checks for busy connections
6:04PM DBG [WatchDog] Watchdog checks for idle connections
6:04PM DBG [WatchDog] 127.0.0.1:44591: idle connection
6:04PM DBG [WatchDog] Watchdog checks for busy connections
6:04PM DBG [WatchDog] Watchdog checks for idle connections
6:04PM DBG [WatchDog] 127.0.0.1:44591: idle connection
6:04PM INF Success ip=127.0.0.1 latency="30.941µs" method=GET status=200 url=/readyz
6:05PM DBG [WatchDog] Watchdog checks for busy connections
6:05PM DBG [WatchDog] Watchdog checks for idle connections
6:05PM DBG [WatchDog] 127.0.0.1:44591: idle connection
6:05PM DBG [WatchDog] Watchdog checks for busy connections
6:05PM DBG [WatchDog] Watchdog checks for idle connections
6:05PM DBG [WatchDog] 127.0.0.1:44591: idle connection
6:05PM INF Success ip=127.0.0.1 latency="31.911µs" method=GET status=200 url=/readyz
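To see how long the watchdog has been reporting an address as idle, something like the following could be run over the debug output. It is only a sketch: the helper names are invented here, the line pattern is taken from the log format above, and it assumes the log does not cross midnight (the timestamps only have minute resolution).

```python
import re
from datetime import timedelta

# A few lines in the same format as the log above.
LOG_EXCERPT = """\
5:59PM DBG [WatchDog] Watchdog checks for busy connections
5:59PM DBG [WatchDog] Watchdog checks for idle connections
5:59PM DBG [WatchDog] 127.0.0.1:44591: idle connection
6:02PM DBG [WatchDog] 127.0.0.1:44591: idle connection
6:05PM DBG [WatchDog] 127.0.0.1:44591: idle connection
"""

# Matches only the per-address "idle connection" lines.
IDLE_LINE = re.compile(
    r"^(\d{1,2}):(\d{2})(AM|PM) DBG \[WatchDog\] (\S+): idle connection$"
)


def to_minutes(hour, minute, meridiem):
    """Convert a 12-hour clock time to minutes since midnight."""
    return (hour % 12 + (12 if meridiem == "PM" else 0)) * 60 + minute


def observed_idle_spans(log_text):
    """Map each address to the time between its first and last idle report."""
    first, last = {}, {}
    for line in log_text.splitlines():
        match = IDLE_LINE.match(line.strip())
        if not match:
            continue
        stamp = to_minutes(int(match.group(1)), int(match.group(2)), match.group(3))
        address = match.group(4)
        first.setdefault(address, stamp)
        last[address] = stamp
    return {a: timedelta(minutes=last[a] - first[a]) for a in first}


for address, span in observed_idle_spans(LOG_EXCERPT).items():
    print(address, "reported idle over", span)
```

Note that the excerpt above only covers 5:59PM to 6:05PM, so on its own it shows about six minutes of idle reports, which is shorter than the 30m idle timeout in the .env below. A longer capture would be needed to confirm from the logs alone that the timeout was exceeded.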

Additional context

This is my .env:


## Enable/Disable single backend (useful if only one GPU is available)
SINGLE_ACTIVE_BACKEND=true
### Watchdog settings
###
# Enables watchdog to kill backends that are inactive for too much time
LOCALAI_WATCHDOG_IDLE=true
#
# Time in duration format (e.g. 1h30m) after which a backend is considered idle
LOCALAI_WATCHDOG_IDLE_TIMEOUT=30m

# Enables watchdog to kill backends that are busy for too much time
LOCALAI_WATCHDOG_BUSY=true
#
# Time in duration format (e.g. 1h30m) after which a backend is considered busy
#
LOCALAI_WATCHDOG_BUSY_TIMEOUT=30m
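
As a sanity check on what these settings are expected to do, here is a small sketch of the idle decision. It is not LocalAI's actual implementation: `parse_duration` and `should_kill_idle` are hypothetical helpers, the parser only handles whole-number `h`/`m`/`s` parts (a simplified subset of the duration format mentioned in the comments), and using `>=` for the comparison is an assumption.

```python
import re
from datetime import timedelta

# Settings copied from the .env above.
SETTINGS = {
    "LOCALAI_WATCHDOG_IDLE": "true",
    "LOCALAI_WATCHDOG_IDLE_TIMEOUT": "30m",
    "LOCALAI_WATCHDOG_BUSY": "true",
    "LOCALAI_WATCHDOG_BUSY_TIMEOUT": "30m",
}

_UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1}
_PART = re.compile(r"(\d+)([hms])")


def parse_duration(text):
    """Parse strings such as '30m' or '1h30m' (simplified subset only)."""
    position = 0
    seconds = 0
    for match in _PART.finditer(text):
        if match.start() != position:
            break  # unexpected characters between parts
        seconds += int(match.group(1)) * _UNIT_SECONDS[match.group(2)]
        position = match.end()
    if position == 0 or position != len(text):
        raise ValueError(f"unsupported duration: {text!r}")
    return timedelta(seconds=seconds)


def should_kill_idle(idle_for, settings):
    """Hypothetical check: is the idle watchdog on and the timeout reached?"""
    if settings.get("LOCALAI_WATCHDOG_IDLE") != "true":
        return False
    timeout = parse_duration(settings["LOCALAI_WATCHDOG_IDLE_TIMEOUT"])
    return idle_for >= timeout  # assumption: inclusive comparison


print(should_kill_idle(timedelta(minutes=6), SETTINGS))
print(should_kill_idle(timedelta(minutes=45), SETTINGS))
```

With the values above, a backend that has been idle for less than 30 minutes would be left alone, and one idle for longer would be expected to be killed.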

a17t commented 1 week ago

Possibly related to and fixed by #2720

maxi1134 commented 5 days ago

Seems to work!