Large Model Proxy is designed to make it easy to run multiple resource-heavy Large Models (LMs) on the same machine with a limited amount of VRAM or other resources. It listens on a dedicated port for each proxied LM, making them always available to clients connecting to those ports.
GNU General Public License v2.0
A service starved for resources can start while the proxy is stopping #10
After an interrupt signal is received, a service that previously had no resources to start can get started once resources are freed by the interrupt handling.
The proxy should instead close connections for services that haven't started yet.
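One way to sketch the proposed fix: guard service startup with a shutdown flag, so that resources freed during shutdown can never be handed to a service that was still waiting for them. This is a minimal illustration, not the proxy's actual code; the `manager` type and method names are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// manager is a hypothetical stand-in for the proxy's resource manager.
type manager struct {
	mu           sync.Mutex
	shuttingDown bool
}

// tryStartService reports whether the service may start. During shutdown
// it returns false, signalling the caller to close the pending client
// connection instead of starting the service.
func (m *manager) tryStartService(name string) bool {
	m.mu.Lock()
	defer m.mu.Unlock()
	if m.shuttingDown {
		return false // refuse: proxy is stopping
	}
	// ...allocate resources and start the service here...
	return true
}

// beginShutdown marks the proxy as stopping before any resources are
// freed, closing the window in which a starved service could start.
func (m *manager) beginShutdown() {
	m.mu.Lock()
	m.shuttingDown = true
	m.mu.Unlock()
}

func main() {
	m := &manager{}
	fmt.Println(m.tryStartService("llm-a")) // normal operation: start allowed
	m.beginShutdown()
	fmt.Println(m.tryStartService("llm-b")) // during shutdown: start refused
}
```

The key ordering is that `beginShutdown` sets the flag before resources are released, so a waiting service observes the flag and its connection is closed rather than served.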