Objective: allow users to deploy and keep multiple model endpoints alive so that they can easily test and benchmark different models. (Currently, deploying the OpenAI-compatible API server kills the previous deployment.)
Changes:
- Add a `--new_deployment` option to `serve.py`.
- When this option is enabled, the application name and route prefix use the `name` defined in the deployment config instead of `router` and `/`.
- For the OpenAI-compatible API deployment, the endpoint URL then becomes `http://localhost:8000/{name}/v1` instead of `http://localhost:8000/v1`.
- This enables deploying a new endpoint without overriding the previous deployment.
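A minimal sketch of the naming logic described above. The function and variable names here are assumptions for illustration, not the actual `serve.py` code:

```python
# Hypothetical helper showing how serve.py might choose the Serve
# application name and route prefix based on the --new_deployment flag.
def resolve_app_params(deployment_name: str, new_deployment: bool):
    if new_deployment:
        # Keep prior deployments alive: each endpoint gets a unique
        # application name and route prefix derived from its config name,
        # so the OpenAI-compatible URL becomes /{name}/v1.
        return deployment_name, f"/{deployment_name}"
    # Default behavior: a single application named "router" served at "/",
    # which replaces any previously running deployment.
    return "router", "/"
```

For example, `resolve_app_params("llama3", True)` would yield the app name `llama3` routed at `/llama3`, while with the flag off every deployment maps to `router` at `/`.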