Hello team, we want to run LitServe on a single machine. The catch is that we want to load models from S3, with the model file path and model name read from a config file. Is there a way to enable such a model registry, so that a model can be registered for inference without hardcoding its path in the inference class itself?
Hi @AbhishekBose, you can read the config file and load the model in the setup method:
import litserve as ls

class CustomAPI(ls.LitAPI):
    def setup(self, device):
        model_path = read_config()
        self.model = load_model(model_path)
    ...
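Since you mention loading from S3, here is a minimal sketch of what read_config and load_model could look like. The config format, the helper names, and the boto3/torch calls are illustrative assumptions, not LitServe APIs:

import json

import boto3
import torch

def read_config(path="config.json"):
    # Hypothetical config file: {"model_path": "s3://my-models/resnet/v1/model.pt"}
    with open(path) as f:
        return json.load(f)["model_path"]

def load_model(model_path):
    # Split the s3://bucket/key URI, download the weights locally, then load them.
    bucket, _, key = model_path.removeprefix("s3://").partition("/")
    local_path = "/tmp/model.pt"
    boto3.client("s3").download_file(bucket, key, local_path)
    return torch.load(local_path, map_location="cpu")

Because setup runs once per worker at startup, the S3 download happens when the server boots rather than on every request.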
Please let me know if this answers your question.
@aniketmaurya What if I have to register a new model at runtime? Is that possible, or do I have to restart the server every time I onboard a new model or a new version of an existing one? I'm asking in the context of TorchServe, where you can register models and their versions, with a specified number of workers, through a register-model API at runtime.
@AbhishekBose LitServe currently doesn't support updating a model without interrupting the runtime.
Trying to understand the use case in a real-world production scenario: to update a deployed model, people generally use an orchestrator like Kubernetes, or you can serve on Lightning, which takes care of this for you.
It would be really helpful if you could elaborate on how you serve the model.
@aniketmaurya Currently we serve Pythonic workflows as a sidecar application to our main ML service platform. In that setup it becomes difficult to redeploy every time there's a change. We were wondering if it would be possible to register the serving class at runtime itself. That would make model deployment completely self-serve for the data scientists concerned: they could test the model on their local machine and then push it to the server.
@aniketmaurya it is quite simple in PyTorch to unload a model: https://github.com/oobabooga/text-generation-webui/blob/d1af7a41ade7bd3c3a463bfa640725edb818ebaf/modules/models.py#L391
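For reference, the pattern in that link amounts to dropping the reference and clearing the CUDA cache. A minimal generic version, assuming a PyTorch model on a GPU held by the API instance:

import gc

import torch

def unload_model(api):
    # Drop the reference so Python can garbage-collect the weights,
    # then release the cached GPU memory back to the driver.
    api.model = None
    gc.collect()
    torch.cuda.empty_cache()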
@AbhishekBose you can use a callback to detect a file change and reload the model.
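A minimal sketch of that idea, reusing the hypothetical read_config/load_model helpers from above. The mtime-polling thread is just one way to detect the change and is not a built-in LitServe feature:

import os
import threading
import time

import litserve as ls

class ReloadingAPI(ls.LitAPI):
    def setup(self, device):
        self.config_path = "config.json"
        self._mtime = os.path.getmtime(self.config_path)
        self.model = load_model(read_config())
        # Poll the config file in the background; daemon=True so the thread
        # never blocks worker shutdown.
        threading.Thread(target=self._watch, daemon=True).start()

    def _watch(self, interval=5.0):
        while True:
            time.sleep(interval)
            mtime = os.path.getmtime(self.config_path)
            if mtime != self._mtime:
                self._mtime = mtime
                # Swap in the new model; in-flight requests keep the old reference.
                self.model = load_model(read_config())

    # decode_request / predict / encode_response omitted for brevity.

Assigning to self.model swaps the reference atomically, so in-flight requests finish on the old model while new requests pick up the new one.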
@cyberluke LitServe is a generic serving framework; it is not limited to a particular ML library like PyTorch.