kserve / modelmesh-serving

Controller for ModelMesh
Apache License 2.0

The Python-Based Custom Runtime with MLServer cannot deploy a model stored on a Persistent Volume Claim #494

Closed zhlsunshine closed 3 months ago

zhlsunshine commented 3 months ago

Describe the bug
I store the trained model (mnist-svm.joblib in my case) on a PVC, and I have some extra logic to run on the trained model after it is loaded. Therefore, I need to write a custom ServingRuntime to handle it.

To Reproduce
It works well when I follow the doc Deploy a model stored on a Persistent Volume Claim: I can see the model file mnist-svm.joblib and model-settings.json under the folder /models/_mlserver_models/, as shown below: (screenshot attached)

However, since I want to write a custom ServingRuntime, I went on to follow the doc Python-Based Custom Runtime with MLServer, created a new ServingRuntime, and created an InferenceService for it. After all these steps everything is okay, except that the InferenceService never becomes True (Ready) because of a "NOT_FOUND" error in the inference service, as shown below: (screenshot attached)
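One common cause of a "NOT_FOUND" error with MLServer is a model-settings.json whose name or implementation does not match what the custom runtime and InferenceService expect. A hedged example of what such a file might look like for this case (the implementation class path is hypothetical, and the name would need to match the model name the InferenceService registers):

```json
{
  "name": "mnist-svm",
  "implementation": "custom_runtime.CustomJoblibRuntime",
  "parameters": {
    "uri": "./mnist-svm.joblib"
  }
}
```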

Based on this comparison, I think something goes wrong in the Python-Based Custom Runtime with MLServer when a Persistent Volume Claim is used to store the trained model.

Expected behavior

I hope there can be an explicit demo showing how to use a Python-based custom runtime with MLServer when the model is stored on a Persistent Volume Claim. Thanks a lot if that's possible!