kserve / kserve

Standardized Serverless ML Inference Platform on Kubernetes
https://kserve.github.io/website/
Apache License 2.0

model with name <inference service name> does not exist. #3682

Open VikasAbhishek opened 4 months ago

VikasAbhishek commented 4 months ago

/kind bug

What steps did you take and what happened: I ran an InferenceService for a custom XGBoost model that I trained and saved with a .joblib extension, using the PVC storage option. I followed this guide: https://kserve.github.io/website/master/modelserving/storage/pvc/pvc/

I used port-forwarding and the node port to get the ingress host and port. The InferenceService and its pods are running fine, but after running the curl command I get this error:

What did you expect to happen: I expected the curl command to return predictions.

What's the InferenceService yaml: [To help us debug please run kubectl get isvc $name -n $namespace -o yaml and paste the output]

apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"serving.kserve.io/v1beta1","kind":"InferenceService","metadata":{"annotations":{},"name":"xgboost-pvc","namespace":"default"},"spec":{"predictor":{"xgboost":{"storageUri":"pvc://task-pv-claim/model/S1-B2-C1_xgboost.joblib"}}}}
  creationTimestamp: "2024-05-10T11:16:48Z"
  finalizers:

Anything else you would like to add: [Miscellaneous information that will assist in solving the issue.] When I checked the logs of the xgboost predictor pod after running the curl command, I saw this:

2024-05-13 05:16:25.231 1 kserve ERROR [model_not_found_handler():113] Exception:
Traceback (most recent call last):
  File "/prod_venv/lib/python3.9/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    await app(scope, receive, sender)
  File "/prod_venv/lib/python3.9/site-packages/starlette/routing.py", line 74, in app
    response = await func(request)
  File "/prod_venv/lib/python3.9/site-packages/fastapi/routing.py", line 299, in app
    raise e
  File "/prod_venv/lib/python3.9/site-packages/fastapi/routing.py", line 294, in app
    raw_response = await run_endpoint_function(
  File "/prod_venv/lib/python3.9/site-packages/fastapi/routing.py", line 191, in run_endpoint_function
    return await dependant.call(**values)
  File "/kserve/kserve/protocol/rest/v1_endpoints.py", line 67, in predict
    model_ready = self.dataplane.model_ready(model_name)
  File "/kserve/kserve/protocol/dataplane.py", line 213, in model_ready
    raise ModelNotFound(model_name)
kserve.errors.ModelNotFound: Model with name xgboost-pvc does not exist.
2024-05-13 05:16:25.232 uvicorn.access INFO: 10.244.2.5:0 1 - "POST /v1/models/xgboost-pvc%3Apredict HTTP/1.1" 404 Not Found
2024-05-13 05:16:25.233 kserve.trace kserve.io.kserve.protocol.rest.v1_endpoints.predict: 0.0017123222351074219
2024-05-13 05:16:25.233 kserve.trace kserve.io.kserve.protocol.rest.v1_endpoints.predict: 0.0017050000001290755
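
For context, the request in the log above corresponds to a v1 predict call of this shape (a sketch following the KServe v1 protocol docs; input.json is a placeholder payload):

# Predict against the v1 data plane; the isvc name "xgboost-pvc" comes from this issue
curl -v -H "Host: ${SERVICE_HOSTNAME}" -H "Content-Type: application/json" \
  "http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/xgboost-pvc:predict" \
  -d @./input.json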

Environment:

sivanantha321 commented 4 months ago

@VikasAbhishek Can you post the response of http://${Host}:${Port}/v1/models
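
A minimal way to form that request, following the KServe docs (a sketch; it assumes the isvc name xgboost-pvc and the default namespace from this issue):

# Resolve the virtual host the ingress gateway routes on
SERVICE_HOSTNAME=$(kubectl get inferenceservice xgboost-pvc -n default -o jsonpath='{.status.url}' | cut -d "/" -f 3)

# List the models the server has loaded; the Host header is required when
# going through the ingress gateway rather than hitting the pod directly
curl -v -H "Host: ${SERVICE_HOSTNAME}" "http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models"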

sivanantha321 commented 4 months ago

Have you added Host header ?

VikasAbhishek commented 4 months ago

No. I ran the command: curl -v http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models


VikasAbhishek commented 4 months ago

After using the Host header:

curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models
* Trying 127.0.0.1:8080...


sivanantha321 commented 4 months ago

@VikasAbhishek The response is not included in your comment. In any case, that endpoint lets you verify that the model is ready and see the model name; try using that name for inference. If the response is empty, the model is likely not loaded. In that case, please verify the model server logs.

VikasAbhishek commented 4 months ago

I checked with the curl command below; my model is not showing up under /v1/models the way the sklearn-iris example does.

curl -v -H "Host: ${SERVICE_HOSTNAME}" -H "Content-Type: application/json" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models

Can you guide me on how to get my model loaded and listed under /v1/models when using the pvc:// storageUri method?
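
For reference, the PVC guide linked earlier stages the model file onto the volume with a helper pod before creating the InferenceService; a sketch of that flow (the pod name model-store-pod, container name model-store, and mount path /pv follow that guide; the file name comes from this issue's storageUri):

# Copy the trained model onto the PVC via a pod that mounts the claim
kubectl exec -it model-store-pod -- mkdir -p /pv/model
kubectl cp S1-B2-C1_xgboost.joblib model-store-pod:/pv/model/S1-B2-C1_xgboost.joblib -c model-store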


sivanantha321 commented 4 months ago

@VikasAbhishek As mentioned earlier, the model is not loaded. Please verify the model server logs and the storage-initializer logs.
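
A sketch of those checks (the pod name is a placeholder; in a KServe predictor pod the runtime runs in kserve-container and the model download runs in the storage-initializer init container):

# Runtime (model server) logs
kubectl logs <xgboost-pvc-predictor-pod> -c kserve-container

# Storage initializer logs: did the model actually get pulled from the PVC?
kubectl logs <xgboost-pvc-predictor-pod> -c storage-initializer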

fschlz commented 4 months ago

I have a similar issue using an InferenceService on Azure AKS with an MLflow-tracked model.

This is the structure of my model directory:

└── model
    ├── MLmodel
    ├── conda.yaml
    ├── metadata
    │   ├── MLmodel
    │   ├── conda.yaml
    │   ├── python_env.yaml
    │   └── requirements.txt
    ├── model-settings.json
    ├── model.pkl
    ├── python_env.yaml
    └── requirements.txt

I followed the Azure guide and the MLflow guide but cannot seem to get the InferenceService to deploy correctly.

This is my model.yml

apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "wine-classifier"
  namespace: "mlflow-kserve-test"
spec:
  predictor:
    serviceAccountName: sa
    model:
      modelFormat:
        name: mlflow
      protocolVersion: v2
      storageUri: "https://{SA}.blob.core.windows.net/azureml/ExperimentRun/dcid.{RUNID}/model"

I tried using storageUri: "https://{SA}.blob.core.windows.net/azureml/ExperimentRun/dcid.{RUNID}/model/model.pkl", but then the service doesn't start properly because it cannot build the environment.

Here are the log outputs from the kserve-container:

Environment tarball not found at '/mnt/models/environment.tar.gz'
Environment not found at './envs/environment'
2024-06-05 14:36:17,236 [mlserver.parallel] DEBUG - Starting response processing loop...
2024-06-05 14:36:17,238 [mlserver.rest] INFO - HTTP server running on http://0.0.0.0:8080
INFO:     Started server process [1]
INFO:     Waiting for application startup.
2024-06-05 14:36:17,267 [mlserver.metrics] INFO - Metrics server running on http://0.0.0.0:8082
2024-06-05 14:36:17,267 [mlserver.metrics] INFO - Prometheus scraping endpoint can be accessed on http://0.0.0.0:8082/metrics
INFO:     Started server process [1]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
2024-06-05 14:36:18,568 [mlserver.grpc] INFO - gRPC server running on http://0.0.0.0:9000
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:8080 (Press CTRL+C to quit)
INFO:     Uvicorn running on http://0.0.0.0:8082 (Press CTRL+C to quit)
2024/06/05 14:36:19 WARNING mlflow.pyfunc: Detected one or more mismatches between the model's dependencies and the current Python environment:
- mlflow (current: 2.3.1, required: mlflow==2.12.2)
- cloudpickle (current: 2.2.1, required: cloudpickle==3.0.0)
- numpy (current: 1.23.5, required: numpy==1.24.4)
- packaging (current: 23.1, required: packaging==23.2)
- psutil (current: uninstalled, required: psutil==5.9.8)
- pyyaml (current: 6.0, required: pyyaml==6.0.1)
- scikit-learn (current: 1.2.2, required: scikit-learn==1.3.2)
- scipy (current: 1.9.1, required: scipy==1.10.1)
To fix the mismatches, call `mlflow.pyfunc.get_model_dependencies(model_uri)` to fetch the model's environment and install dependencies using the resulting environment file.
2024-06-05 14:36:19,791 [mlserver] INFO - Couldn't load model 'wine-classifier'. Model will be removed from registry.
2024-06-05 14:36:19,791 [mlserver.parallel] ERROR - An error occurred processing a model update of type 'Load'.
Traceback (most recent call last):
  File "/opt/conda/lib/python3.8/site-packages/mlserver/parallel/worker.py", line 158, in _process_model_update
    await self._model_registry.load(model_settings)
  File "/opt/conda/lib/python3.8/site-packages/mlserver/registry.py", line 293, in load
    return await self._models[model_settings.name].load(model_settings)
  File "/opt/conda/lib/python3.8/site-packages/mlserver/registry.py", line 148, in load
    await self._load_model(new_model)
  File "/opt/conda/lib/python3.8/site-packages/mlserver/registry.py", line 165, in _load_model
    model.ready = await model.load()
  File "/opt/conda/lib/python3.8/site-packages/mlserver_mlflow/runtime.py", line 155, in load
    self._model = mlflow.pyfunc.load_model(model_uri)
  File "/opt/conda/lib/python3.8/site-packages/mlflow/pyfunc/__init__.py", line 582, in load_model
    model_meta = Model.load(os.path.join(local_path, MLMODEL_FILE_NAME))
  File "/opt/conda/lib/python3.8/site-packages/mlflow/models/model.py", line 468, in load
    return cls.from_dict(yaml.safe_load(f.read()))
  File "/opt/conda/lib/python3.8/site-packages/mlflow/models/model.py", line 478, in from_dict
    model_dict["signature"] = ModelSignature.from_dict(model_dict["signature"])
  File "/opt/conda/lib/python3.8/site-packages/mlflow/models/signature.py", line 83, in from_dict
    inputs = Schema.from_json(signature_dict["inputs"])
  File "/opt/conda/lib/python3.8/site-packages/mlflow/types/schema.py", line 360, in from_json
    return cls([read_input(x) for x in json.loads(json_str)])
  File "/opt/conda/lib/python3.8/site-packages/mlflow/types/schema.py", line 360, in <listcomp>
    return cls([read_input(x) for x in json.loads(json_str)])
  File "/opt/conda/lib/python3.8/site-packages/mlflow/types/schema.py", line 358, in read_input
    return TensorSpec.from_json_dict(**x) if x["type"] == "tensor" else ColSpec(**x)
TypeError: __init__() got an unexpected keyword argument 'required'
2024-06-05 14:36:19,793 [mlserver] INFO - Couldn't load model 'wine-classifier'. Model will be removed from registry.
2024-06-05 14:36:19,795 [mlserver.parallel] ERROR - An error occurred processing a model update of type 'Unload'.
Traceback (most recent call last):
  File "/opt/conda/lib/python3.8/site-packages/mlserver/parallel/worker.py", line 160, in _process_model_update
    await self._model_registry.unload_version(
  File "/opt/conda/lib/python3.8/site-packages/mlserver/registry.py", line 302, in unload_version
    await model_registry.unload_version(version)
  File "/opt/conda/lib/python3.8/site-packages/mlserver/registry.py", line 201, in unload_version
    model = await self.get_model(version)
  File "/opt/conda/lib/python3.8/site-packages/mlserver/registry.py", line 237, in get_model
    raise ModelNotFound(self._name, version)
mlserver.errors.ModelNotFound: Model wine-classifier not found
2024-06-05 14:36:19,796 [mlserver] ERROR - Some of the models failed to load during startup!
Traceback (most recent call last):
  File "/opt/conda/lib/python3.8/site-packages/mlserver/server.py", line 125, in start
    await asyncio.gather(
  File "/opt/conda/lib/python3.8/site-packages/mlserver/registry.py", line 293, in load
    return await self._models[model_settings.name].load(model_settings)
  File "/opt/conda/lib/python3.8/site-packages/mlserver/registry.py", line 148, in load
    await self._load_model(new_model)
  File "/opt/conda/lib/python3.8/site-packages/mlserver/registry.py", line 161, in _load_model
    model = await callback(model)
  File "/opt/conda/lib/python3.8/site-packages/mlserver/parallel/registry.py", line 152, in load_model
    loaded = await pool.load_model(model)
  File "/opt/conda/lib/python3.8/site-packages/mlserver/parallel/pool.py", line 74, in load_model
    await self._dispatcher.dispatch_update(load_message)
  File "/opt/conda/lib/python3.8/site-packages/mlserver/parallel/dispatcher.py", line 123, in dispatch_update
    return await asyncio.gather(
  File "/opt/conda/lib/python3.8/site-packages/mlserver/parallel/dispatcher.py", line 138, in _dispatch_update
    return await self._dispatch(worker_update)
  File "/opt/conda/lib/python3.8/site-packages/mlserver/parallel/dispatcher.py", line 146, in _dispatch
    return await self._wait_response(internal_id)
  File "/opt/conda/lib/python3.8/site-packages/mlserver/parallel/dispatcher.py", line 152, in _wait_response
    inference_response = await async_response
mlserver.parallel.errors.WorkerError: builtins.TypeError: __init__() got an unexpected keyword argument 'required'
2024-06-05 14:36:19,796 [mlserver.parallel] INFO - Waiting for shutdown of default inference pool...
2024-06-05 14:36:19,997 [mlserver.parallel] INFO - Shutdown of default inference pool complete
2024-06-05 14:36:19,997 [mlserver.grpc] INFO - Waiting for gRPC server shutdown
2024-06-05 14:36:20,001 [mlserver.grpc] INFO - gRPC server shutdown complete
INFO:     Shutting down
INFO:     Shutting down
INFO:     Waiting for application shutdown.
INFO:     Waiting for application shutdown.
INFO:     Application shutdown complete.
INFO:     Finished server process [1]
INFO:     Application shutdown complete.
INFO:     Finished server process [1]

It manages to create the environment, but cannot load the model.

Calling mlflow.pyfunc.load_model(model_uri) locally loads the model, and testing with mlserver start . was also successful.

Deploying the tracked model to a real-time endpoint in Azure Machine Learning also works, but I need an alternative for deploying on-prem.

Any help would be much appreciated.

dr3s commented 2 months ago

I have a similar issue. I'm pretty sure it's due to the model's dependencies not being installed, but the error is less than helpful. In my case I'm using a private PyPI index, so I suspect that part isn't working.

jagveers commented 3 weeks ago

@fschlz I am also facing the same issue. Were you able to figure this out?

This is my log:

Environment tarball not found at '/mnt/models/environment.tar.gz'
Environment not found at './envs/environment'
2024-09-11 20:21:03,209 [mlserver.parallel] DEBUG - Starting response processing loop...
2024-09-11 20:21:03,211 [mlserver.rest] INFO - HTTP server running on http://0.0.0.0:8080
INFO:     Started server process [1]
INFO:     Waiting for application startup.
2024-09-11 20:21:03,238 [mlserver.metrics] INFO - Metrics server running on http://0.0.0.0:8082
2024-09-11 20:21:03,291 [mlserver.metrics] INFO - Prometheus scraping endpoint can be accessed on http://0.0.0.0:8082/metrics
INFO:     Started server process [1]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
2024-09-11 20:21:05,197 [mlserver.grpc] INFO - gRPC server running on http://0.0.0.0:9000
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:8080 (Press CTRL+C to quit)
INFO:     Uvicorn running on http://0.0.0.0:8082 (Press CTRL+C to quit)
2024/09/11 20:21:06 WARNING mlflow.pyfunc: Detected one or more mismatches between the model's dependencies and the current Python environment:
 - mlflow (current: 2.3.1, required: mlflow==2.14.2)
 - accelerate (current: uninstalled, required: accelerate==0.32.1)
 - autonomize-model-sdk (current: uninstalled, required: autonomize-model-sdk==0.1.1.dev3)
 - click (current: 8.1.3, required: click==8.1.7)
 - cloudpickle (current: 2.2.1, required: cloudpickle==3.0.0)
 - datasets (current: 2.12.0, required: datasets==2.20.0)
 - dateparser (current: uninstalled, required: dateparser==1.2.0)
 - httpx (current: uninstalled, required: httpx==0.27.2)
 - nlpreprocessing (current: uninstalled, required: nlpreprocessing==1.0.1)
 - numpy (current: 1.23.5, required: numpy==1.26.4)
 - packaging (current: 23.1, required: packaging==24.1)
 - pandas (current: 2.0.1, required: pandas==2.2.2)
 - protobuf (current: 3.20.3, required: protobuf==4.25.4)
 - pyarrow (current: 11.0.0, required: pyarrow==15.0.2)
 - python-dotenv (current: 1.0.0, required: python-dotenv==1.0.1)
 - pytz (current: 2023.3, required: pytz==2024.1)
 - pyyaml (current: 6.0, required: pyyaml==6.0.2)
 - requests (current: 2.28.2, required: requests==2.32.3)
 - scikit-learn (current: 1.2.2, required: scikit-learn==1.5.1)
 - scipy (current: 1.9.1, required: scipy==1.14.1)
 - transformers (current: 4.28.1, required: transformers==4.42.3)
To fix the mismatches, call `mlflow.pyfunc.get_model_dependencies(model_uri)` to fetch the model's environment and install dependencies using the resulting environment file.
2024/09/11 20:21:06 WARNING mlflow.pyfunc: The version of Python that the model was saved in, `Python 3.11.9`, differs from the version of Python that is currently running, `Python 3.8.16`, and may be incompatible
2024/09/11 20:21:06 WARNING mlflow.pyfunc: The version of CloudPickle that was used to save the model, `CloudPickle 3.0.0`, differs from the version of CloudPickle that is currently running, `CloudPickle 2.2.1`, and may be incompatible
2024-09-11 20:21:06,985 [mlserver] INFO - Couldn't load model 'cllm-v2'. Model will be removed from registry.
2024-09-11 20:21:06,985 [mlserver.parallel] ERROR - An error occurred processing a model update of type 'Load'. 

Here is my InferenceService:

apiVersion: "serving.kserve.io/v1beta1" kind: "InferenceService" metadata: name: "cllm" namespace: "modelhub" labels: azure.workload.identity/use: "true" spec: predictor: model: modelFormat: name: mlflow protocolVersion: v2 storageUri: "https://autonomizestorageaccount.blob.core.windows.net/mlflow/26/596c868bc1b94b2597750afb425dfbc4/artifacts/cllm_v2"

I am able to run the model locally using:

mlflow.pyfunc.load_model(model_uri)

sivanantha321 commented 3 weeks ago

@jagveers @fschlz This is due to a dependency mismatch between the serving environment and your model. It is not related to KServe.
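
As a concrete starting point, the warning in the logs above already names the helper; a minimal sketch of fetching the model's pinned dependencies with MLflow (<model_uri> is a placeholder for the URI you load elsewhere):

import mlflow.pyfunc

# Resolves the dependency file the model was logged with (a requirements.txt
# path by default); installing those exact pins, on a matching Python version,
# is what removes the mismatch warnings shown in the logs.
deps = mlflow.pyfunc.get_model_dependencies("<model_uri>")  # placeholder URI
print(deps)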

dblane-digicatapult commented 2 weeks ago

I have a similar issue. I was led to believe that KServe would install the dependencies if we supplied the conda.yaml, python_env.yaml, or requirements.txt. @sivanantha321, are you saying this is not the case, and that the only way to do this with KServe is to either (a) bundle the deps into an environment.tar.gz using mlflow/conda, or (b) supply our own model-serving container with the deps already bundled?