pytorch / serve

Serve, optimize and scale PyTorch models in production
https://pytorch.org/serve/

KServe wrapper default configuration is faulty #2956

Closed sgaist closed 9 months ago

sgaist commented 9 months ago

🐛 Describe the bug

The TorchserveModel class is not initialized correctly when instantiated from __main__.py.

The class initializes its protocol property with whatever is passed in as a parameter.

The issue lies in the handling of the PROTOCOL_VERSION environment variable: when it is not set, its value is None, and the protocol property ends up being None as well. In the base class implementation, if no value is passed, the default is v1.

This becomes an issue when the response is generated, because the protocol version is checked at that point.
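
The failure can be reproduced in isolation (a minimal sketch; only PredictorProtocol from the kserve package is involved):

from kserve.model import PredictorProtocol

# The wrapper stores None as the protocol when PROTOCOL_VERSION is unset,
# and the enum lookup at response time rejects it:
PredictorProtocol(None)  # ValueError: None is not a valid PredictorProtocol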

Error logs

File "/path/to/miniconda3/envs/pytorch-serve-dev/lib/python3.9/site-packages/kserve/model.py", line 121, in __call__
    response = (await self.predict(payload, headers)) if inspect.iscoroutinefunction(self.predict) \
  File "/path/to/miniconda3/envs/pytorch-serve-dev/lib/python3.9/site-packages/kserve/model.py", line 297, in predict
    return InferResponse.from_rest(self.name, res) if is_v2(PredictorProtocol(self.protocol)) else res
  File "/path/to/miniconda3/envs/pytorch-serve-dev/lib/python3.9/enum.py", line 384, in __call__
    return cls.__new__(cls, value)
  File "/path/to/miniconda3/envs/pytorch-serve-dev/lib/python3.9/enum.py", line 702, in __new__
    raise ve_exc
ValueError: None is not a valid PredictorProtocol

Installation instructions

The installation procedure does not matter: the issue manifests itself both in the Docker container and with a source or package installation.

Model Packaging

from abc import ABC
from ts.torch_handler.base_handler import BaseHandler

class TestTBHandler(BaseHandler, ABC):
    """Minimal handler that skips model loading and returns a canned answer."""

    def __init__(self):
        super().__init__()

    def initialize(self, context):
        # No model to load; just mark the handler as ready.
        self.initialized = True

    def preprocess(self, requests):
        print("preprocess", requests)
        return []

    def inference(self, samples, *args, **kwargs):
        # Return a fixed payload regardless of the input.
        return [{"test": 42}]

    def postprocess(self, data):
        print("postprocess", data)
        return [data]
torch-model-archiver --model-name test --version 1.0 --export-path model-store --handler handler.py -f

config.properties

inference_address=http://127.0.0.1:8080
management_address=http://127.0.0.1:8081
metrics_address=http://127.0.0.1:8082
install_py_dep_per_model=false
number_of_netty_threads=1
job_queue_size=1
service_envelope=kservev2
model_store=./model-store
load_models=test=test.mar
models={\
  "test": {\
    "1.0": {\
        "defaultVersion": true,\
        "marName": "test.mar",\
        "minWorkers": 1,\
        "maxWorkers": 5,\
        "batchSize": 1,\
        "maxBatchDelay": 100,\
        "responseTimeout": 120\
    }\
  }\
}

Versions


Environment headers

Torchserve branch: HEAD

Warning: torchserve not installed ..
Warning: torch-model-archiver not installed ..

Python version: 3.9 (64-bit runtime)
Python executable: ~/miniconda3/envs/pytorchserve-dev/bin/python

Versions of relevant python libraries:
numpy==1.26.4
psutil==5.9.8
requests==2.31.0
requests-oauthlib==1.3.1
wheel==0.41.2
Warning: torch not present ..
Warning: torchtext not present ..
Warning: torchvision not present ..
Warning: torchaudio not present ..

Java Version:

OS: Mac OSX 13.6.4 (arm64)
GCC version: N/A
Clang version: 15.0.0 (clang-1500.1.0.2.5)
CMake version: version 3.28.3

Versions of npm installed packages:
Warning: newman, newman-reporter-html, markdown-link-check not installed ...

Repro instructions

create environment with serve installed
create handler.py from sample code
create config.properties from example

torch-model-archiver --model-name test --version 1.0 --export-path model-store --handler handler.py -f
torchserve --start --model-store model-store  --ts-config config.properties --foreground

create environment with kserve installed

pip install 'kserve[storage]'

clone this repo
create /mnt/models/config/config.properties with:

inference_address=http://127.0.0.1:9080
management_address=http://127.0.0.1:9081
metrics_address=http://127.0.0.1:9082
grpc_inference_port=7070
grpc_management_port=7071
model_store=file:///
model_snapshot={"name": "startup.cfg","modelCount": 1,"models": {"test": {"1.0": {"defaultVersion": true, "marName": "test.mar", "minWorkers": 1, "maxWorkers": 5, "batchSize": 5, "maxBatchDelay": 200, "responseTimeout": 60}}}}

start kserve wrapper:

python3 kserve_wrapper/__main__.py

create payload data.json:

{
    "inputs": [
        {
            "name": "uuid", 
            "shape": -1,
            "datatype": "BYTES",
            "data": ["fakedata"]
        }
    ]
}

send request

curl -X POST http://127.0.0.1:9080/v1/models/test:predict -T data.json

Possible Solution

There are several. The most direct is to fall back to the base class default (v1) when PROTOCOL_VERSION is not set, so that the wrapper matches the base class behaviour.

In the spirit of failing early, adding a check for the validity of the value would likely also be worthwhile, as PROTOCOL_VERSION might contain an invalid value.
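
A minimal sketch of that combination, assuming only that the wrapper reads PROTOCOL_VERSION from the environment (the surrounding wrapper code is paraphrased, not quoted):

import os

from kserve.model import PredictorProtocol

# Fall back to the base class default ("v1") when the variable is unset,
# and fail early when it holds a value the enum does not know about.
protocol = os.environ.get("PROTOCOL_VERSION") or PredictorProtocol.REST_V1.value
valid_values = {p.value for p in PredictorProtocol}
if protocol not in valid_values:
    raise ValueError(
        f"PROTOCOL_VERSION must be one of {sorted(valid_values)}, got {protocol!r}"
    )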

agunapal commented 9 months ago

Hi @sgaist Thanks for the details! We do have this nightly test https://github.com/pytorch/serve/actions/runs/7999859721

If you can add a test to this, it would be easier to understand and would expedite your PR.

sgaist commented 9 months ago

Hi @agunapal,

Thanks for the pointer !

I ran the tests and, while checking how they work, I saw that the PROTOCOL_VERSION environment variable is set properly by the KServe controller (that was the next thing I wanted to fix). Looking at the KServe sources, the fix dates back to September last year.

Seeing that, I checked what is running on my current machine and, surprisingly, it is an older version of the KServe controller, v0.9.0 to be exact, that has been deployed. I don't know why that version ended up there, as I followed their documentation to deploy the whole system.
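
For reference, one way to check which controller image is actually deployed (this assumes the default kserve namespace and deployment name from the standard install):

kubectl -n kserve get deployment kserve-controller-manager \
  -o jsonpath='{.spec.template.spec.containers[*].image}'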

This means that the current version of KServe does the right thing when creating the deployment from an InferenceService, so my patch would not be strictly required in that context. However, I still think it serves its purpose of bringing the wrapper's behaviour in line with the base class implementation.

While writing the test, I saw some things that could be improved, such as the configuration default handling. I am wondering whether such improvements should be part of a follow-up patch or integrated into this one?
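
For illustration, the kind of behaviour the test pins down (a sketch only; parse_protocol is a hypothetical stand-in for wherever the wrapper ends up reading PROTOCOL_VERSION):

import os
from unittest import mock

from kserve.model import PredictorProtocol

def parse_protocol() -> str:
    # Hypothetical helper mirroring the proposed fix: default to "v1".
    return os.environ.get("PROTOCOL_VERSION") or PredictorProtocol.REST_V1.value

def test_protocol_defaults_to_v1_when_env_is_unset():
    with mock.patch.dict(os.environ, {}, clear=True):
        assert parse_protocol() == PredictorProtocol.REST_V1.value

def test_protocol_respects_env():
    with mock.patch.dict(os.environ, {"PROTOCOL_VERSION": "v2"}):
        assert parse_protocol() == "v2"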

agunapal commented 9 months ago

Hi @sgaist The nightly tests we run are on nightly images, hence I guess you don't see the issue there. We will be making a new release soon.

agunapal commented 9 months ago

Hi @sgaist We welcome any improvements to the code. You can send a PR for one of the issues you want to fix, include a test, and we can take it from there.

sgaist commented 9 months ago

Good, then I'll open a ticket for the improvements I have in mind for the wrapper, on top of what I have done in this one, and address them in a follow-up patch.