pytorch / serve

Serve, optimize and scale PyTorch models in production
https://pytorch.org/serve/
Apache License 2.0

gRPC Model Metadata using Open Inference Protocol #3045

Open harshita-meena opened 3 months ago

harshita-meena commented 3 months ago

🐛 Describe the bug

Consider a system where a feature service fetches model metadata that describes which features to fetch, and then runs inference against the model. To fetch this metadata about inputs and outputs, I am trying to use the recently added Open Inference Protocol. When I call ModelMetadata using grpcurl, it returns only the name and version of the model.

 grpcurl -plaintext -d  '{"name": "toy-ranker"}' -proto serve/frontend/server/src/main/resources/proto/open_inference_grpc.proto  localhost:79 org.pytorch.serve.grpc.openinference.GRPCInferenceService/ModelMetadata
{
  "name": "toy-ranker",
  "versions": [
    "2024-03-26-15:33"
  ]
}
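
For comparison, the Open Inference Protocol's ModelMetadataResponse also defines platform, inputs and outputs fields, where each tensor entry carries a name, datatype and shape. The sketch below is only an illustration of what a feature service would need from that response. The tensor names, datatypes, shapes and platform string are hypothetical, and required_features is a made-up helper, not part of TorchServe.

```python
from typing import Any


def required_features(metadata: dict[str, Any]) -> list[str]:
    """Return the input tensor names a feature service would need to fetch."""
    return [tensor["name"] for tensor in metadata.get("inputs", [])]


# What the server currently returns (copied from the grpcurl output above).
current = {"name": "toy-ranker", "versions": ["2024-03-26-15:33"]}

# Hypothetical full response: every value under platform/inputs/outputs
# is invented for illustration.
expected = {
    "name": "toy-ranker",
    "versions": ["2024-03-26-15:33"],
    "platform": "pytorch_torchscript",
    "inputs": [
        {"name": "user_id", "datatype": "INT64", "shape": [-1]},
        {"name": "item_embedding", "datatype": "FP32", "shape": [-1, 64]},
    ],
    "outputs": [
        {"name": "score", "datatype": "FP32", "shape": [-1]},
    ],
}

print(required_features(current))   # no inputs advertised
print(required_features(expected))  # names taken from the inputs list
```

With the current response the feature service has nothing to work with, which is the gap this issue describes.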

With a simple curl against the REST API, the output does not include anything model-specific.

$ curl http://localhost:80/v2
{
  "name": "Torchserve",
  "version": "0.10.0",
  "extenstion": [
    "kserve",
    "kubeflow"
  ]
}

I was trying to understand where this metadata is set so I can populate it accordingly, but I could not find a way to set inputs and outputs.

Do you know how the metadata is set in TorchServe, if at all?

Error logs

n/a

Installation instructions

Dockerfile on top of latest torchserve image

FROM pytorch/torchserve-nightly:latest-gpu
ENV TS_OPEN_INFERENCE_PROTOCOL=oip

Model Packaging

The mnist model can be used; the issue is independent of model type.

config.properties

inference_address=http://0.0.0.0:8080
management_address=http://0.0.0.0:8081
metrics_address=http://0.0.0.0:8082
enable_metrics_api=true
model_metrics_auto_detect=true
metrics_mode=prometheus
number_of_netty_threads=32
job_queue_size=1000
enable_envvars_config=true
model_store=/home/model-server/model-store
workflow_store=/home/model-server/wf-store
load_models=all

Versions


Environment headers

Torchserve branch:

Warning: torchserve not installed ..
Warning: torch-model-archiver not installed ..

Python version: 3.11 (64-bit runtime)
Python executable: /home/hmeena/.pyenv/versions/airflow/bin/python

Versions of relevant python libraries:
requests==2.31.0
Warning: torch not present ..
Warning: torchtext not present ..
Warning: torchvision not present ..
Warning: torchaudio not present ..

Java Version:

OS: CentOS Linux release 7.5.1804 (Core)
GCC version: (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39)
Clang version: 3.4.2 (tags/RELEASE_34/dot2-final)
CMake version: N/A

Environment:
librarypath (LD/DYLD_): :/search/dist/bin:/search/dist/bin

Repro instructions

The model from an old issue I created can be used.

Possible Solution

Accept an input metadata file whose contents are exposed on both the gRPC and REST metadata endpoints. One example along these lines is Seldon's metadata endpoint, which exposes this information.
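
A rough sketch of the idea, under stated assumptions: the file name model_metadata.json, its layout, and the helper load_model_metadata are all hypothetical and do not exist in TorchServe today. The only grounded part is the response shape (name, versions, platform, inputs, outputs, with name/datatype/shape per tensor), which follows the Open Inference Protocol metadata response.

```python
import json
import tempfile
from pathlib import Path
from typing import Any

# Fields each tensor entry needs in an OIP-style metadata response.
REQUIRED_TENSOR_KEYS = {"name", "datatype", "shape"}


def load_model_metadata(path: str, name: str, versions: list[str]) -> dict[str, Any]:
    """Merge a user-supplied metadata file into an OIP-style metadata response.

    Hypothetical helper: the server would fill in name and versions, and the
    file would contribute platform, inputs and outputs.
    """
    spec = json.loads(Path(path).read_text())
    for section in ("inputs", "outputs"):
        for tensor in spec.get(section, []):
            missing = REQUIRED_TENSOR_KEYS - set(tensor)
            if missing:
                raise ValueError(f"{section} entry {tensor!r} is missing {sorted(missing)}")
    return {
        "name": name,
        "versions": list(versions),
        "platform": spec.get("platform", ""),
        "inputs": spec.get("inputs", []),
        "outputs": spec.get("outputs", []),
    }


# Example metadata file content; tensor names and shapes are invented.
example_spec = {
    "platform": "pytorch_torchscript",
    "inputs": [{"name": "user_id", "datatype": "INT64", "shape": [-1]}],
    "outputs": [{"name": "score", "datatype": "FP32", "shape": [-1]}],
}

with tempfile.TemporaryDirectory() as tmp:
    spec_path = Path(tmp) / "model_metadata.json"
    spec_path.write_text(json.dumps(example_spec))
    response = load_model_metadata(str(spec_path), "toy-ranker", ["2024-03-26-15:33"])

print(json.dumps(response, indent=2))
```

The same merged dictionary could back both the gRPC ModelMetadata call and a REST metadata endpoint, so the two would stay consistent.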

harshita-meena commented 3 months ago

I saw the example for Describe custom metadata, but this seems to be independent of the OIPPREDICT ModelMetadataResponse protocol.