kserve / modelmesh-serving

Controller for ModelMesh
Apache License 2.0

Support out of distribution detection metrics #334

Open taneem-ibrahim opened 1 year ago

taneem-ibrahim commented 1 year ago

If an OOD-enabled model is deployed, the ModelMesh metrics should capture the two additional metrics that these models generate as part of the inferencing metrics.

mudhakar commented 1 year ago

An OOD-enabled model will produce a single output tensor containing:

  1. the original model inferencing output
  2. the OOD score

We would need an output transformation to separate 1 from 2, and logging to record the inputs/outputs and OOD scores (e.g., in OpenShift logging and/or Prometheus). These are generic capabilities that should be useful for things beyond OOD.
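To make the split concrete, here is a minimal sketch of such an output transformation in Python. The tensor layout (OOD score appended as the last element), the metric name, and the model name are illustrative assumptions, not anything agreed in this thread; the Prometheus export uses the standard prometheus_client library.

```python
# Illustrative sketch only: the combined tensor layout (OOD score appended as
# the last element) and the metric/model names are assumptions.
import numpy as np
from prometheus_client import Gauge, start_http_server

# Hypothetical gauge for the OOD score, labeled by model name.
OOD_SCORE = Gauge("ood_score", "Out-of-distribution score per inference", ["model"])

def split_output(combined: np.ndarray):
    """Separate the original inferencing output (1) from the OOD score (2)."""
    prediction = combined[..., :-1]              # (1) original model output
    ood_score = float(combined[..., -1].mean())  # (2) OOD score
    return prediction, ood_score

def postprocess(model_name: str, combined: np.ndarray) -> np.ndarray:
    prediction, ood_score = split_output(combined)
    OOD_SCORE.labels(model=model_name).set(ood_score)  # record for Prometheus
    return prediction  # only the original output goes back to the caller

if __name__ == "__main__":
    start_http_server(8080)  # expose /metrics for Prometheus to scrape
    fake = np.array([0.1, 0.7, 0.2, 0.93])  # three class scores + one OOD score
    print(postprocess("example-model", fake))
```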

spacew commented 1 year ago

Hi @mudhakar @taneem-ibrahim, just to add more details on OOD (model certainty) enablement and deployment:

[screenshot: OOD (model certainty) enablement and deployment details]

nirmdesai commented 1 year ago

Per a discussion with @njhill and @ckadner, the best path forward is to have an output transformer (similar to the post-processing transformer in KServe) native to ModelMesh, without requiring the KServe controller.

taneem-ibrahim commented 1 year ago

@nirmdesai @mudhakar After further discussion with @njhill and @ckadner, it sounds like our fastest path to integration would be to add a custom post-processor as part of OOD for now, until we have kserve-raw or serverless available in ODH.

daw3rd commented 1 year ago

A proposal for the post-processing transform:

[diagram: proposed post-processing transform, the "KServe Proxy" container]

nirmdesai commented 1 year ago

Thanks @daw3rd. @njhill, @taneem-ibrahim, @ckadner: The above "KServe Proxy" is the custom post-processor container you proposed last week. Could you please review and confirm this is what you had in mind? cc: @mudhakar

taneem-ibrahim commented 1 year ago

Hi @nirmdesai, is the KServe proxy (REST server) here replicating functionality similar to this?

nirmdesai commented 1 year ago

@taneem-ibrahim: To be precise, we are not going to use the KServe transformer framework (shown in the link you shared) to implement the KServe Proxy. However, our KServe Proxy implementation will look similar to a typical pre-/post-processor function shown in that example. The deployment flow will also differ from the link you shared, where the transformer is deployed along with InferenceService creation. In our case, you would first create an InferenceService as you normally would, then deploy the proxy container on top of it, and then use the Proxy APIs for inferencing instead of the InferenceService APIs. cc: @mudhakar, @daw3rd, @spacew
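From the client's point of view, the flow described above could look roughly like the sketch below (Python). This is not the actual Proxy API: the route, the model name, and the assumption that the proxy speaks the KServe v2 REST inference protocol are all illustrative.

```python
# Minimal client-side sketch of the proposed flow, not the actual Proxy API.
# Assumptions: the proxy exposes a hypothetical route and accepts the KServe v2
# REST inference protocol; the InferenceService was created normally beforehand.
import requests

PROXY_URL = "http://modelmesh-proxy.example.com"  # hypothetical proxy route
MODEL_NAME = "example-model"                      # existing InferenceService/model name

payload = {
    "inputs": [
        {"name": "input-0", "shape": [1, 3], "datatype": "FP32",
         "data": [0.2, 0.5, 0.3]}
    ]
}

# Inference requests go to the proxy instead of the InferenceService endpoint;
# the proxy strips and logs the OOD score before returning the prediction.
resp = requests.post(f"{PROXY_URL}/v2/models/{MODEL_NAME}/infer", json=payload)
resp.raise_for_status()
print(resp.json())
```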

swith005 commented 1 year ago

Hello @taneem-ibrahim @nirmdesai @mudhakar @spacew @daw3rd cc: @njhill @ckadner

Regarding a proxy service for transforming model output for a certainty-enabled model, below is a diagram demonstrating the interaction for a ModelMesh proxy server deployed on OpenShift in the same cluster where RHODS is hosted. Note that the deployment also includes a Prometheus service for logging the model-certainty metrics over time as generated by the ModelMesh proxy service. Both are packaged via helm install; if a Prometheus instance already exists, it can be removed.

Please share feedback or comments on the deployment and sequence steps, as well as the endpoint for reaching the modelmesh proxy.

[diagram: revised KServe Proxy deployment and sequence steps]
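For reading the logged model-certainty metrics back out of the bundled Prometheus instance, a hedged sketch using Prometheus's standard HTTP query API is below; the in-cluster service URL and the metric name are assumptions, not values defined by the proxy's Helm chart.

```python
# Hedged sketch: query the model-certainty metric over time from Prometheus via
# its standard HTTP API. The service URL and metric name are assumptions.
import requests

PROMETHEUS_URL = "http://prometheus.example.svc:9090"  # hypothetical in-cluster service

resp = requests.get(
    f"{PROMETHEUS_URL}/api/v1/query_range",
    params={
        "query": 'model_certainty{model="example-model"}',  # hypothetical metric name
        "start": "2023-03-01T00:00:00Z",
        "end": "2023-03-02T00:00:00Z",
        "step": "60s",
    },
)
resp.raise_for_status()
for series in resp.json()["data"]["result"]:
    print(series["metric"], series["values"][:3])  # first few (timestamp, value) pairs
```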

RobGeada commented 1 year ago

Copying discussions I've had on Slack:

I think TrustyAI can provide a lot of the capabilities that the modelmesh-proxy is aiming for, with the advantage of not needing to add another component into the mix.

TrustyAI within ODH/RHODS is a service that intercepts ModelMesh input and output payloads and sends metrics computed on that input/output data (e.g., fairness metrics) to Prometheus. If we defined a metric that simply grabbed the certainty scores from the model output payload and emitted them to Prometheus, it would be a really simple way of doing what you're trying to do.
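As a rough illustration of that idea (and explicitly not TrustyAI's actual API), such a metric could just pull the certainty score out of each intercepted output payload and set a Prometheus gauge; the payload shape and the output tensor name are assumptions.

```python
# Generic sketch of the idea, NOT TrustyAI's actual API: intercept the output
# payload, extract the certainty/OOD score, and emit it as a Prometheus metric.
# The payload structure and the "certainty" tensor name are assumptions.
from prometheus_client import Gauge

CERTAINTY = Gauge("model_certainty", "Certainty score from an OOD-enabled model", ["model"])

def on_payload(model_name: str, output_payload: dict) -> None:
    """Called with each intercepted inference response payload."""
    for output in output_payload.get("outputs", []):
        if output.get("name") == "certainty":  # hypothetical output tensor name
            CERTAINTY.labels(model=model_name).set(float(output["data"][0]))
```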

As a PoC, I've done exactly that: an OOD model deployed in ModelMesh, sending its OOD metrics to Prometheus within OpenDataHub:

[screenshot: the deployed OOD model's metrics graphed in Prometheus within OpenDataHub]