Support multiple inference loggers

ruivieira commented 1 month ago

/kind feature

Describe the solution you'd like

At the moment, Inference Services can specify a single Inference Logger with

apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-iris
spec:
  predictor:
    logger:
      mode: all
      url: ...

In certain scenarios it is useful to specify multiple logging destinations directly from the Inference Service, without any additional component. For instance, this would support multiple logging backends for purposes such as analytics, data post-processing or security audits.

A proposed solution would be to introduce a new (optional) additionalLoggers field/array, such that

apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-iris
spec:
  predictor:
    logger:
      mode: ...
      url: ...
    additionalLoggers:
      - mode: all
        url: ...
      - mode: ...
        url: ...

is possible. Adding a new (optional) field would keep compatibility with the current InferenceService API.
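As a rough illustration of why an optional field stays compatible, the sketch below resolves the full list of destinations from a predictor spec. It is not KServe code: the helper name `collect_loggers` and the `sink-*.example` URLs are made up for this example, and it assumes `logger` remains a single mapping while `additionalLoggers` is an optional list.

```python
from typing import Any


def collect_loggers(predictor: dict[str, Any]) -> list[dict[str, Any]]:
    """Return every logging destination configured on a predictor spec.

    `logger` stays a single mapping, as in the current API;
    `additionalLoggers` is the proposed optional list.
    """
    loggers = []
    primary = predictor.get("logger")
    if primary is not None:
        loggers.append(primary)
    # A missing or empty `additionalLoggers` means "no extra destinations".
    loggers.extend(predictor.get("additionalLoggers") or [])
    return loggers


# Existing spec: only `logger` is set, so the result is unchanged.
legacy = {"logger": {"mode": "all", "url": "http://sink-a.example"}}

# Proposed spec: `logger` plus two extra (hypothetical) destinations.
extended = {
    "logger": {"mode": "all", "url": "http://sink-a.example"},
    "additionalLoggers": [
        {"mode": "request", "url": "http://sink-b.example"},
        {"mode": "response", "url": "http://sink-c.example"},
    ],
}

legacy_loggers = collect_loggers(legacy)
extended_loggers = collect_loggers(extended)
```

A spec written against the current API resolves to exactly one destination, so existing InferenceServices would not need to change.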

Anything else you would like to add:

As a concrete example, TrustyAI uses the Inference Logger mechanism to aggregate inference information and calculate metrics. However, if a logger is already present for other purposes, it would need to be replaced by TrustyAI's own logging endpoint.

cc @terrytangyuan @danielezonca

Update 5th August 2024

Changed "replacing logger" with "adding a new optional field".

terrytangyuan commented 1 month ago

From the meeting:

- This is a nice-to-have.
- Other solutions exist but introduce dependencies, e.g. Kafka and Istio traffic mirroring.
- Introduce as an additional optional field for loggers.

ruivieira commented 1 month ago

Updated proposal.

danielezonca commented 1 month ago

I see some challenges with this proposal because it will be part of the KServe API, with additional logic in the critical path (i.e. during inference execution): delivery semantics should be "at least once", but this usually requires retries and stateful management.
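To make the concern concrete, here is a minimal sketch of fanning one payload out to several destinations with retries. It is an illustration only: `fan_out`, the injected `send` callable and the `sink-*.example` URLs are assumptions for this example, not part of KServe.

```python
from typing import Callable


def fan_out(
    payload: bytes,
    urls: list[str],
    send: Callable[[str, bytes], bool],
    max_attempts: int = 3,
) -> dict[str, bool]:
    """Try to deliver `payload` to every URL, retrying failed sends.

    Returns a map of URL -> whether delivery eventually succeeded.
    """
    delivered = {}
    for url in urls:
        ok = False
        for _ in range(max_attempts):
            if send(url, payload):
                ok = True
                break
        delivered[url] = ok
    return delivered


attempts: dict[str, int] = {}


def flaky_send(url: str, payload: bytes) -> bool:
    """Fake sender: the hypothetical sink-b fails on its first attempt only."""
    attempts[url] = attempts.get(url, 0) + 1
    return not (url == "http://sink-b.example" and attempts[url] == 1)


result = fan_out(
    b"{}",
    ["http://sink-a.example", "http://sink-b.example"],
    flaky_send,
)
```

Even this small loop adds per-destination work to the request path, and the retry state lives only in memory: if the process dies mid-retry the payload is lost, so a true "at least once" guarantee would need persistent state on top of it.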

My suggestion is to preserve a "single write principle" as much as possible and leverage an additional component to do this "multiplier". For example, a Knative Eventing Broker (or even a Channel) should work OOTB, and it can even be backed by persistent storage like Kafka. We can have both, with KServe creating the Broker etc. automatically, but it is a lot of work and customers might want to tune the configuration, so even this path has drawbacks.

kserve / kserve

Support multiple inference loggers #3836