Model Registry proposal (ref KF community meeting 20240102)

tarilabs commented 11 months ago

Following feedback received during the KF community meeting held on 20240102, I am raising the Model Registry proposal Google Doc previously shared with the community (link) as Markdown, in the form of a Pull Request (this PR).

Can I recommend you open another PR in the model registry, and can we collaborate on a proposal on how you see the integration working? We have verified that we could store and retrieve the models without any issues, but we have not explored how (or if) we should spread the metadata and query it back; whether that is needed at all is another question, as it can be stored in the db. We also have not explored how we can influence the consumption of the model for inference directly from an OCI repo, as is done in projects like KServe.

rchincha commented 8 months ago

> Can I recommend you open another PR in the model registry and can we collaborate on a proposal

Would love to. Also folks over at CNCF artifacts, ORAS and OCI would certainly be interested.

An initial grokking of the KServe project indicates that there could be a couple of ways to do this:

* an "initContainer" approach that pulls the required artifacts and lays them out so they can be consumed
* a CSI approach, like so: https://github.com/converged-computing/oras-csi https://kserve.github.io/website/0.8/modelserving/storage/pvc/pvc/#create-pv-and-pvc

rchincha commented 8 months ago

https://github.com/kubeflow/model-registry/pull/48 ^ fyi, thanks.

rareddy commented 8 months ago

> Also folks over at CNCF artifacts, ORAS and OCI would certainly be interested.

@rchincha we are collaborating with the ORAS maintainers and the model-car initiative inventors; let's see if we can bring their attention to this effort for storage. We have already put some work into KServe Storage Containers, which will be another way of providing the models for inferencing.

A couple of requests for the proposal:

* we need to be able to support multiple storage backends, as S3 is predominantly the most preferred method currently used by the AI communities.
* For the OCI plugin, it must be based at the OCI-Dist level so that users can have their choice of Zot, Harbor, Quay, etc.
* must be able to deploy in Kube, as a lot of users want to be able to deploy all infra on their own cloud and not necessarily always connect to an external SaaS offering.

I did look at the ArtifactHub project in CNCF a couple of months ago, which looked very interesting in terms of how they use OCI and metadata scraping, but I did not draw any conclusions about whether that could be folded into the mix to bridge the metadata portion. That could be very interesting IMO. Is this the CNCF project you mentioned above?

rchincha commented 8 months ago

> > Also folks over at CNCF artifacts, ORAS and OCI would certainly be interested.
>
> @rchincha we are collaborating with the ORAS maintainers and the model-car initiative inventors; let's see if we can bring their attention to this effort for storage. We have already put some work into KServe Storage Containers, which will be another way of providing the models for inferencing.

wrt kserve, maybe this as a contract? https://github.com/kserve/kserve/pull/3539

> A couple of requests for the proposal:
>
> * we need to be able to support multiple storage backends, as S3 is predominantly the most preferred method currently used by the AI communities.

This is best left to the registry implementations, which may or may not choose to support an S3 backend (for example, speaking only for zot, it does support S3). The proposal should make it clear, though, that this is an additional requirement for compatibility with Kubeflow.

> * For the OCI plugin, it must be based at the OCI-Dist level so that users can have their choice of Zot, Harbor, Quay, etc.

The OCI plugin must be registry-agnostic of course and this calls out the role that OCI dist-spec v1.1.0 plays as a contract.

> * must be able to deploy in Kube, as a lot of users want to be able to deploy all infra on their own cloud and not necessarily always connect to an external SaaS offering.

Another additional requirement, and comes with the territory.

> I did look at the ArtifactHub project in CNCF a couple of months ago, which looked very interesting in terms of how they use OCI and metadata scraping, but I did not draw any conclusions about whether that could be folded into the mix to bridge the metadata portion. That could be very interesting IMO. Is this the CNCF project you mentioned above?

As I understand it, ArtifactHub predates OCI dist-spec v1.1.0, but there may be interest in standardizing on this dist-spec.

> metadata scraping

OCI dist-spec v1.1.0 has explicit provisions for this. But can you kindly point to some concrete examples?

Will update https://github.com/kubeflow/model-registry/pull/48

tarilabs commented 8 months ago

> > metadata scraping
>
> OCI dist-spec v1.1.0 has explicit provisions for this. But can you kindly point to some concrete examples?

Personally, I'm very curious about examples on this topic! :) That is very interesting in the context of potentially indexing/querying a Manifest of metadata (a "model registry" use case) by means of an OCI Artifact.

rchincha commented 8 months ago

> > > metadata scraping
> >
> > OCI dist-spec v1.1.0 has explicit provisions for this. But can you kindly point to some concrete examples?
>
> Personally, I'm very curious about examples on this topic! :) That is very interesting in the context of potentially indexing/querying a Manifest of metadata (a "model registry" use case) by means of an OCI Artifact.

https://github.com/opencontainers/opencontainers.org/blob/395bc5f98777a72082bfe300a167b563af234ef0/content/posts/blog/2024-03-13-image-and-distribution-1-1.md#describing-associations

^ this is how the OCI community has addressed this. Note that the original use case was container images and associated metadata such as SBOMs etc.

So in this case ...

1. upload model data (of a particular media-type)
2. upload model metadata (of a particular media-type and subject:=1. above)
3. download 1.
4. download "artifacts referring to 1." and optionally "of a particular media-type"
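
To make those four steps a bit more concrete, here is a minimal Python sketch of the manifests and request paths involved. It only builds the JSON and the paths and does not talk to a registry. The `artifactType` values, the `demo/mnist` repository name, and the helper names are assumptions made up for illustration. The manifest fields (`subject`, `artifactType`, the empty config blob) and the `/v2/<name>/referrers/<digest>` endpoint are what the OCI image-spec and dist-spec v1.1 describe, as far as I understand them.

```python
import hashlib
import json
from typing import Optional
from urllib.parse import quote

OCI_MANIFEST = "application/vnd.oci.image.manifest.v1+json"
OCI_EMPTY = "application/vnd.oci.empty.v1+json"
# Hypothetical media types, made up for this illustration.
MODEL_TYPE = "application/vnd.example.ml.model.v1"
METADATA_TYPE = "application/vnd.example.ml.metadata.v1+json"


def descriptor(media_type: str, content: bytes) -> dict:
    """Describe a blob or manifest by media type, sha256 digest, and size."""
    return {
        "mediaType": media_type,
        "digest": "sha256:" + hashlib.sha256(content).hexdigest(),
        "size": len(content),
    }


def build_manifest(artifact_type: str, blob: bytes,
                   subject: Optional[dict] = None) -> bytes:
    """Serialize an OCI image manifest that carries one blob as an artifact."""
    manifest = {
        "schemaVersion": 2,
        "mediaType": OCI_MANIFEST,
        "artifactType": artifact_type,
        # Artifacts without a real config point at the empty JSON blob "{}".
        "config": descriptor(OCI_EMPTY, b"{}"),
        "layers": [descriptor(artifact_type, blob)],
    }
    if subject is not None:
        # "subject" is what links this artifact to another manifest.
        manifest["subject"] = subject
    return json.dumps(manifest, sort_keys=True).encode()


def referrers_path(repo: str, digest: str,
                   artifact_type: Optional[str] = None) -> str:
    """Path of the referrers API, optionally filtered by artifact type."""
    path = f"/v2/{repo}/referrers/{digest}"
    if artifact_type:
        path += "?artifactType=" + quote(artifact_type, safe="")
    return path


# 1. Model data: this manifest (and its blob) is what gets uploaded first.
model_manifest = build_manifest(MODEL_TYPE, b"fake model weights")
model_desc = descriptor(OCI_MANIFEST, model_manifest)

# 2. Model metadata: a second manifest whose "subject" points at 1.
metadata_blob = json.dumps({"accuracy": 0.987}).encode()
metadata_manifest = build_manifest(METADATA_TYPE, metadata_blob,
                                   subject=model_desc)

# 3. Download 1. by digest.
model_path = f"/v2/demo/mnist/manifests/{model_desc['digest']}"

# 4. List artifacts referring to 1., optionally of a particular type.
referrers = referrers_path("demo/mnist", model_desc["digest"], METADATA_TYPE)
```

Tools such as ORAS wrap the actual push and pull; the sketch is only meant to show where `subject` sits and how step 4 is addressed.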

tarilabs commented 8 months ago

Thanks @rchincha, is there a way to avoid having to download the associated metadata only to query it locally, and instead run the query on the OCI registry server side?

Example: here I have 3 different ML models stored as OCI artifacts: https://quay.io/repository/mmortari/mnist?tab=tags

I know some metadata for each of those. I'm looking for a solution, if possible, which doesn't require me to download the associated metadata Manifest of each of the artifacts locally in order to query that metadata. As a concrete example, if each of the models defines accuracy=0.987 or the like, I want to query which ML artifacts in the mmortari/mnist repo above have max(accuracy).

Hope the example conveys the question I'm curious about. Edit: that is why @rareddy was referring to an analogue of ArtifactHub, as from a capability and usage pov it would seem a very similar use-case, in a way.

rchincha commented 8 months ago

@tarilabs

> As a concrete example, if each of the models defines accuracy=0.987 or the like, I want to query which ML artifacts in the mmortari/mnist repo above have max(accuracy).

In the OCI dist-spec world, one way would be to list all tags in a repository, get their manifests and compare annotations (== accuracy=0.987) - no need to download actual data.
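
A rough Python sketch of that approach, for illustration only: it lists tags via `/v2/<name>/tags/list`, fetches each manifest via `/v2/<name>/manifests/<tag>`, and compares an annotation client-side. The `accuracy` annotation key, the `registry.example` host, and the function names are assumptions. Authentication (most registries need a token, even for public repos) and tag-list pagination are left out.

```python
import json
from typing import Callable, Optional, Tuple
from urllib.request import Request, urlopen

# Hypothetical annotation key; nothing in the spec reserves this name.
ACCURACY_KEY = "accuracy"
ACCEPT = "application/vnd.oci.image.manifest.v1+json, application/json"


def http_get_json(url: str) -> dict:
    """Fetch a URL and decode the JSON body (no auth, no pagination)."""
    request = Request(url, headers={"Accept": ACCEPT})
    with urlopen(request) as response:
        return json.load(response)


def best_tag_by_accuracy(
    registry: str,
    repo: str,
    get_json: Callable[[str], dict] = http_get_json,
) -> Optional[Tuple[str, float]]:
    """Return (tag, accuracy) for the manifest with the highest accuracy."""
    base = f"https://{registry}/v2/{repo}"
    tags = get_json(f"{base}/tags/list").get("tags") or []
    best = None
    for tag in tags:
        # Only the small manifest JSON is fetched, never the model blobs.
        manifest = get_json(f"{base}/manifests/{tag}")
        raw = (manifest.get("annotations") or {}).get(ACCURACY_KEY)
        if raw is None:
            continue
        try:
            accuracy = float(raw)  # annotation values are strings
        except ValueError:
            continue
        if best is None or accuracy > best[1]:
            best = (tag, accuracy)
    return best


# In-memory stand-in for a registry, so the sketch runs without a network.
FAKE_REGISTRY = {
    "https://registry.example/v2/demo/mnist/tags/list": {
        "name": "demo/mnist",
        "tags": ["v1", "v2", "v3"],
    },
    "https://registry.example/v2/demo/mnist/manifests/v1": {
        "annotations": {"accuracy": "0.95"},
    },
    "https://registry.example/v2/demo/mnist/manifests/v2": {
        "annotations": {"accuracy": "0.987"},
    },
    "https://registry.example/v2/demo/mnist/manifests/v3": {},
}

best = best_tag_by_accuracy(
    "registry.example", "demo/mnist", get_json=FAKE_REGISTRY.__getitem__
)
```

Note that this is still a client-side scan with one request per tag; it avoids pulling model data but is not a server-side query.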

I was more concerned about the following: https://github.com/MarquezProject/marquez https://github.com/google/ml-metadata

tarilabs commented 8 months ago

Thanks @rchincha, it is reassuring to hear that it doesn't need to download the actual data. I will be looking for a chance to understand in more detail from you how OCI dist-spec works for this use-case in practice.

We have Model Registry biweekly meetings: https://www.kubeflow.org/docs/about/community/#kubeflow-community-calendars

Do you think you'll be able to join one, so we could discuss it live in more detail? Thanks!

rchincha commented 8 months ago

https://kccnceu2024.sched.com/event/1YeLi ^ This idea is spreading around I suppose ... @Kubecon EU 2024

Your next meeting is Apr 1. Will try to make that.

rchincha commented 6 months ago

https://github.com/kubernetes/enhancements/pull/4642 some overlapping work/groups ...

tarilabs commented 6 months ago

> kubernetes/enhancements#4642 some overlapping work/groups ...

iiuc this would allow "materializing" OCI artifacts as a mounted volume in a container, effectively allowing the "files" inside an OCI artifact to be available for inference, say in a running container of a model server. Is this a fair summary?

rchincha commented 6 months ago

> > kubernetes/enhancements#4642 some overlapping work/groups ...
>
> iiuc this would allow "materializing" OCI artifacts as a mounted volume in a container, effectively allowing the "files" inside an OCI artifact to be available for inference, say in a running container of a model server. Is this a fair summary?

Still a preliminary KEP, but would seem so.

rhuss commented 6 months ago

> > kubernetes/enhancements#4642 some overlapping work/groups ...
>
> iiuc this would allow "materializing" OCI artifacts as a mounted volume in a container, effectively allowing the "files" inside an OCI artifact to be available for inference, say in a running container of a model server. Is this a fair summary?

For reference, KServe implements a workaround for directly accessing files within an OCI image. It is available via a sidecar approach ("modelcar") that leverages root filesystem access via the /proc filesystem when shareProcessNamespace: true is set on the Pod. You can find details in the KServe documentation and in the Design Document. It actually implements the desired behavior with current means, but of course it is more or less just a workaround for the missing OCI volume type (as discussed already a long time ago in https://github.com/kubernetes/kubernetes/issues/831).

tarilabs commented 6 months ago

> For reference, KServe implements a workaround for directly accessing files within an OCI image. It is available via a sidecar approach ("modelcar") that leverages root filesystem access via the /proc filesystem when shareProcessNamespace: true is set on the Pod. You can find details in the KServe documentation and in the Design Document. It actually implements the desired behavior with current means, but of course it is more or less just a workaround for the missing OCI volume type (as discussed already a long time ago in kubernetes/kubernetes#831).

Thank you @rhuss, to me this is about providing user choice. Given an OCI Artifact with an ML model asset, one:

* could build "around" it a runnable container image to serve it, say a Linux base + serving runtime + the ML model asset from the OCI artifact (this is possible today)
* could build "around" it a ModelCar, to serve it on KServe (this is possible today thanks to your contrib in KServe)
* could eventually just mount it in a serving runtime running on k8s, leveraging KEP-4639 (in the future)

wdyt?
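
For the third option, a minimal sketch of what the Pod could look like, assuming the `image` volume source from KEP-4639 (alpha in Kubernetes 1.31 behind the `ImageVolume` feature gate, and dependent on container runtime support). It is written as a small Python helper that emits the manifest as JSON; the helper name, the image references, and the `/models` mount path are placeholders, not something any project ships.

```python
import json


def model_server_pod(name: str, server_image: str, model_ref: str,
                     mount_path: str = "/models") -> dict:
    """Build a Pod manifest that mounts an OCI reference as a volume."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [{
                "name": "model-server",
                "image": server_image,
                # The serving runtime reads the model files from here.
                "volumeMounts": [{
                    "name": "model",
                    "mountPath": mount_path,
                    "readOnly": True,
                }],
            }],
            "volumes": [{
                "name": "model",
                # KEP-4639 "image" volume source: the OCI reference is
                # pulled and mounted by the kubelet / container runtime.
                "image": {
                    "reference": model_ref,
                    "pullPolicy": "IfNotPresent",
                },
            }],
        },
    }


# Placeholder references, for illustration only.
pod = model_server_pod(
    name="mnist-server",
    server_image="registry.example/serving-runtime:latest",
    model_ref="registry.example/demo/mnist:v2",
)
pod_json = json.dumps(pod, indent=2)
```

The resulting JSON could be piped to `kubectl apply -f -` on a cluster where the feature gate is enabled.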

rchincha commented 3 months ago

https://kubernetes.io/blog/2024/08/16/kubernetes-1-31-image-volume-source/ ^ fyi

tarilabs commented 3 months ago

Thank you @rchincha , we indeed noted that blog post as well :)

Fyi, we have it in our live-roadmap as a proposal for integration as a preferred storage solution for the ML model, to complement current Model Registry.

Orthogonal research work in this area is captured here.

kubeflow / community

Model Registry proposal (ref KF community meeting 20240102) #682
