kubeflow / pipelines

Machine Learning Pipelines for Kubeflow
https://www.kubeflow.org/docs/components/pipelines/
Apache License 2.0

[feature] Abstract the Artifact Storage API via API server #10872

Open HumairAK opened 5 months ago

HumairAK commented 5 months ago

Feature Area

/area frontend /area backend

What feature would you like to see?

Currently the KFP frontend server reads directly from the configured artifact storage when visualizing artifacts. Instead, we should abstract artifact storage entirely behind the KFP API server and provide a separate artifact storage API for the frontend, as well as for any other clients that want to fetch the same information through the API server. We should remove any logic that requires the frontend to store object storage credentials, and isolate those credentials to the API server.

This is how it works today:

image

Notice that the backend writes to the object store, but the frontend reads from it. This means both the backend and the frontend need access to the object store. There is little reason for this when we can abstract it entirely behind the backend API server and keep the credentials and artifact store access a purely backend concern.

In an ideal future, the frontend client would not have to query MLMD either, but there's a separate issue for that.

We should instead architect this to be something like the following:

image
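
To make the proposed split concrete, here is a minimal sketch of the idea in Python. It is not the KFP API server's actual design: `ObjectStore`, `InMemoryObjectStore`, `ArtifactLocation`, and `ArtifactService` are hypothetical names, and the in-memory store stands in for a real object store so the example runs without a network. The point it illustrates is that only the backend component holds the store handle (and therefore the credentials), while callers pass an artifact ID and get bytes back.

```python
from dataclasses import dataclass
from typing import Dict, Protocol, Tuple


class ObjectStore(Protocol):
    """Hypothetical minimal read interface for an object storage backend."""

    def get(self, bucket: str, key: str) -> bytes:
        ...


class InMemoryObjectStore:
    """Stand-in for a real object store, used only to keep the sketch runnable."""

    def __init__(self) -> None:
        self._objects: Dict[Tuple[str, str], bytes] = {}

    def put(self, bucket: str, key: str, data: bytes) -> None:
        self._objects[(bucket, key)] = data

    def get(self, bucket: str, key: str) -> bytes:
        return self._objects[(bucket, key)]


@dataclass(frozen=True)
class ArtifactLocation:
    """Where an artifact lives. In this sketch only the backend knows this."""

    bucket: str
    key: str


class ArtifactService:
    """Hypothetical API server component that owns all object store access."""

    def __init__(self, store: ObjectStore) -> None:
        # The store handle (and any credentials behind it) never leaves the backend.
        self._store = store
        self._locations: Dict[str, ArtifactLocation] = {}

    def register(self, artifact_id: str, location: ArtifactLocation) -> None:
        """Record where an artifact was written."""
        self._locations[artifact_id] = location

    def read_artifact(self, artifact_id: str) -> bytes:
        """Resolve an artifact ID to its location and return its contents."""
        location = self._locations.get(artifact_id)
        if location is None:
            raise LookupError(f"unknown artifact: {artifact_id}")
        return self._store.get(location.bucket, location.key)


if __name__ == "__main__":
    store = InMemoryObjectStore()
    store.put("example-bucket", "runs/1/metrics.json", b'{"accuracy": 0.9}')

    service = ArtifactService(store)
    service.register("artifact-1", ArtifactLocation("example-bucket", "runs/1/metrics.json"))

    # A frontend or any other client would only ever supply the artifact ID.
    print(service.read_artifact("artifact-1"))
```

In this shape, swapping the storage provider or rotating credentials touches only the backend, and the frontend needs no storage configuration at all.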

What is the use case or pain point?

We see this key API feature as a backend concern that has leaked into the frontend. We also see it as a security concern: there seems to be little reason for the frontend to have direct access to the artifact storage.

Furthermore, this forces any other potential client of the API server to also have direct access to object storage if it wishes to view artifacts, which breaks KFP abstractions. Should a user really have to know where and how their artifact is stored in object storage?
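
The difference from a client's point of view can be sketched as follows. This is an illustration under stated assumptions, not existing KFP code: the function names are hypothetical, the set of URI schemes (`s3`, `gs`, `minio`) is assumed for the example, and the `/artifacts/<id>` path is a made-up endpoint shape, not a real KFP route. With direct reads, the client has to understand the storage URI and then hold matching credentials; with an API-server read, it only needs the artifact ID.

```python
from typing import NamedTuple
from urllib.parse import quote, urlsplit


class DirectReadPlan(NamedTuple):
    """What a client must know to read an artifact straight from object storage."""

    provider: str
    bucket: str
    key: str


def plan_direct_read(artifact_uri: str) -> DirectReadPlan:
    """Split a storage URI into provider, bucket, and key.

    The scheme list is an assumption for this sketch. The client would still
    need credentials for the chosen provider after this step.
    """
    parts = urlsplit(artifact_uri)
    if parts.scheme not in {"s3", "gs", "minio"}:
        raise ValueError(f"unsupported artifact URI scheme: {parts.scheme!r}")
    return DirectReadPlan(parts.scheme, parts.netloc, parts.path.lstrip("/"))


def plan_api_read(artifact_id: str) -> str:
    """Build a request path for a hypothetical artifact endpoint on the API server.

    No provider, bucket, key, or storage credentials are involved on the client side.
    """
    return "/artifacts/" + quote(artifact_id, safe="")


if __name__ == "__main__":
    print(plan_direct_read("minio://example-bucket/runs/1/model"))
    print(plan_api_read("artifact-1"))
```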

Is there a workaround currently?

Continue to rely on the frontend's direct access to artifact storage to provide this functionality.


Love this idea? Give it a 👍.

github-actions[bot] commented 3 months ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

HumairAK commented 3 months ago

/remove-lifecycle stale

github-actions[bot] commented 1 month ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

HumairAK commented 1 month ago

/remove-lifecycle stale