Closed. IronPan closed this issue 3 years ago.
Just working my way through the documentation; thanks for pointing me in that direction. It seems geared toward using kfp.Client
to execute pipelines; what's the corresponding vision for executing through the UI? I was hoping that pipelines would execute in a namespace based on what's selected in the top drop-down; is that the idea?
@jackwhelpton Yes, the features you described are already there. They aren't mentioned in the docs just because they work seamlessly.
@Bobgy Regarding the minio artifact store not being supported in the KF 1.1 release: does that mean that a pipeline running in my namespace still writes to a shared artifact store? For example, is anything my pipeline writes implicitly (e.g. data written when piping results between steps in a pipeline, as in consumer_op(producer_task.output)) accessible to anyone who can look inside that artifact store?
@ca-scribner That's right. The currently suggested workaround is to pass only URLs through minio, letting components read/write GCS/S3 directly and managing permissions there if you care about data separation. (If you use TFX, that's already the case.)
Alternatively, I think minio supports multi-tenancy natively: https://docs.min.io/docs/multi-tenant-minio-deployment-guide.html. We'd welcome a contribution showing how that could be integrated with KFP multi-user mode.
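To make the workaround concrete, here is a minimal, self-contained sketch of the "pass only URLs" pattern. The bucket, file name, and payload are all hypothetical, and object storage is simulated with a local directory; real components would call the GCS/S3 SDKs (e.g. google-cloud-storage or boto3) instead, with per-namespace IAM controlling access.

```python
import os
import tempfile

# Stand-in for an object-storage bucket such as gs://my-team-bucket (hypothetical).
BUCKET = tempfile.mkdtemp()

def producer() -> str:
    """Write the real data to 'object storage' and return only its URI."""
    path = os.path.join(BUCKET, "result.txt")
    with open(path, "w") as f:
        f.write("model-accuracy=0.93")
    # Only this small string flows between steps (i.e. through KFP/minio);
    # the payload itself stays in access-controlled storage.
    return path

def consumer(uri: str) -> str:
    """Fetch the data directly from storage using the URI passed in."""
    with open(uri) as f:
        return f.read()

uri = producer()
print(consumer(uri))  # -> model-accuracy=0.93
```

The trade-off is exactly the one discussed below: each component must know how to talk to the storage backend, but no sensitive payload transits the shared artifact store.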
@Bobgy OK, so we lose KFP's helpful automatic piping of real data, but the data stays secure. The only meaningful downside I see is that everyone has to teach their components how to talk to their blob storage rather than offloading that to reusable blob-put/blob-get components. That's a fair compromise.
You're right about minio multi-tenancy (I work in one at the moment). I'll ask around for ideas.
@ca-scribner I think the minio "multi-tenant" setup is slightly different from what we're doing; I believe we're using OPA or Istio magic (or something similar) to give every namespace a private bucket on a single tenant. (We do have minimal vs. premium tenants, but that's different.) I think the term "tenant" is a bit overloaded here.
@jackwhelpton Yes, the features you described are already there. They aren't mentioned in the docs just because they work seamlessly.
Hi @Bobgy, we're hoping to get more clarity on multi-tenancy and the expected behavior. When you say "seamlessly", does that mean Kubeflow will natively assign new experiments to the user's namespace as long as the headers are passed correctly, or do we need to add more components to our pipeline configuration to get experiments running under the user's namespace?
The reason I'm asking is that we're currently seeing the following message in our ml-pipeline-scheduledworkflow logs:
time="2020-07-21T06:34:19Z" level=info msg="Processing object (inception-v3-transfer-hq5zv): object has no owner." Workflow=inception-v3-transfer-hq5zv
@RoyerRamirez Yes, experiments will be assigned to the user's namespace (the namespace selected in the Kubeflow dashboard), and actions will be authorized based on the user's identity header.
The reason I'm asking is that we're currently seeing the following message in our ml-pipeline-scheduledworkflow logs: time="2020-07-21T06:34:19Z" level=info msg="Processing object (inception-v3-transfer-hq5zv): object has no owner." Workflow=inception-v3-transfer-hq5zv
Can you open a separate issue describing how you deployed and what problems you encountered?
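As a side note on the identity header mentioned above, the sketch below is an illustration (not the real client code) of roughly what a namespace-scoped experiments request carrying a user identity header looks like. The host, header name (kubeflow-userid is common behind an Istio-fronted deployment), user, and namespace are all assumptions; no network call is made here.

```python
import urllib.request

# Construct (but do not send) a request against the KFP v1beta1 experiments
# API, scoped to a namespace and carrying the identity header that the
# auth layer would normally inject. All concrete values are hypothetical.
req = urllib.request.Request(
    "http://ml-pipeline.kubeflow.svc.cluster.local:8888/apis/v1beta1/experiments"
    "?resource_reference_key.type=NAMESPACE"
    "&resource_reference_key.id=my-namespace",
    headers={"kubeflow-userid": "alice@example.com"},
)

# Show the request shape: URL plus the identity header the API server
# would authorize the namespaced action against.
print(req.full_url)
print(req.header_items())
```

In a real deployment the gateway sets this header from the authenticated session; clients inside the mesh should not need to forge it by hand.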
@Bobgy
A quick list of things for which we don't support multi-user separation in the upcoming KF 1.1 release:
- pipeline resources (the static yaml/tar files you upload)
- minio artifact storage
- MLMD
Any plans for MLMD? Are you talking about aggregation, i.e. only reading artifacts/executions belonging to KFP resources visible from the user's namespace? Or native isolation on the MLMD side? I think the MLMD schema currently doesn't provide any concept of users.
@Jeffwan Yes, your understanding is correct. So far I'm not aware of any plan for MLMD multi-user separation.
/cc @neuromage @dushyanthsc Is there anything you can share about this?
@Jeffwan @Bobgy Based on the initial documents that Karl shared as part of the Model Management group, MLMD was going to support a "Project" context, or at least the ability to create such a context. This project context could be tied to the user's Profile and provide the necessary isolation for metadata.
@maganaluis Hmm, it seems that proposal removes context and brings in project, product, and workflow. Has the proposal been reviewed by the MLMD team? I feel this is a big schema change, and projects like TFX would need to buy into it, which may take some time. In the meantime, as a short-term solution, we could group artifacts/executions by the user's pipeline runs, as @Bobgy originally proposed. Currently I think only KFP uses the metadata service, so it's fairly safe to do it that way.
@maganaluis I think @karlschriek's doc is just a proposal, so it might change. In my discussions with @neuromage we talked about using labels to group metadata, so "project", "experiment", etc. might just be user-defined labels. As such, they probably wouldn't be closely tied to multi-user support.
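To illustrate the label idea being discussed, here is a hypothetical in-memory sketch of label-based filtering. MLMD's real schema has no such label field today (as noted in this thread); the Artifact class, visible_to helper, and label keys are all invented for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Artifact:
    """Toy stand-in for an MLMD artifact that carries free-form labels."""
    uri: str
    labels: Dict[str, str] = field(default_factory=dict)

def visible_to(artifacts: List[Artifact], namespace: str) -> List[Artifact]:
    """Return only artifacts labeled with the caller's namespace."""
    return [a for a in artifacts if a.labels.get("namespace") == namespace]

# Hypothetical store contents; "project" shows a second user-defined label.
store = [
    Artifact("gs://bucket/a/model", {"namespace": "team-a", "project": "vision"}),
    Artifact("gs://bucket/b/model", {"namespace": "team-b"}),
]
print([a.uri for a in visible_to(store, "team-a")])  # -> ['gs://bucket/a/model']
```

Note this is only a visibility filter at the query layer, not true isolation: anyone with direct access to the store could still read everything, which is why the thread distinguishes label grouping from native MLMD multi-user support.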
Hi, we have no current plans to add multi-user support directly in MLMD. As you point out, there is unfortunately no support for users in the MLMD schemas right now. It would be worth exploring the use cases for multi-user MLMD to figure out the right approach.
KFP multi-user support shipped in KF 1.1. I suggest closing this issue and opening more actionable, scoped issues for further improvements.
/close
@jlewi: Closing this issue.
[April 6, 2020] The latest design is in https://docs.google.com/document/d/1R9bj1uI0As6umCTZ2mv_6_tjgFshIKxkSt00QLYjNV4/edit?ts=5e4d8fbb#heading=h.5s8rbufek1ax
Areas we are working on:
Release
Areas related to integration with Kubeflow
=============== original description
Some users have expressed interest in isolation between the cluster admin and cluster users: the cluster admin deploys Kubeflow Pipelines as part of Kubeflow in the cluster, while cluster users can use Kubeflow Pipelines functionality without being able to access the control plane.
Here are the steps to support this functionality.