opendatahub-io / architecture-decision-records

Collection of Architectural Decision Records
Apache License 2.0

ADR-0005: Introduce a controller for Data Science Projects #25

Closed VaishnaviHire closed 4 months ago

VaishnaviHire commented 10 months ago

cc: @zdtsw @LaVLaS @andrewballantyne

zdtsw commented 10 months ago

does this mean, we will move the logic out of dashboard and into the new namespace controller instead?

bartoszmajsak commented 10 months ago

Interestingly enough I was talking with @aslakknutsen about moving some bits of https://github.com/maistra/odh-project-controller/ to the opendatahub-operator.

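We created this controller to support feature-flagged service mesh in DSPs, so it takes care of

  • enrolling namespaces created through the dashboard in the service mesh
  • adding authz rules
  • exposing certain settings through ConfigMaps so they can be reused across other controllers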

I am happy to port this code instead of maintaining it as a separate one, but I wonder if we are on the same page with the scope here.

bartoszmajsak commented 10 months ago

This change is essential to include cert bundles in every Data Science Project

Is the introduction of something like https://cert-manager.io/ on the roadmap?

VaishnaviHire commented 10 months ago

Is the introduction of something like https://cert-manager.io/ on the roadmap?

Not yet. https://issues.redhat.com/browse/RHOAIENG-1926 describes the scope of the feature.

Regarding the project controller:

Interestingly enough I was talking with @aslakknutsen about moving some bits of https://github.com/maistra/odh-project-controller/ to the opendatahub-operator.

We created this controller to support feature-flagged service mesh in DSPs, so it takes care of

  • enrolling namespaces created through the dashboard in the service mesh
  • adding authz rules
  • exposing certain settings through ConfigMaps so they can be reused across other controllers

I am happy to port this code instead of maintaining it as a separate one, but I wonder if we are on the same page with the scope here.

Yes, I see this as one of the use cases for the controller. For the initial implementation, the idea is to identify which namespaces are set as Data Science Projects and add a custom ConfigMap to them. We can later extend the controller to include the above configuration for service mesh.

VaishnaviHire commented 10 months ago

does this mean, we will move the logic out of dashboard and into the new namespace controller instead?

Eventually, but the immediate goal for this controller is to ensure that we can identify namespaces set as Data Science Projects and can manage and deploy resources to those namespaces.
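As a rough illustration of that immediate goal, the identification step could be sketched as a simple label check. This is a minimal sketch, not the operator's actual code: the thread does not say which label marks a Data Science Project, so the `opendatahub.io/dashboard` key and the `Namespace` type below are assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Assumption: illustrative label key; the thread does not name the label
# that marks a namespace as a Data Science Project.
DS_PROJECT_LABEL = "opendatahub.io/dashboard"


@dataclass
class Namespace:
    """Hypothetical stand-in for a Kubernetes Namespace object."""
    name: str
    labels: dict[str, str] = field(default_factory=dict)


def is_ds_project(ns: Namespace) -> bool:
    """Return True when the namespace carries the (assumed) DS Project label."""
    return ns.labels.get(DS_PROJECT_LABEL) == "true"


def select_ds_projects(namespaces: list[Namespace]) -> list[str]:
    """Names of the namespaces the controller would manage, sorted."""
    return sorted(ns.name for ns in namespaces if is_ds_project(ns))


namespaces = [
    Namespace("fraud-detection", {DS_PROJECT_LABEL: "true"}),
    Namespace("kube-system"),
    Namespace("sandbox", {DS_PROJECT_LABEL: "false"}),
]
print(select_ds_projects(namespaces))
```

A real controller would watch namespace events rather than filter a list, but the selection rule would be the same kind of predicate.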

LaVLaS commented 10 months ago

There is very little detail in the why here. Especially for what impact it has on the end users.

The goal is to extend lifecycle support to ODH infrastructure deployed in DS-Projects without impacting the user.

When I create a new project today, I'm not aware of any secrets or config maps that need to be created for a project to function correctly, so what is this controller going to solve?

This is the reason for the controller: right now the Dashboard application is responsible for creating any secrets or ConfigMaps required for a DS project to function correctly.

Another consideration is that the dashboard now allows you to work in projects that do not have the ODH label that makes them a data science project. How will those projects be handled?

There is no intent to restrict the view to DS-Projects with ODH labels, but if ODH has a presence in those DS-Projects we need a mechanism to watch and manage any core ODH infrastructure in those projects.

My suspicion is that whatever you want to do should probably be handled by the existing controllers. For example, if you want to set up a secret or ConfigMap for Data Science Pipelines, that secret or ConfigMap should really just be created by the DSPA instead.

Agreed. This does not replace the setup that is owned by an individual component

If you want to create something that is shared by multiple controllers, we need to better understand the dependencies between those controllers. For example, if the model registry is going to be dependent on DSPA, the model registry should be able to look at the DSPA and get the info it needs from it or from objects the DSPA manages. We should not create a new controller that both are now dependent on.

This is a good point, but this DS-Project controller would only be responsible for a shared dependency when ODH infrastructure is the owner of the dependency, such as a shared database or global ODH configuration.

strangiato commented 9 months ago

There are lots of great comments and details in this discussion thread about the goals of introducing a new controller, and I hope these make it back into the ADR.

This is the reason for the controller: right now the Dashboard application is responsible for creating any secrets or ConfigMaps required for a DS project to function correctly.

What secrets and ConfigMaps is the Dashboard responsible for creating today? I'm not aware of anything special that the dashboard creates when creating a project, but I haven't dug around to check. Providing concrete examples of those kinds of things in the ADR would be very helpful.

if ODH has a presence in those DS Projects we need a mechanism to watch and manage any core ODH infrastructure in those projects

Is "core ODH infrastructure" the controllers that are deployed in opendatahub today? Or are you talking about CRs managed by the controllers?

What do you want to watch that can't be watched already? The Notebook Controller can see Notebooks and the child objects it creates. DSP can see DSPAs and the different resources they manage. What does this proposed controller need to see that can't be seen by other controllers already?

Since we are not proposing a new CR then it would be a label for namespaces where ODH components are present

Is this going to be the existing label the Dashboard uses or a new label?

What happens if the proposed controller does see a project that has an ODH related object in it that doesn't have the label already? Will the controller add the label automatically?

What happens when the label is added to a project or a new project is created with the label? Does the controller need to create something in that namespace? What is the purpose of the thing that it creates?

shared Database

Sharing a database is an anti-pattern that should be avoided.

global ODH configuration

I would love to know more about what kinds of global configurations you are thinking about here. Presumably these would be things that need to be applied into each data science project somehow? What would prevent these from living in the opendatahub namespace?

LaVLaS commented 9 months ago

What happens if the proposed controller does see a project that has an ODH related object in it that doesn't have the label already? Will the controller add the label automatically? What happens when the label is added to a project or a new project is created with the label? Does the controller need to create something in that namespace? What is the purpose of the thing that it creates?

All good questions which we'll update in the ADR.

Is "core ODH infrastructure" the controllers that are deployed in opendatahub today? Or are you talking about CRs managed by the controllers?

Core ODH infrastructure is a general reference to common configs or objects that affect the successful deployment or integration between ODH components within a DS-Project namespace. As of right now, the only thing planned is self-signed certificate bundles that need to be replicated across DS-Project namespaces.
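To make the replication idea concrete, here is a minimal sketch of an idempotent sync, assuming the bundle is stored as a ConfigMap. The `odh-trusted-ca-bundle` name, the `ca-bundle.crt` key, and the in-memory `cluster` dictionary are hypothetical stand-ins for illustration; nothing in this thread specifies them.

```python
from __future__ import annotations

# Assumption: hypothetical ConfigMap name for the replicated bundle.
BUNDLE_NAME = "odh-trusted-ca-bundle"


def sync_bundle(
    source: dict[str, str],
    cluster: dict[str, dict[str, dict[str, str]]],
    ds_projects: list[str],
) -> list[str]:
    """Copy the source bundle into each DS-Project namespace.

    `cluster` maps namespace -> ConfigMap name -> data. The return value
    lists the namespaces that were actually changed, so a second run with
    no drift returns an empty list.
    """
    changed = []
    for ns in ds_projects:
        configmaps = cluster.setdefault(ns, {})
        if configmaps.get(BUNDLE_NAME) != source:
            # Create the bundle, or overwrite a copy that has drifted.
            configmaps[BUNDLE_NAME] = dict(source)
            changed.append(ns)
    return changed


source = {"ca-bundle.crt": "PEM data (placeholder)"}
cluster = {
    "project-a": {},
    "project-b": {BUNDLE_NAME: {"ca-bundle.crt": "stale"}},
}
first = sync_bundle(source, cluster, ["project-a", "project-b"])
second = sync_bundle(source, cluster, ["project-a", "project-b"])
print(first, second)
```

Comparing before writing keeps the reconcile loop quiet when nothing has drifted, which matters once the controller is re-triggered on every namespace event.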

What does this proposed controller need to see that it can't be seen by other controllers already?

This is a mechanism to provide any ODH global configurations that may need to be in place within DS-Projects before any dependent components are deployed to a namespace.

What do you want to watch that can't be watched already?

DS-Project namespaces are not currently watched.

Sharing a database is an anti-pattern that should be avoided.

It is an example of a shared service in a DS-Project that two or more components may need but whose lifecycle neither one owns. Presently, no ODH entity other than the Dashboard can ensure that shared services or namespace configurations are applied correctly.

I would love to know more about what kinds of global configurations you are thinking about here. Presumably these would be things that need to be applied into each data science project somehow? What would prevent these from living in the opendatahub namespace?

DS-Projects are for namespaced objects that are restricted to a namespace. That example would assume that a DS-Project user or service has access to the opendatahub namespace

github-actions[bot] commented 4 months ago

This PR is stale because it has been open 21 days with no activity. Remove stale label or comment or this will be closed in 7 days.

github-actions[bot] commented 4 months ago

This PR was closed because it has been stale for 21+7 days with no activity.