opendatahub-io / architecture-decision-records

Collection of Architectural Decision Records
Apache License 2.0

ODH-ADR-Operator-0004-replica-image-support #23

Closed AjayJagan closed 2 months ago

AjayJagan commented 10 months ago

Update DSC CRD with new customization values.

Description

Upgrade the DSC CRD to support customizing the replica count, images and modifying the resource limits/requests of individual components.

AjayJagan commented 10 months ago

@andrewballantyne , some questions

  1. If we are going to use HPA, then should we add it as a resource along with the other kustomize manifests? (the odh-manifests repo)
  2. Should this be configured for each component?
  3. If HPA is configured, then users need not worry about the replica count, right? So there is no need to add custom replica functionality? 😅 (Please correct me if I am wrong)
AjayJagan commented 10 months ago

@andrewballantyne , some questions

  1. If we are going to use HPA, then should we add it as a resource along with the other kustomize manifests? (the odh-manifests repo)
  2. Should this be configured for each component?
  3. If HPA is configured, then users need not worry about the replica count, right? So there is no need to add custom replica functionality? 😅 (Please correct me if I am wrong)

As @VaishnaviHire says, can we also move the HPA configurations to devFlags? And instead of directly modifying the replica count, can we plan to modify the minReplicas and maxReplicas? @ykaliuta ^
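A minimal sketch of the kind of HPA being discussed, assuming the standard autoscaling/v2 API; the Deployment name, namespace and thresholds below are illustrative, not the actual ODH manifests:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: odh-dashboard          # illustrative
  namespace: opendatahub       # illustrative
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: odh-dashboard        # illustrative target Deployment
  minReplicas: 2               # the knobs discussed above
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80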

VaishnaviHire commented 10 months ago

@andrewballantyne , some questions

  1. If we are going to use HPA, then should we add it as a resource along with the other kustomize manifests? (the odh-manifests repo)
  2. Should this be configured for each component?
  3. If HPA is configured, then users need not worry about the replica count, right? So there is no need to add custom replica functionality? 😅 (Please correct me if I am wrong)

As @VaishnaviHire says, can we also move the HPA configurations to devFlags? And instead of directly modifying the replica count, can we plan to modify the minReplicas and maxReplicas? @ykaliuta ^

I think HPA configurations/replicas can be updated by users, so they should not be included in the devFlags, just the images

AjayJagan commented 10 months ago

@andrewballantyne , some questions

  1. If we are going to use HPA, then should we add it as a resource along with the other kustomize manifests? (the odh-manifests repo)
  2. Should this be configured for each component?
  3. If HPA is configured, then users need not worry about the replica count, right? So there is no need to add custom replica functionality? 😅 (Please correct me if I am wrong)

As @VaishnaviHire says, can we also move the HPA configurations to devFlags? And instead of directly modifying the replica count, can we plan to modify the minReplicas and maxReplicas? @ykaliuta ^

I think HPA configurations/replicas can be updated by users, so they should not be included in the devFlags, just the images

understood 👍

andrewballantyne commented 10 months ago

@andrewballantyne , some questions

  1. If we are going to use HPA, then should we add it as a resource along with the other kustomize manifests? (the odh-manifests repo)
  2. Should this be configured for each component?
  3. If HPA is configured, then users need not worry about the replica count, right? So there is no need to add custom replica functionality? 😅 (Please correct me if I am wrong)

So let us start at the top level... HPA will be extra work to configure (over just Replicas). If we ever intend to support non-HPA deployments (maybe the backend teams don't want HPA?) then Replicas sound like something we will need to support -- so maybe we just start there until we understand the HPA edges more.

HPA is (imo) a needed feature for the Dashboard & I imagine every component (but probably used less by many). The Dashboard can get hit by 1000s of users at the same time and our pod needs to scale for that. If 10 people use it at the same time, there is no need to spend the resources putting up several replicas. This can always be done through pure replicas, as determined by the Installing User (e.g. "I expect 1000s of users concurrently using it" vs "I intend for only 10 to be using it at any point in time" -- which would be several replicas or just one).

I'm worried at this point that HPA will be entirely dictated by me for the Dashboard needs, and I think we need to avoid a single component owner driving this. Maybe this is an AC topic for a future enhancement, and we go with Replicas for now.

Thoughts @VaishnaviHire @AjayJagan ?


EDIT: Let me answer your questions in case you guys do want to go ahead with HPA.

  1. The Operator will definitely need to entirely manage this; no implementation should come from any component, this is outside of their world
  2. Yes
  3. That's my hope -- but like I noted in the block above, does everyone want to be HPA-enabled?
AjayJagan commented 10 months ago

@andrewballantyne , some questions

  1. If we are going to use HPA, then should we add it as a resource along with the other kustomize manifests? (the odh-manifests repo)
  2. Should this be configured for each component?
  3. If HPA is configured, then users need not worry about the replica count, right? So there is no need to add custom replica functionality? 😅 (Please correct me if I am wrong)

So let us start at the top level... HPA will be extra work to configure (over just Replicas). If we ever intend to support non-HPA deployments (maybe the backend teams don't want HPA?) then Replicas sound like something we will need to support -- so maybe we just start there until we understand the HPA edges more.

HPA is (imo) a needed feature for the Dashboard & I imagine every component (but probably used less by many). The Dashboard can get hit by 1000s of users at the same time and our pod needs to scale for that. If 10 people use it at the same time, there is no need to spend the resources putting up several replicas. This can always be done through pure replicas, as determined by the Installing User (e.g. "I expect 1000s of users concurrently using it" vs "I intend for only 10 to be using it at any point in time" -- which would be several replicas or just one).

I'm worried at this point that HPA will be entirely dictated by me for the Dashboard needs, and I think we need to avoid a single component owner driving this. Maybe this is an AC topic for a future enhancement, and we go with Replicas for now.

Thoughts @VaishnaviHire @AjayJagan ?

EDIT: Let me answer your questions in case you guys do want to go ahead with HPA.

  1. The Operator will definitely need to entirely manage this; no implementation should come from any component, this is outside of their world
  2. Yes
  3. That's my hope -- but like I noted in the block above, does everyone want to be HPA-enabled?

Looking at it from your perspective, we may need to draw a line between frontend and backend apps (for using HPA). So if you ask me, let's do it iteratively: the first version of this can be replicas, and maybe we only provide HPA configurability for the dashboard (or, if other teams require this, we can extend the HPA functionality) -- these are just my thoughts. Any new idea is welcome 😄. And thanks for the detailed explanation.
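For context on question 1 above (shipping the HPA as a kustomize resource next to the component's other manifests), a minimal sketch of what that could look like; the file names are illustrative, not the actual odh-manifests layout:

# kustomization.yaml (illustrative)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - deployment.yaml   # existing component manifests
  - service.yaml
  - hpa.yaml          # the HPA manifest added alongside them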

jwforres commented 9 months ago

There is currently some context missing for the "why" behind changing the number of replicas. Many of the components that are installed, such as all the controllers, only need 2 replicas, so there is an active and a passive that can immediately take over if the active were to fail. These would never be scaled up beyond 2. If the request is to scale these down to 1 to reduce deployment size for dev purposes, that sounds like devFlags.

If we are talking about applications like the dashboard that are handling http requests, and we want the customers to be able to tune the number of available replicas, then a field on the API that customers could use is reasonable. If that is the goal of the replica changes, then we need to engage the component teams to know which components we actually want, or are able, to support that customization for. If we put it on the API, it is a supported feature.

andrewballantyne commented 9 months ago

@jwforres Perhaps this is just the inefficiency of the Dashboard pod 🤔 Does no backend component ever need more pods if it gets direct API calls in the 1000s concurrently? I may be off base here then.

Replicas for the Dashboard, at least, are 5 by default for RHOAI, to support perf requests. ODH is 2 by default. Are we the only RHOAI deployment that needs to be controlled based on the number of users?

AjayJagan commented 9 months ago

So if I understand this correctly:
Front-end = configurable replica count (maybe not in the devFlags); in the future we can consider HPAs.
Back-end = configurable replica count only in the devFlags (maybe we can have a limit set there of max = 2).
Is this assumption right? @jwforres @andrewballantyne @VaishnaviHire ^
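A minimal sketch of that split, purely for illustration; the replicas fields and their placement below are assumptions, not the agreed API:

spec:
  components:
    dashboard:
      managementState: Managed
      replicas: 3        # front-end: hypothetical supported, user-facing knob
    modelmeshserving:
      managementState: Managed
      devFlags:
        replicas: 1      # back-end: hypothetical dev-only override (e.g. capped at 2)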

jwforres commented 9 months ago

I clarified in slack but will do so here as well - my comment about only 2 replicas was specific to controllers. Backend API servers may still require scaling beyond 2. With controllers only 1 replica is ever serving as the active one, so there is no benefit to going beyond having a single passive backup. That same situation isn't true with backend API servers that are serving requests.

AjayJagan commented 9 months ago

I clarified in slack but will do so here as well - my comment about only 2 replicas was specific to controllers. Backend API servers may still require scaling beyond 2. With controllers only 1 replica is ever serving as the active one, so there is no benefit to going beyond having a single passive backup. That same situation isn't true with backend API servers that are serving requests.

ok, now it's clear - the why... Thanks for the clarification 😄

bartoszmajsak commented 9 months ago

I understand that the main driver for this change is giving dev teams the ability to slim down deployments so they can be run in constrained clusters. If that is the case, I think this would be a great opportunity to rework devFlags and make them flexible. What I was thinking of is to make devFlags a *runtime.RawExtension field, so that each component can model settings fitting their needs by providing their own structs matching exactly the resources they wish to make configurable.
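To illustrate the kind of flexibility a *runtime.RawExtension-backed devFlags field would allow, a hypothetical payload follows; the keys under devFlags are made up for illustration and would be defined by each component, not by the operator:

modelmeshserving:
  managementState: Managed
  devFlags:
    # free-form, component-defined schema; the operator would pass this
    # through and the component would unmarshal it into its own struct
    odh-model-controller:
      image: quay.io/example/odh-model-controller:dev
      replicas: 1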

From a broader perspective, incorporating elements like replicas directly into the DSC API appears to be a somewhat leaky abstraction. This approach could lead to further additions of similar nature in the future, complicating the API. Additionally, the meaning of a parameter like replicas: 2 becomes ambiguous, particularly for components responsible for creating multiple Deployments (aside from controllers).

I think providing some reasonable defaults by means of a HorizontalPodAutoscaler and perhaps a PodDisruptionBudget, and letting users fine-tune them, might be a better option to start with if we consider some scaling/tuning efforts.
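A minimal sketch of such a default, assuming the standard policy/v1 PodDisruptionBudget API; the name and label selector are illustrative:

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: odh-dashboard        # illustrative
spec:
  minAvailable: 1            # keep at least one pod up during voluntary disruptions
  selector:
    matchLabels:
      app: odh-dashboard     # illustrative label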

vaibhavjainwiz commented 9 months ago

I very much agree with @bartoszmajsak's suggestion to expose *runtime.RawExtension, which could be easily customized to take different input.

I see this requirement as the start of various customizations that may need to be exposed for different components. Currently we maintain a single CRD, DataScienceCluster, where we are trying to capture every customization of every component. Soon it will become big and a nightmare to maintain.

Proposal: In addition to @bartoszmajsak's suggestion, I propose maintaining separate CRDs for each ODH component, with every component team designing theirs to fit their needs.

For example: in ModelMesh Serving, there is no single image that the user could just override. The KServe component runs 4 containers in a pod, so the user could provide a customized image for any of them. I am proposing to maintain something like the below:

ModelServingConfiguration.yaml

apiVersion: opendatahub.io/v1
kind: ModelServingConfiguration
metadata:
  name: default
spec:
  odh-model-controller:
    replica: 1
    image: quay.io/vajain/odh-model-controller:2.0
    manifests:
      contextDir: config
      sourcePath: ''
      uri: 'https://github.com/opendatahub-io/odh-model-controller/tarball/main'
  kserve:
    replica: 2
    kserve-agent:
      image: quay.io/vajain/kserve-agent:2.0
    kserve-storage-initializer:
      image: quay.io/vajain/kserve-storage-initializer:2.0

@danielezonca

vaibhavjainwiz commented 9 months ago

Any progress on this one?

AjayJagan commented 8 months ago

@VaishnaviHire @andrewballantyne @ykaliuta I have updated the ADR a bit. Let me know how it looks :)

AjayJagan commented 8 months ago

Any progress on this one?

Hi @vaibhavjainwiz, AFAIK the idea of separate CRDs can be put into a separate ADR 😅. wdyt?

strangiato commented 8 months ago

I'm catching up on this thread but wanted to weigh in on a couple of different things.

I'm 100% in alignment with @andrewballantyne on focusing on setting replicas for the dashboard first and looking at HPA afterwards. Replicas should be a much easier change, and HPAs are a nice-to-have. I would recommend considering how HPAs might be integrated into that at a later phase.

Like @andrewballantyne said, having the dashboard scaled to 5 replicas is great for 1000 active users, but most customers may have 10-15 users at any given time. On top of that, I would say that 95% of the time users are working with OpenShift AI they are working on things outside of the Dashboard (such as in a notebook or the OCP console), and only 5% of the time are they actively working in the Dashboard.

As for setting images, that was originally suggested before DevFlags existed. I don't expect it should be a "supported" feature for customers. I think that including it under the DevFlags is perfectly fine. I think that if customers want to deploy their own image, we probably need to better understand why they need their own image (e.g. adding custom certs to the image) and see about building that capability into the operator as a first class feature instead of just using their own image.

I have previously voiced some concerns around the DevFlags but I will say them again here. DevFlags seem like a bit of a kludgy way to say "this is supported and this isn't supported" instead of just documenting it. This creates problems: maybe down the road we do decide that custom images are supported for components. Do we need to change the API spec to make that policy change if it is currently located under devFlags?

For the controllers I will make the argument that they should NEVER have more than a single replica. Having a standby replica is a waste of resources and IMO should never have been introduced. Controllers are designed around eventual consistency and there is no guarantee of "response time" from a controller. So what is the risk of a controller crashing with a single pod? The resource you are attempting to deploy takes 45 seconds instead of 30 seconds to deploy? The controller going down doesn't cause the resource it has deployed to stop working. So your model server API endpoint still functions if the modelmesh-controller goes down; it just won't get updated if the resource definition is modified until the pod comes back up.

So in my ideal solution, all of the controllers are 1 replica by default and there is no need to even add the replicas feature outside of the dashboard. If a component team insists that they need an HA deployment of the controller, I need to have the ability to set it to 1 replica for a Single Node Deployment or one where we have some resource sensitivity.

Regarding the ModelServingConfiguration idea... What value does having this as a separate object have over doing it in the existing DSC? It is still a one-to-one relationship between the DSC and ModelServingConfiguration, so it appears to me to just be moving the configuration from one object to another. I would prefer to keep the configuration of these types of items consistent across all components and keep them in a single location. If we want to move to a unique object per component, we should do away with the DSC entirely and only have unique objects for each component. Mixing the two is just going to cause confusion on "what should I set where?".

My ideal configuration of the DSC would look something like this:

apiVersion: datasciencecluster.opendatahub.io/v1
kind: DataScienceCluster
metadata:
  name: default
spec:
  components:
    codeflare:
      managementState: Removed
    dashboard:
      managementState: Managed
      dashboard: <1>
        autoscale: <2>
          enabled: true
          hpa:
            minReplicas: 1
            maxReplicas: 10
            metrics:
              - type: Resource
                resource:
                  name: cpu
                  target:
                    type: Utilization
                    averageUtilization: 50
    datasciencepipelines:
      managementState: Managed
      data-science-pipelines-operator-controller-manager:
        resources:
          requests:
            cpu: 100m
            memory: 128Mi
          limits:
            cpu: 200m
            memory: 256Mi
    kserve:
      managementState: Removed
    modelmeshserving:
      managementState: Managed
      odh-model-controller: <3>
        replicas: 1
        image: quay.io/vajain/odh-model-controller:2.0 <4>
      modelmesh-controller:
        replicas: 1
      devFlags:
        manifests:
          contextDir: config
          sourcePath: ''
          uri: 'https://github.com/opendatahub-io/odh-model-controller/tarball/main'
    ray:
      managementState: Removed
    workbenches:
      managementState: Managed

  1. The dashboard component today only deploys a single set of pods, but by setting these options under spec.dashboard.dashboard we give ourselves more flexibility if the Dashboard chooses to add additional pods in the future as part of the Dashboard component.
  2. This autoscaling configuration is based on how ArgoCD handles setting up autoscaling for its components. I know RH has an aversion to "true/false" values in the CRD, so that could be changed, or maybe even just the presence of the autoscaling/hpa sections is enough to infer that we want it enabled. HPA could potentially be moved up to spec.dashboard.dashboard.hpa, but I think keeping it under autoscaling probably gives us more flexibility in the long run. For example, the controllers may benefit from having an option to set up VPA in the future.
  3. The modelmeshserving component is a good example of something that deploys multiple sets of pods, with odh-model-controller and modelmesh-controller. We need to be able to control these independently, so having the pattern of spec.<component>.<deployment> like with the dashboard makes a lot of sense here.
  4. Including the image here was always what I imagined before the devFlag idea was introduced. We could set this same value under spec.modelmeshserving.devFlags.odh-model-controller.image as well, but it seems like a lot of redundant structure in the object. I don't think anyone will question that we aren't going to support a custom image that they built themselves, though.

vaibhavjainwiz commented 8 months ago

@strangiato

If we want ant move to a unique object per component we should do away with the DSC entirely and only have unique objects for each component. Mixing the two is just going to cause confusion on "what should I set where?".

The same thought came to me as well, to completely remove the DSC and only define an independent object for each component, but earlier I held back because it would be a big architectural change. But I believe this is the cleaner way of doing it. WDYT?

zdtsw commented 8 months ago

I have been thinking about this part:

modelmeshserving:
      managementState: Managed
      odh-model-controller: <3>
        replicas: 1
        image: quay.io/vajain/odh-model-controller:2.0 <4>
      modelmesh-controller:
        replicas: 1
      devFlags:
        manifests:
          contextDir: config
          sourcePath: ''
          uri: 'https://github.com/opendatahub-io/odh-model-controller/tarball/main'

If users prefer using devFlags, why would they set the 2 deployments odh-model-controller and modelmesh-controller? It would be much easier to configure whatever they want (image, replica, resources, etc.) in the remote tarball in one go.

So the question is: if the difficulty of using devFlags becomes a hindrance that starts to introduce other options, then maybe we should not even keep devFlags and instead have only ".deployments." (deploymentX will be provided in an official way so the users know exactly what to put there), something like:

modelmeshserving:
  managementState: Managed
  deployments:
    odh-model-controller:   # needs to match the exact name of what is deployed in the cluster, or it won't work
      image: quay.io/my/image1:tag
      resources:
        requests:
          cpu: 100m
          memory: 128Mi
        limits:
          cpu: 200m
          memory: 256Mi
    modelmesh-controller:   # if the deploymentX does not contain "controller" then the replica value will be valid
      image: quay.io/my/image2:tag
      replica: 2            # here it will fall back to 1 since it is a controller
      .....
AjayJagan commented 8 months ago

I have been thinking about this part:

modelmeshserving:
      managementState: Managed
      odh-model-controller: <3>
        replicas: 1
        image: quay.io/vajain/odh-model-controller:2.0 <4>
      modelmesh-controller:
        replicas: 1
      devFlags:
        manifests:
          contextDir: config
          sourcePath: ''
          uri: 'https://github.com/opendatahub-io/odh-model-controller/tarball/main'

If users prefer using devFlags, why would they set the 2 deployments odh-model-controller and modelmesh-controller? It would be much easier to configure whatever they want (image, replica, resources, etc.) in the remote tarball in one go.

So the question is: if the difficulty of using devFlags becomes a hindrance that starts to introduce other options, then maybe we should not even keep devFlags and instead have only ".deployments." (deploymentX will be provided in an official way so the users know exactly what to put there), something like:

modelmeshserving:
  managementState: Managed
  deployments:
    odh-model-controller:   # needs to match the exact name of what is deployed in the cluster, or it won't work
      image: quay.io/my/image1:tag
      resources:
        requests:
          cpu: 100m
          memory: 128Mi
        limits:
          cpu: 200m
          memory: 256Mi
    modelmesh-controller:   # if the deploymentX does not contain "controller" then the replica value will be valid
      image: quay.io/my/image2:tag
      replica: 2            # here it will fall back to 1 since it is a controller
      .....

@andrewballantyne @etirelli thoughts on this?

andrewballantyne commented 8 months ago

We need the tarball manifest uri -- this is key to overriding everything. The image is a QoL addition, as a frequent use-case is "I want everything standard, but I want my PR's custom image, or some feature-built image"... otherwise you have to fork and make a new manifest file. This is more of a hindrance for nightly images and things that are quickly updated than it is for pinned releases and updates.

I agree with Trevor, we should probably not look to bloat it up too much -- but as I noted in the AC, devFlags have a lot of value and removing them is not a great idea imo. Even if we don't want to expand it, it is a flexible mechanism, so it won't be a blocker going forward.

etirelli commented 8 months ago

@AjayJagan I agree with Trevor's points, except for having the image outside of devFlags. The purpose of devFlags is to allow developers like us to have the flexibility of overriding settings during the development cycle, without confusing users about what they should or should not use. Overriding an image is not something I expect users should ever do.

AjayJagan commented 8 months ago

@andrewballantyne @etirelli thanks for clarifying again. So we stick to the devFlags idea and maybe iteratively incorporate Trevor's idea (apart from the images)... wdyt @zdtsw?

zdtsw commented 8 months ago

@andrewballantyne @etirelli thanks for clarifying again. So we stick to the devFlags idea and maybe iteratively incorporate Trevor's idea (apart from the images)... wdyt @zdtsw?

If I understand you correctly @AjayJagan: 2 valid configurations for devFlags? (example 1)

modelmeshserving:
  managementState: Managed
  devFlags:
    odh-model-controller:   # needs to match the exact name of what is deployed in the cluster, or it won't work
      image: quay.io/my/image1:tag
      resources:
        requests:
          cpu: 100m
          memory: 128Mi
        limits:
          cpu: 200m
          memory: 256Mi
    modelmesh-controller:   # if the deploymentX does not contain "controller" then the replica value will be valid
      image: quay.io/my/image2:tag
      replica: 2            # here it will fall back to 1 since it is a controller
      .....

and example 2 (what we already have):

modelmeshserving:
      managementState: Managed
      devFlags:
        manifests:
          contextDir: config
          sourcePath: ''
          uri: 'https://github.com/opendatahub-io/odh-model-controller/tarball/main'
AjayJagan commented 8 months ago

@andrewballantyne @etirelli thanks for clarifying again. So we stick to the devFlags idea and maybe iteratively incorporate Trevor's idea (apart from the images)... wdyt @zdtsw?

If I understand you correctly @AjayJagan: 2 valid configurations for devFlags? (example 1)

modelmeshserving:
  managementState: Managed
  devFlags:
    odh-model-controller:   # needs to match the exact name of what is deployed in the cluster, or it won't work
      image: quay.io/my/image1:tag
      resources:
        requests:
          cpu: 100m
          memory: 128Mi
        limits:
          cpu: 200m
          memory: 256Mi
    modelmesh-controller:   # if the deploymentX does not contain "controller" then the replica value will be valid
      image: quay.io/my/image2:tag
      replica: 2            # here it will fall back to 1 since it is a controller
      .....

and example 2 (what we already have):

modelmeshserving:
      managementState: Managed
      devFlags:
        manifests:
          contextDir: config
          sourcePath: ''
          uri: 'https://github.com/opendatahub-io/odh-model-controller/tarball/main'

So the resources and replicas are planned to be moved out of devFlags. Only the image field will remain as part of the devFlags. And I think there should be 2 configurations: 1) the manifest URI, and 2) the image (direct update).

Also, as Trevor proposed, we can have individual sub-component names under each component, and each sub-component can have its own config. E.g.:

managementState: Managed
odh-model-controller:
  resources:
    requests:
      cpu: 100m
      memory: 128Mi
    limits:
      cpu: 200m
      memory: 256Mi
  devFlags:
    image: quay.io/my/image1:tag
modelmesh-controller:
  replica: 2
  devFlags:
    image: quay.io/my/image2:tag
  .....

This way, if new components are added in the future, we can assert fine-grained control over each sub-component.

We can make the images a key-value map, like odh-model-controller: quay.io/my/image1:tag, so that it can be configured in a single devFlag... I like this idea more :) E.g.:

managementState: Managed
odh-model-controller:
  resources:
    requests:
      cpu: 100m
      memory: 128Mi
    limits:
      cpu: 200m
      memory: 256Mi
modelmesh-controller:
  replica: 2
devFlags:
  images:
    odh-model-controller: quay.io/image1:tag1
    modelmesh-controller: quay.io/image2:tag2

@zdtsw I need your help here. WDYT?

vaibhavjainwiz commented 8 months ago

Could we please create a separate ADR to follow up on the images part?

vaibhavjainwiz commented 8 months ago

In our last OpenShift AI - Architecture Council meeting we had some consensus that maintaining separate CRDs for each component would be a viable way to tackle the current challenges. So, to avoid polluting the current ADR, I have started a discussion thread on Slack to raise this point: https://redhat-internal.slack.com/archives/C05KDB2HFQQ/p1705989784479989

bartoszmajsak commented 8 months ago

In our last OpenShift AI - Architecture Council meeting we had some consensus that maintaining separate CRDs for each component would be a viable way to tackle the current challenges. So, to avoid polluting the current ADR, I have started a discussion thread on Slack to raise this point: redhat-internal.slack.com/archives/C05KDB2HFQQ/p1705989784479989

It would be great to use e.g. GitHub discussions instead of internal channels if we want to keep this open to the community.

vaibhavjainwiz commented 8 months ago

I have raised an ADR to propose the separate CRD idea. I believe the changes requested in the ODH-ADR-Operator-0004-replica-image-support ADR are related to it.

AjayJagan commented 7 months ago

@LaVLaS I have updated the ADR with the changes, having a 1:1 mapping between the sub-component names & the deployments. Please let me know if this looks fine :)

VaishnaviHire commented 6 months ago

Splitting this ADR to get more focused reviews for configuring resources - https://github.com/opendatahub-io/architecture-decision-records/pull/32

github-actions[bot] commented 2 months ago

This PR is stale because it has been open 21 days with no activity. Remove stale label or comment or this will be closed in 7 days.

github-actions[bot] commented 2 months ago

This PR was closed because it has been stale for 21+7 days with no activity.