SeldonIO / seldon-core

An MLOps framework to package, deploy, monitor and manage thousands of production machine learning models
https://www.seldon.io/tech/products/core/

Add tolerations and nodeSelector to seldon-core-operator helm chart #5958

Open yavorivanov-cw opened 3 weeks ago

yavorivanov-cw commented 3 weeks ago

We are currently using seldon-core-operator and would like to add tolerations and nodeSelector to the seldon-controller-manager.

We are using the chart like this:

helm upgrade --install seldon-core seldon-core-operator \
   --repo https://storage.googleapis.com/seldon-charts \
   --set usageMetrics.enabled=false \
   --set istio.enabled=false \
   --create-namespace \
   --namespace seldon-system

matthewlowdon commented 1 week ago

Hi @yavorivanov-cw, thanks for reaching out! We'd love to understand more about your specific needs so we can help out.

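Could you share a bit more about the context for your request? For example:

* The exact tolerations and nodeSelector settings you're looking to add?

* What's prompting the need for these configurations in your environment?

* Any challenges you're facing with the current setup?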

Knowing more will help us figure out the best way to support you. It may be that Seldon Core 2 has features that fit your requirements.

yavorivanov-cw commented 1 week ago

> Hi @yavorivanov-cw, thanks for reaching out! We'd love to understand more about your specific needs so we can help out.
>
> Could you share a bit more about the context for your request? For example:
>
> * The exact tolerations and nodeSelector settings you're looking to add?
> * What's prompting the need for these configurations in your environment?
> * Any challenges you're facing with the current setup?
>
> Knowing more will help us figure out the best way to support you. It may be that Seldon Core 2 has features that fit your requirements.

Hi @matthewlowdon and thanks for responding.

We would like to have the following tolerations and affinity set:

  tolerations:
  - key: "CriticalAddonsOnly"
    operator: "Equal"
    value: "true"
    effect: "NoSchedule"
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: preferred-workload
            operator: NotIn
            values:
              - models
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 80
          preference:
            matchExpressions:
            - key: kubernetes.azure.com/agentpool
              operator: In
              values:
                - system

We try to schedule all of our core components (such as Seldon) on a dedicated node pool. The core components can still run on other nodes, but the affinity should make our setup more stable. The current setup works, but we would like the flexibility to tell the scheduler where the Seldon controller manager should run.

matthewlowdon commented 3 days ago

Hi @yavorivanov-cw, thanks for your reply. I believe in this case it would be beneficial for you to adopt Core 2, which was designed to support this type of configuration.

You can find more information here: https://docs.seldon.ai/seldon-core-2/resource-allocation
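
For readers who stay on the seldon-core-operator chart, one possible workaround is to patch the Deployment after the Helm install. This is a minimal sketch, not a feature of the chart. It assumes the Deployment is named seldon-controller-manager (the name used in this issue) and runs in the seldon-system namespace from the helm command above. The file name scheduling-patch.yaml and the APPLY_PATCH variable are hypothetical names chosen for illustration.

```shell
# Hypothetical file name for the patch; override with PATCH_FILE if needed.
PATCH_FILE="${PATCH_FILE:-scheduling-patch.yaml}"

# Write a strategic merge patch holding the tolerations and affinity
# requested in this issue, nested under the pod template spec.
cat > "$PATCH_FILE" <<'EOF'
spec:
  template:
    spec:
      tolerations:
      - key: "CriticalAddonsOnly"
        operator: "Equal"
        value: "true"
        effect: "NoSchedule"
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: preferred-workload
                operator: NotIn
                values:
                - models
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 80
            preference:
              matchExpressions:
              - key: kubernetes.azure.com/agentpool
                operator: In
                values:
                - system
EOF

# Apply the patch only when explicitly requested (APPLY_PATCH is a
# hypothetical guard so the file can be reviewed first).
# Assumed names: deployment seldon-controller-manager, namespace seldon-system.
if [ "${APPLY_PATCH:-0}" = "1" ]; then
  kubectl patch deployment seldon-controller-manager \
    --namespace seldon-system \
    --patch-file "$PATCH_FILE"
fi
```

Running the block with APPLY_PATCH=1 applies the patch to the cluster; without it, only the patch file is written. Changes made this way live outside the chart, so they may need to be reapplied after upgrades.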