openfaas / faas-netes

Serverless Functions For Kubernetes
https://www.openfaas.com
MIT License

Allow setting tolerations on function pods #358

Closed manuelrombach closed 4 years ago

manuelrombach commented 5 years ago

In Kubernetes, nodes can be tainted so that they only accept pods with matching tolerations. It is currently not possible to set tolerations on function pods, so these pods cannot be scheduled on tainted nodes.

Expected Behaviour

It should be possible to schedule function pods on tainted nodes.

Current Behaviour

It is not possible to set tolerations on function pods.

Possible Solution

Tolerations could be set in the stack.yaml, similar to how constraints are set today.

Steps to Reproduce (for bugs)

  1. Set a taint on the k8s nodes (e.g. kubectl taint nodes node1 somekey=somevalue:NoSchedule)
  2. OpenFaaS function pods will not be scheduled on these nodes, because they don't have matching tolerations (docs: https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/)
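To illustrate why the pods in step 2 stay off the tainted node, here is a minimal sketch of the matching rule described in the linked Kubernetes docs. The `Taint` and `Toleration` types are simplified stand-ins for the real `core/v1` types, and `tolerates` and `schedulable` are illustrative names, not functions from faas-netes or the scheduler.

```go
package main

import "fmt"

// Taint and Toleration are simplified stand-ins for the Kubernetes core/v1
// types, reduced to the fields needed for this illustration.
type Taint struct {
	Key, Value, Effect string
}

type Toleration struct {
	Key, Operator, Value, Effect string
}

// tolerates reports whether one toleration matches one taint.
// An empty Effect or Key on the toleration acts as a wildcard, and an
// empty Operator is treated as "Equal".
func tolerates(tol Toleration, taint Taint) bool {
	if tol.Effect != "" && tol.Effect != taint.Effect {
		return false
	}
	if tol.Key != "" && tol.Key != taint.Key {
		return false
	}
	switch tol.Operator {
	case "", "Equal":
		return tol.Value == taint.Value
	case "Exists":
		return true
	default:
		return false
	}
}

// schedulable reports whether every NoSchedule or NoExecute taint on a node
// is matched by at least one of the pod's tolerations.
func schedulable(taints []Taint, tols []Toleration) bool {
	for _, taint := range taints {
		if taint.Effect != "NoSchedule" && taint.Effect != "NoExecute" {
			continue // e.g. PreferNoSchedule is only a soft preference
		}
		tolerated := false
		for _, tol := range tols {
			if tolerates(tol, taint) {
				tolerated = true
				break
			}
		}
		if !tolerated {
			return false
		}
	}
	return true
}

func main() {
	// The taint from step 1: somekey=somevalue:NoSchedule
	node := []Taint{{Key: "somekey", Value: "somevalue", Effect: "NoSchedule"}}

	// A function pod without tolerations, as generated today.
	fmt.Println(schedulable(node, nil)) // false

	// The same pod with a matching toleration.
	match := []Toleration{{Key: "somekey", Operator: "Equal", Value: "somevalue", Effect: "NoSchedule"}}
	fmt.Println(schedulable(node, match)) // true
}
```

The real scheduler takes many other factors into account; this only covers the taint check.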

Context

Your Environment

mercul3s commented 5 years ago

I'm also running into this issue. We have an existing cluster with tainted preemptible node pools, and ideally we would like to use tolerations so that only functions are scheduled on the preemptible machines. As a workaround, I've created a deployment with tolerations for each function, and then created a function resource for each. This works, but it means I have to maintain a deployment spec in addition to the function specs and containers, and I would prefer to let OpenFaaS generate and update the deployments instead.

I'm a Go developer, so I poked around in the codebase a bit, and it looks like tolerations could be added to the deployment spec struct: https://github.com/openfaas/faas-netes/blob/master/handlers/deploy.go#L201-L261, though there's probably a bit more to it than that. Would folks be interested in a PR for this feature?
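As a rough sketch of that idea, the tolerations could travel with the function request and be copied into the generated pod spec. Everything below is hypothetical: `FunctionRequest`, `PodSpec`, and `buildPodSpec` are simplified stand-ins, not the actual types or functions in `handlers/deploy.go`, and the `Tolerations` field on the request is an assumption about what such a feature might look like.

```go
package main

import "fmt"

// Toleration and PodSpec are simplified stand-ins for the Kubernetes
// core/v1 types; only the fields relevant to this sketch are included.
type Toleration struct {
	Key, Operator, Value, Effect string
}

type PodSpec struct {
	NodeSelector map[string]string
	Tolerations  []Toleration
}

// FunctionRequest is a hypothetical deployment request. The Tolerations
// field does not exist in OpenFaaS at the time of this thread.
type FunctionRequest struct {
	Service     string
	Tolerations []Toleration
}

// buildPodSpec copies any requested tolerations into the pod spec, so the
// generated deployment can be scheduled on matching tainted nodes.
func buildPodSpec(req FunctionRequest) PodSpec {
	spec := PodSpec{}
	if len(req.Tolerations) > 0 {
		// Copy the slice so later changes to the request do not leak
		// into the pod spec.
		spec.Tolerations = append([]Toleration(nil), req.Tolerations...)
	}
	return spec
}

func main() {
	req := FunctionRequest{
		Service: "figlet",
		Tolerations: []Toleration{
			{Key: "somekey", Operator: "Equal", Value: "somevalue", Effect: "NoSchedule"},
		},
	}
	fmt.Printf("%+v\n", buildPodSpec(req))
}
```

With something like this, the deployment would still be generated and updated by OpenFaaS, and the hand-maintained deployment specs from the workaround would no longer be needed.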

alexellis commented 5 years ago

This has been tracked at the following issue: https://github.com/openfaas/faas/issues/1125

@mercul3s we're waiting on your code sample / example. How is that going?

alexellis commented 5 years ago

@Dread1982 please fill out a full example in Steps to Reproduce (for bugs)

Not everyone in the community is a Kubernetes expert, but they may have a good working knowledge of Go. By honouring the template you increase the chances that someone can fix your issue and test it.

alexellis commented 5 years ago

@mercul3s pinging on this again 👋

alexellis commented 5 years ago

Related discussion is ongoing across several issues, including this one in the CLI, which needs a decision before it can move forward: https://github.com/openfaas/faas-cli/issues/623

andeplane commented 4 years ago

I work at Cognite, a Norway-based tech company, and we are interested in this as well. Our functions share a k8s cluster with other services, but we want to deploy functions using OpenFaaS on a separate node pool so that we can use preemptible nodes and prevent high load from affecting our core services.

We have patched faas-netes internally to do this so far by adding

Tolerations: []apiv1.Toleration{
    {
        Key:      "dedicated",
        Operator: "Equal",
        Value:    "cognite-functions",
        Effect:   "NoSchedule",
    },
},

in the pod specifications at https://github.com/openfaas/faas-netes/blob/master/handlers/deploy.go#L214.
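A hard-coded patch like the one above could be made configurable by parsing the toleration from a string in the same `key=value:Effect` form that `kubectl taint` accepts. The following is only a sketch under that assumption: `parseToleration` is a hypothetical helper, `Toleration` is a simplified stand-in for `apiv1.Toleration`, and `strings.Cut` requires Go 1.18 or newer.

```go
package main

import (
	"fmt"
	"strings"
)

// Toleration is a simplified stand-in for apiv1.Toleration.
type Toleration struct {
	Key, Operator, Value, Effect string
}

// parseToleration is a hypothetical helper that turns "key=value:Effect"
// into an Equal toleration and "key:Effect" into an Exists toleration.
func parseToleration(s string) (Toleration, error) {
	keyValue, effect, ok := strings.Cut(s, ":")
	if !ok || effect == "" {
		return Toleration{}, fmt.Errorf("toleration %q: missing effect", s)
	}
	key, value, hasValue := strings.Cut(keyValue, "=")
	if key == "" {
		return Toleration{}, fmt.Errorf("toleration %q: missing key", s)
	}
	if !hasValue {
		// No "=value" part: tolerate any value for this key.
		return Toleration{Key: key, Operator: "Exists", Effect: effect}, nil
	}
	return Toleration{Key: key, Operator: "Equal", Value: value, Effect: effect}, nil
}

func main() {
	// Same values as the hard-coded patch above.
	tol, err := parseToleration("dedicated=cognite-functions:NoSchedule")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", tol)
}
```

Where the string would come from (an environment variable, a Helm value, or the stack.yaml) is left open here, since that is the decision being discussed in the linked issues.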

alexellis commented 4 years ago

Closing in favour of https://github.com/openfaas/faas-netes/issues/586

alexellis commented 4 years ago

/lock