OSC / ood_core

Open OnDemand core library
https://osc.github.io/ood_core/
MIT License

k8s: Resource limits and requests on init containers (ResourceQuotas) #858

Closed nathanlcarlson closed 2 hours ago

nathanlcarlson commented 3 hours ago

I'd like to specify a ResourceQuota on each user's namespace, but this requires that requests and limits are specified on init containers as well. This does not currently seem to be supported. I could contribute this if there is interest, but in the meantime I'd like to get the feature deployed to my local install. Is there a convenient way to do this?

Also, more generally, are ResourceQuotas advisable for a k8s-based OOD install? Would it be worth contributing the idea of using them to the examples provided in the docs?

johrstrom commented 3 hours ago

> I could contribute this if there is interest

PRs are always welcome. I'm not familiar with ResourceQuotas, but I'd guess that if you need them, others are likely to as well.

> Is there a convenient way to do this?

Not a convenient way, no. You have two options:

  1. Replace the existing files in situ. That is, find this gem under /opt/ood/<somewhere> and replace the files after installation. Since Ruby is an interpreted language, the replaced files are simply loaded as they are, with no compilation step beforehand.
  2. Use monkey patches. You can see an example of us doing this on our system at the link below. There, a Rails initializer that OOD loads redefines a method, and an initializer like this can define or redefine anything. This specific patch isn't needed anymore (ood_core was patched), but I never got around to removing it.

https://github.com/OSC/osc-ood-config/blob/master/class.osc.edu/apps/dashboard/initializers/k8s_core.rb

> Also, more generally, are ResourceQuotas advisable for a k8s-based OOD install? Would it be worth contributing the idea of using them to the examples provided in the docs?

I'm not familiar with ResourceQuotas directly, but I suspect we do use them. IIRC we have limits on how many running pods can be in a namespace, though I'd have to look up how we do this.
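
For reference, one way to cap the number of pods in a namespace is a ResourceQuota with a `pods` count. The following is a minimal sketch and not necessarily how OSC enforces its limit; the name, namespace, and count are hypothetical placeholders.

```yaml
# Hypothetical object-count quota; name, namespace, and value are assumptions.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: pod-count-quota       # assumed name
  namespace: user-example     # assumed per-user namespace
spec:
  hard:
    pods: "10"                # max pods in a non-terminal state in this namespace
```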

treydock commented 3 hours ago

I think a ResourceQuota would be a good addition, especially here: https://github.com/OSC/ondemand/tree/master/hooks/k8s-bootstrap/yaml. These YAML files are used to bootstrap each user's namespace so having a ResourceQuota there would ensure each OOD namespace gets that quota.

Does a ResourceQuota work if you define a LimitRange with defaults, so that initContainers which don't specify limits pick them up? At OSC we use Kyverno to inject limits as well as to enforce per-pod limits on resources, but we've also used a LimitRange like this to ensure sane defaults in places where we may not control the pod deployment:

apiVersion: v1
kind: LimitRange
metadata:
  <metadata>
spec:
  limits:
  - default:
      cpu: "1"
      memory: 2Gi
    defaultRequest:
      cpu: "1"
      memory: 2Gi
    type: Container

treydock commented 3 hours ago

Also, there are ways to enforce these extra resources if you use a tool like Kyverno. Example: https://kyverno.io/policies/best-practices/add-ns-quota/add-ns-quota/

nathanlcarlson commented 2 hours ago

@treydock A quick test indicates that yes, a LimitRange resource defined similarly to what you shared above does get applied to the init-containers and satisfies the ResourceQuota requirements.
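
A compute ResourceQuota of the kind discussed here might look like the sketch below. Once a quota covers CPU or memory requests and limits, pods whose containers omit those values are rejected unless a LimitRange supplies defaults, which is why the LimitRange satisfies the requirement for init containers. The name, namespace, and amounts are hypothetical placeholders, not values from any OOD deployment.

```yaml
# Hypothetical compute quota; name, namespace, and amounts are assumptions.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota         # assumed name
  namespace: user-example     # assumed per-user namespace
spec:
  hard:
    requests.cpu: "4"         # sum of CPU requests across all pods in the namespace
    requests.memory: 8Gi      # sum of memory requests
    limits.cpu: "4"           # sum of CPU limits
    limits.memory: 8Gi        # sum of memory limits
```

With LimitRange defaults of 1 CPU and 2Gi per container, as in the example earlier in the thread, each container that does not set its own values counts for that amount against these totals.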

I have become familiar with that k8s-bootstrap directory as I've created my own edition based on the examples provided there. I can pick some of the things out of there, including this LimitRange+ResourceQuota yaml, and make a PR over there at some point. I'd like to wait until our edition has had some real usage to make sure things work as expected.

This LimitRange seems like an acceptable solution to this issue, so I don't think I'll make any code modifications.

Thanks for the prompt help.

treydock commented 2 hours ago

I created https://github.com/OSC/ondemand/issues/3900 to get these new resource types added to the Kubernetes bootstrap logic. Will close this issue out since any work would be in the OnDemand repo.