aicoe-aiops / project-template

this is a template to use for new data science projects in the aiops group

Request more resources by default #20

Closed tumido closed 4 years ago

tumido commented 4 years ago

With the AIOps analytics deployment migrating to OCP prod (new projects are expected to be deployed there as well), a default resource limit is enforced on us. This limit is pretty strict:

$ oc describe limits aiops-prod-argo-limits
Name:       aiops-prod-argo-limits
Namespace:  aiops-prod-argo
Type        Resource  Min  Max  Default Request  Default Limit  Max Limit/Request Ratio
----        --------  ---  ---  ---------------  -------------  -----------------------
Container   memory    -    -    400Mi            1000Mi         -
Container   cpu       -    -    300m             500m           -

This limit results in an OOM kill of the version analysis notebook in Openshift SME.

Luckily, Min and Max are not set, so we can override these defaults on a per-step/container basis. I've chosen these (hopefully reasonable) values, which should be safe and suitable for most projects for now. The same PR will be opened against project-template as well.
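
For illustration, a per-container override could look roughly like the sketch below. It assumes an Argo Workflow with a container template; the workflow name, template name, image, command, and all resource values are hypothetical placeholders, not the values chosen in this PR. Because the LimitRange sets only defaults, explicit `requests` and `limits` on a container take precedence over them.

```yaml
# Hypothetical sketch: every name and value here is illustrative only,
# not taken from this PR.
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: example-notebook-        # placeholder name
spec:
  entrypoint: run-notebook
  templates:
    - name: run-notebook                 # placeholder step name
      container:
        image: example.com/placeholder/notebook-runner:latest  # placeholder image
        command: ["sh", "-c", "echo 'run the notebook here'"]
        resources:
          # Explicit values replace the LimitRange defaults
          # (400Mi/300m request, 1000Mi/500m limit) for this container only.
          requests:
            memory: 1Gi                  # assumed value
            cpu: 500m                    # assumed value
          limits:
            memory: 2Gi                  # assumed value
            cpu: "1"                     # assumed value
```

Steps that omit the `resources` block would still receive the namespace defaults shown in the `oc describe limits` output above.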

This is the same PR as the one opened against Openshift SME: https://github.com/aicoe-aiops/openshift-sme-mailing-list-analysis/pull/23