Closed — @6543 closed this 11 months ago
@6543 great work, can I test it somehow? Is it possible to include any resources into it, e.g. nvidia gpus ?
hmm @maltegrosse at the moment I would say the best help is to test and point out its limitations/issues.
passthrough hardware acceleration like GPUs, I never thought of. If it's about helping via $, we have an OpenCollective account
@6543 is there a different behavior regarding cpu/mem resources and other resources available on the node?
https://kubernetes.io/docs/tasks/manage-gpus/scheduling-gpus/
eg:
https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/utils/gpu/gpu.go#L113
vs.
I guess GPUs are just not yet taken into account - but that's an interesting use case
@maltegrosse as long as the cluster has device plugins providing resources such as GPUs, it's pretty much the same as defining a resource limit for CPU/memory.
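For reference, this is the pattern the scheduling-gpus docs describe: once a device plugin advertises the resource, a pod requests it exactly like CPU/memory. A minimal sketch (pod name and image are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-example
spec:
  restartPolicy: Never
  containers:
    - name: cuda-container
      image: nvidia/cuda:12.2.0-base-ubuntu22.04  # illustrative image
      command: ["nvidia-smi"]
      resources:
        limits:
          nvidia.com/gpu: 1  # resource name exposed by the NVIDIA device plugin
```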
@6543 I'm seeing great progress in k8s backend support. I think I wanna give it a try and update from 0.x to the latest stable version. But now I just saw that 2.0 is in the pipeline. Would you recommend me to wait for the new 2.x major release? If yes, is there any ETA?
You're right that 2.0 is in progress, but I don't think it's a bad idea to upgrade to 1.0 first. 2.0 will contain some breaking changes, but not as many as 1.0.
We're currently somewhat stuck at #2476 but after this is done we would like to release 2.0. This can take a week to a month, there's no fixed ETA.
WRT k8s specifically: I'd say it's usable for production needs; at least I do so across many instances and a few dozen repos with some complex configs.
Sounds nice, thanks for the feedback @pat-s. The only point that confuses me is the resource limits. Seems like every step requires the resources definitions - or can I simply add it to any agent globally using normal k8s syntax (so it will be applied to any job)? See https://woodpecker-ci.org/docs/next/administration/backends/kubernetes#resources The reason is that I don't trust my users to assume/predict proper resource usage :)
Additionally, are there any breaking changes regarding my db (Postgres)? (currently using 0.15.6)
I couldn't find anything at https://woodpecker-ci.org/docs/next/migrations
That means there's nothing. Of course, some db migrations will run, but Woodpecker handles this automatically on first start after the update.
> Seems like every step requires the resources definitions
Yes, that's the case and probably won't change in the future.
> or can I simply add it to any agent globally using normal k8s syntax?
No, it must be added to each pipeline and its steps. Also, it's not about the runner (which is a separate deployment) but about the pods spawned by the runner; these are defined by the respective pipelines.
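Per the resources section of the Kubernetes backend docs linked above, this is roughly what a per-step definition looks like in the pipeline config (step name, image, and values are illustrative):

```yaml
steps:
  build:
    image: golang:1.21  # illustrative image
    commands:
      - go build ./...
    backend_options:
      kubernetes:
        resources:
          requests:
            cpu: 250m
            memory: 128Mi
          limits:
            cpu: 500m
            memory: 256Mi
```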
> The reason is that I don't trust my users to assume/predict proper resource usage
By default, resources are not set (as is the case for any other k8s resource). And yeah, teaching is needed. I feel you, though; I have the same issues in my environment WRT users ;)
Thank you @pat-s. Have you played with Resource Quotas? Haven't tried it yet, but it could somehow limit the damage :-)
Seems like an interesting idea; maybe we can implement this in the Helm chart so it can be applied across the namespace WP is running in. Thanks for sharing the idea!
And setting up a default resource definition for each step is not an option at all for WP?
As the Resource Quotas docs mention:
> For cpu and memory resources, ResourceQuotas enforce that every (new) pod in that namespace sets a limit for that resource. If you enforce a resource quota in a namespace for either cpu or memory, you, and other clients, must specify either requests or limits for that resource, for every new Pod you submit. If you don't, the control plane may reject admission for that Pod.
Or are Limit Ranges meant exactly to solve that issue? It seems so, if I look at the first example.
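For context, a ResourceQuota capping the aggregate usage of the whole namespace could look something like this (name, namespace, and values are illustrative):

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: pipeline-quota     # illustrative name
  namespace: woodpecker
spec:
  hard:
    requests.cpu: "8"      # sum of CPU requests across all pods
    requests.memory: 16Gi
    limits.cpu: "16"       # sum of CPU limits across all pods
    limits.memory: 32Gi
```

Note the docs' caveat quoted above: with a cpu/memory quota in place, pods without explicit requests/limits may be rejected, which is where a LimitRange's defaults come in.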
Closing this as we should have full support now - if there are still issues, they are considered normal bugs :)
@pat-s I finally upgraded to WP 2 on Kubernetes, works great! (including resource limits for GPU)
resources are limited by LimitRange:
```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: compute-limits
  namespace: woodpecker
spec:
  limits:
    - default:
        cpu: 12
        memory: 40Gi
        nvidia.com/mig-2g.20gb: 1
      type: Container
```
thank you all again!
nice :heart:
I'll lock this issue as we now have kube support :) future interactions should be new issues.
basic support (#9) was added with #552
Current state:
- storageclasses