Open mtupitsyn opened 2 years ago
@denniswalker should probably also be a reviewer on this given the budget implications.
@mtupitsyn : Possible complications with builds that invoke docker/podman. Essentially, for such builds we'll need to run docker-in-docker, i.e. connect to the host node's docker daemon via a local socket. Depending on the CRI implementation GKE is using, we may not be able to support all combinations. For example, builds using docker-compose may not be able to connect to the host's containerd.
The HMS builds use a LOT of docker-in-docker and docker-compose. (I'm not sure how we will replace things like this as the Docker Desktop licensing goes away; maybe podman, although it doesn't support volumes.) I digress. HMS needs the ability to do 'docker'-based builds, very heavily. It's a 'must have' for our 40 builds.
@nieuwsma I can confirm that docker-in-docker works, I had been using this configuration for quite a while. It has some issues with multi-tenancy (imagine containers from different tenants accessing the same instance of docker daemon on the host), but otherwise ok. Podman is supposed to eliminate exactly these types of issues with multi-tenancy.
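For anyone who wants to verify the socket-based setup discussed above from inside a build container, here is a minimal sketch in Python. It assumes the host daemon's socket is mounted at the conventional path `/var/run/docker.sock` (the path and the helper name `docker_socket_reachable` are illustrative assumptions, not something specified in this proposal) and sends the Docker Engine API's `GET /_ping` request over that Unix socket.

```python
import socket


def docker_socket_reachable(path="/var/run/docker.sock", timeout=2.0):
    """Return True if a daemon answers GET /_ping with HTTP 200 on the Unix socket.

    The default path is an assumption; adjust it to wherever the host
    socket is mounted inside the runner container.
    """
    request = (
        b"GET /_ping HTTP/1.1\r\n"
        b"Host: localhost\r\n"
        b"Connection: close\r\n"
        b"\r\n"
    )
    try:
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            sock.connect(path)          # fails if the socket is missing or refused
            sock.sendall(request)
            response = sock.recv(4096)  # the status line arrives in the first chunk
    except OSError:
        # Covers a missing socket file, permission errors, refused connections, timeouts.
        return False
    status_line = response.split(b"\r\n", 1)[0]
    return b" 200 " in status_line


if __name__ == "__main__":
    print("docker daemon reachable:", docker_socket_reachable())
```

On a node with no Docker-compatible daemon listening at that path, the check simply returns `False`, which is the situation the CRI concern above describes. It says nothing about multi-tenancy; every container that can reach the socket talks to the same daemon.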
I support this proposal.
Abstract
Currently we have a single, statically configured, self-hosted runner, backed by a VM in GCP. For better scalability, resiliency, and security, we need to support ephemeral runners. To minimize workflow execution time and optimize costs, the solution must also provide auto-scaling.
Problem Statement
GitHub documentation refers to 2 open-source projects with auto-scaling support:
We can also come up with a home-grown solution for auto-scaling.
This proposal is to choose the first option from the list above: option 2 is not suitable for us, and a home-grown solution would require substantial development and collaboration effort.
Advantages of this approach:
Disadvantages/Caveats:
Internal References
External References
Proposed Solution(s)
Impact of Action/Inaction
A static self-hosted runner will become a bottleneck as the number of GitHub workflow executions rises.
Further Information
Suggested Reviewers
Comment Period
Comment period for this proposal shall close on November 10, 2021.