jenkinsci / docker-agent

Jenkins agent (base image) and inbound agent Docker images
https://hub.docker.com/r/jenkins/inbound-agent/
MIT License

Windows agent does not work on AWS Fargate #572

Open kristofdho opened 12 months ago

kristofdho commented 12 months ago

Jenkins and plugins versions report

N/A

What Operating System are you using (both controller, and any agents involved in the problem)?

Controller: Linux
Agent: Windows

Reproduction steps

  1. Set up the Amazon ECS plugin https://plugins.jenkins.io/amazon-ecs/
  2. Set up a Windows agent
  3. Launch Windows agent

Expected Results

Windows Agent starts up

Actual Results

ECS Fargate error when launching the task: CannotCreateVolumeError: unsupported: Dockerfile contains VOLUME instruction

Anything else?

Related issue on the plugin itself: https://github.com/jenkinsci/amazon-ecs-plugin/issues/334 But the root cause is in this repo, as ECS Fargate does not support VOLUME instructions in the image, on Windows. So the quick and easy fix is to remove these 2 lines: https://github.com/jenkinsci/docker-agent/blob/798ce6450f7143c341e236e5962becf0b5e1b864/windows/windowsservercore/Dockerfile#L98-L99

However, I suspect simply removing them will not be an option, and I have no idea what a satisfactory solution would be.
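To check whether a given image is affected before deploying it to Fargate, one option is to look at the volumes recorded in the image metadata. The following is only a minimal Python sketch, assuming the JSON array printed by `docker image inspect <image>` has been captured as a string; the function name `declared_volumes` is hypothetical and not part of this repository.

```python
import json


def declared_volumes(inspect_json: str) -> list[str]:
    """Return the sorted volume paths declared by the inspected image(s).

    `inspect_json` is assumed to be the raw output of
    `docker image inspect <image>`, which is a JSON array with one
    object per image. VOLUME instructions end up under Config.Volumes.
    """
    paths = set()
    for image in json.loads(inspect_json):
        # Both "Config" and "Volumes" may be missing or null.
        config = image.get("Config") or {}
        volumes = config.get("Volumes") or {}
        paths.update(volumes)
    return sorted(paths)
```

An empty result would suggest the image declares no volumes; a non-empty one lists the paths that the VOLUME instructions introduced.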

Are you interested in contributing a fix?

The fix I can contribute is to remove the 2 aforementioned lines. For anything else I'll need some pointers.

kristofdho commented 8 months ago

@timja apologies for the direct tag, I'm only doing so since I've seen you active on other issues/PRs. Is it possible to get someone to look into this seemingly easy-to-solve issue? It's blocking us from updating base images, since we don't have the resources to maintain our own builds without the VOLUME instructions.

I'm willing to look into it myself (as mentioned, only fix seems to be removing the instructions), but I'd need at least some feedback on if that's the right way forward, or if there is an alternative that can be considered.

Could you point me to the right person if there is someone else I should contact instead?

dduportal commented 8 months ago

Hi @kristofdho and thanks for raising this issue.

Removing the VOLUME directives would break the installations of many users. These instructions are there for a purpose, and removing them would be a breaking change, not to mention an unwelcome one.

We are really sorry for the problem you are facing: it looks like ECS should not be considered a solution for running Windows container agents, given how little support AWS provides for them today: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/windows_task_definitions.html.

> I'm willing to look into it myself (as mentioned, only fix seems to be removing the instructions), but I'd need at least some feedback on if that's the right way forward, or if there is an alternative that can be considered.

You would have to maintain your own ECS agent image. I'll take care of raising the question "do we want to, and can we, as the Jenkins community, maintain a Windows ECS image version?". But if you don't have the resources to maintain your own image, you're going to have a lot of trouble with this current architecture.
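For anyone who does end up maintaining their own variant, one possible approach is to generate it from a copy of the upstream Dockerfile rather than editing it by hand. The following is only a minimal Python sketch under that assumption: the function names are hypothetical, and the parsing is deliberately simplified (it honours the `# escape=` parser directive and line continuations, but does not understand heredocs).

```python
import re

# Parser directive such as "# escape=`", common in Windows Dockerfiles.
_ESCAPE_DIRECTIVE = re.compile(r"#\s*escape\s*=\s*(\S)\s*$", re.IGNORECASE)
# A line that starts a VOLUME instruction (instructions are case-insensitive).
_VOLUME = re.compile(r"\s*VOLUME\b", re.IGNORECASE)


def find_escape_char(lines: list[str]) -> str:
    """Return the line-continuation character, defaulting to a backslash."""
    for line in lines:
        stripped = line.strip()
        if not stripped.startswith("#"):
            break  # directives are only valid at the very top of the file
        match = _ESCAPE_DIRECTIVE.match(stripped)
        if match:
            return match.group(1)
    return "\\"


def strip_volume_instructions(dockerfile: str) -> str:
    """Return the Dockerfile text with every VOLUME instruction removed."""
    lines = dockerfile.splitlines()
    escape = find_escape_char(lines)
    kept = []
    skipping = False         # inside a multi-line VOLUME instruction
    in_continuation = False  # previous kept instruction line was continued
    for line in lines:
        is_comment = line.lstrip().startswith("#")
        continues = line.rstrip().endswith(escape)
        if skipping:
            # Drop the rest of the VOLUME instruction, comments included.
            if not is_comment:
                skipping = continues
            continue
        if is_comment:
            kept.append(line)
            continue
        if not in_continuation and _VOLUME.match(line):
            skipping = continues
            continue
        kept.append(line)
        in_continuation = continues
    return "\n".join(kept) + "\n"
```

The output could then be written to a separate file and built as a private image, leaving the upstream Dockerfile untouched so that later upstream changes can still be picked up.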

In the meantime, I suggest you look into using the EC2 plugin to spin up ephemeral Windows VM agents. We've used it for years on the Jenkins infrastructure (for ci.jenkins.io) and it has worked very well: it is neither more expensive nor cheaper than running ECS tasks, and it could solve your problem easily.

kristofdho commented 8 months ago

We specifically moved away from the EC2 setup as it's harder to maintain than having it based on docker containers. Would you be willing to publish a separate -ecs version of the container without the volume mounts? That would leave the current solution as is, and provide a working base image for ECS on Windows.

dduportal commented 8 months ago

> We specifically moved away from the EC2 setup as it's harder to maintain than having it based on docker containers. Would you be willing to publish a separate -ecs version of the container without the volume mounts? That would leave the current solution as is, and provide a working base image for ECS on Windows.

In theory, adding a new image variant would solve your problem, yes!

Alas, given https://github.com/jenkins-infra/helpdesk/issues/4029, I will veto this as Jenkins Infrastructure officer until the Docker Hub problem is fixed, because the (large) number of image variants we provide is already hitting Docker Hub really hard.

It's nothing against you, trust me. I understand you are blocked, but keep in mind that we are a non-profit project, and paying for build time and storage is a challenge for us, while most private companies can use our images for free (and the same goes for Docker Hub).

As such, I'll bring the "ECS support for Windows" topic to the next platform SIG meeting to see what remediation we could do as a community: you are welcome to join us and help us by participating!

> We specifically moved away from the EC2 setup as it's harder to maintain than having it based on docker containers.

I personally disagree as an SRE: we ran Windows EC2 machines for the past 4 years and it was clearly easier than maintaining Windows container images. And the incomplete support for Windows containers in ECS suggests that it might be a bit too early to make such a change.

If it helps, I can share the kind of setup we used to have (customizing a Windows AMI with Packer, plus the Jenkins controller's JCasC EC2 configuration).

kristofdho commented 8 months ago

> I'll bring the "ECS support for Windows" topic to the next platform SIG meeting to see what remediation we could do as a community

Thank you, I do understand that adding more images may not be as straightforward as it looks. I appreciate the follow-up.

> If it helps, I can share the kind of setup we used to have

This would be helpful indeed. I doubt we'll have the resources to spend on a reverse migration any time soon, but at least we'll know our options. As for the incomplete Windows support, that has been a continuous frustration for years already and I doubt it's changing anytime soon.

If there's somehow anything I can do to help, do let me know.