appveyor / ci

AppVeyor community support repository
https://www.appveyor.com

BYOC Docker is broken with Docker v23 and above #3891

Open shana opened 9 months ago

shana commented 9 months ago

The public version of the host agent is broken with current versions of Docker, because #3869 was not fixed in the public build. Additionally, the private build shared in #3869 has an incomplete fix and crashes with:

Starting Docker job...
Index was outside the bounds of the array.

This is happening because the fix changes the number of fields returned by the Docker version format, but the code that processes the result splits the returned string and assumes it still has the number of fields it had before the fix. Could you please update Appveyor.BuildWorkers.Docker.DockerWorker.GetDockerVersion to use the correct array indexes, and publish a public build with the full fix?
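To illustrate the failure mode described above, here is a minimal Python sketch. The actual worker is a .NET assembly whose source is not shown in this thread, so the `|` delimiter, the `key=value` layout, the sample strings, and the function `parse_version_fields` are all hypothetical. The point is only that looking fields up by name and checking that they are present survives a change in the number of returned fields, whereas fixed array indexes do not.

```python
def parse_version_fields(raw: str, required: tuple[str, ...]) -> dict[str, str]:
    """Parse 'key=value' pairs separated by '|' into a dict.

    Hypothetical format: the real worker's format string is not known here.
    Raises ValueError with a readable message instead of failing on an index.
    """
    fields: dict[str, str] = {}
    for part in raw.strip().split("|"):
        if not part:
            continue  # tolerate empty segments such as a trailing delimiter
        key, sep, value = part.partition("=")
        if not sep:
            raise ValueError(f"malformed field {part!r} in {raw!r}")
        fields[key.strip()] = value.strip()
    missing = [name for name in required if name not in fields]
    if missing:
        raise ValueError(f"missing fields {missing} in {raw!r}")
    return fields


# Hypothetical version strings, before and after one field was dropped.
old_style = "client=20.10.21|experimental=false|server=20.10.21|os=windows"
new_style = "client=24.0.7|server=24.0.7|os=windows"

for raw in (old_style, new_style):
    info = parse_version_fields(raw, required=("client", "server", "os"))
    print(info["server"], info["os"])
```

Both strings parse with the same code path because nothing depends on a field's position.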

Since the original issue has been closed even though it isn't fixed, I'm opening this one to request an official fix.

mikehutter commented 6 months ago

I'm attempting to evaluate AppVeyor BYOC and I have this issue as well.

Docker Engine: 24.0.7
Docker Desktop: 4.26.1

Starting Docker job...
template: version:1:44: executing "version" at <.Client.Experimental>: can't evaluate field Experimental in type system.clientVersion

FeodorFitsner commented 6 months ago

Could you please clarify whether you are trying standalone AppVeyor Server or BYOC connected to AppVeyor cloud?

mikehutter commented 6 months ago

This is standalone.

FeodorFitsner commented 6 months ago

Try installing this AppVeyor Server update please: https://appveyordownloads.blob.core.windows.net/appveyor/7.0.3279/appveyor-server-7.0.3279-win-x64.msi

mikehutter commented 6 months ago

I downloaded and installed it, and I can see the new DLLs in the server folder with a date of 6/1/23, but I'm still getting the exact same error. Is it the host agent that needs to be updated? Those DLLs still look to be from 4/2022.

mikehutter commented 6 months ago

I started from scratch, uninstalled everything, and re-installed. Before I added a new Docker cloud build, I got the same error as the OP @shana.

Starting Docker job...
Index was outside the bounds of the array.

I then added a new Docker build environment and ran the suggested PowerShell commands, which installed the HostAgent. That HostAgent uses the old DLLs (7.0.3212) rather than the new ones in the server folder (7.0.3279), and I get the same error I originally did:

Starting Docker job...
template: version:1:44: executing "version" at <.Client.Experimental>: can't evaluate field Experimental in type system.clientVersion
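The template error above comes from a format string that references `.Client.Experimental`, a field the client version object evidently no longer has on this Docker version. One way to avoid depending on any single field, sketched here in Python as an assumption and not as AppVeyor's actual fix, is to request the whole object with `docker version --format '{{json .}}'` and read optional fields with a default. The helper names and the trimmed sample JSON are illustrative.

```python
import json
import subprocess


def summarize_docker_version(raw_json: str) -> dict:
    """Extract a few fields by name from `docker version` JSON output."""
    data = json.loads(raw_json)
    # Either section may be missing or null, so fall back to an empty dict.
    client = data.get("Client") or {}
    server = data.get("Server") or {}
    return {
        "client_version": client.get("Version"),
        "server_version": server.get("Version"),
        "server_os": server.get("Os"),
        # Optional field: default to False instead of failing when absent.
        "client_experimental": client.get("Experimental", False),
    }


def query_docker_version() -> dict:
    """Run the docker CLI and summarize its version info (needs docker on PATH)."""
    result = subprocess.run(
        ["docker", "version", "--format", "{{json .}}"],
        capture_output=True, text=True, check=True,
    )
    return summarize_docker_version(result.stdout)


# Trimmed, illustrative sample of the JSON shape; real output has more keys.
sample = (
    '{"Client": {"Version": "24.0.7", "Os": "windows"}, '
    '"Server": {"Version": "24.0.7", "Os": "linux"}}'
)
print(summarize_docker_version(sample))
```

Only `summarize_docker_version` is exercised here; `query_docker_version` needs a working Docker installation.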

Update: I was able to work out the location of the HostAgent 7.0.3279 download from the PowerShell cmdlets and downloaded appveyor-host-agent-7.0.3279-win-x64.zip. I replaced the files in the HostAgent with the files from the new version, restarted the service, and now I am back to this error:

Starting Docker job...
Index was outside the bounds of the array.

mikehutter commented 6 months ago

Update:

I "patched" the HostAgent/AppVeyor.BuildWorkers.Docker.dll so it reads the correct indexes for the string array returned from GetDockerVersionString and it connects to and starts a docker build container, however it doesn't really do anything.

In the build console, it just says: Waiting for container to start the job...

and on the docker container logs:

warn: Appveyor.BuildAgent.Service.SignalrClientFactory[0]
2023-12-20 14:46:05 Error fetching AppVeyor version via /api/version endpoint: No connection could be made because the target machine actively refused it. No connection could be made because the target machine actively refused it.
2023-12-20 14:46:10 Reading worker metadata
2023-12-20 14:46:10 Appveyor URL: http://localhost:8050
2023-12-20 14:46:10 Worker ID: 662b8945cfbb4f24a5c0dea2c1121bbe
2023-12-20 14:46:10 Connecting to Appveyor via SignalR...
2023-12-20 14:46:10 SignalR connection starting...
2023-12-20 14:46:10 info: Microsoft.Hosting.Lifetime[0]
2023-12-20 14:46:10 Now listening on: http://localhost:49156
2023-12-20 14:46:10 info: Appveyor.BuildAgent.Service.Startup[0]
2023-12-20 14:46:10 API server started
2023-12-20 14:46:10 info: Microsoft.Hosting.Lifetime[0]
2023-12-20 14:46:10 Application started. Press Ctrl+C to shut down.
2023-12-20 14:46:10 info: Microsoft.Hosting.Lifetime[0]
2023-12-20 14:46:10 Hosting environment: Production
2023-12-20 14:46:10 info: Microsoft.Hosting.Lifetime[0]
2023-12-20 14:46:10 Content root path: C:\Program Files\AppVeyor\BuildAgent
2023-12-20 14:46:12 SignalR disconnected
2023-12-20 14:46:17 SignalR connection starting...
2023-12-20 14:46:19 SignalR disconnected
2023-12-20 14:46:24 SignalR connection starting...

Guessing that localhost:8050 is in the context of the docker container so it's attempting to connect to the server on itself rather than my host machine. Investigating...
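That guess is consistent with how container networking works: inside a container, `localhost` is the container's own loopback interface, not the host's. Docker Desktop documents the special DNS name `host.docker.internal` for reaching the host from a container; whether it resolves in this particular setup is an assumption. The Python sketch below only shows the idea of rewriting a loopback URL before handing it to a container. The function name is hypothetical, and this thread does not say whether AppVeyor exposes a setting for this URL.

```python
from urllib.parse import urlsplit, urlunsplit

# Hostnames that point back at the container itself when used inside it.
LOOPBACK_HOSTS = {"localhost", "127.0.0.1", "::1"}


def url_reachable_from_container(url: str, host_alias: str = "host.docker.internal") -> str:
    """Replace a loopback host in `url` with an alias that resolves to the host.

    Non-loopback URLs are returned unchanged. Any user:password part of the
    URL is dropped, which is acceptable for this illustration.
    """
    parts = urlsplit(url)
    if parts.hostname not in LOOPBACK_HOSTS:
        return url
    netloc = host_alias if parts.port is None else f"{host_alias}:{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


print(url_reachable_from_container("http://localhost:8050"))
```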

mikehutter commented 6 months ago

Update:

The issues described by the OP are still valid. Below are the issues I ran into while getting this up and running on my machine, along with the steps I had to go through to get it working.

My Setup:
Windows 10 19045.3803
Docker Desktop: 4.26.1
Docker Engine: 24.0.7
AppVeyor Server: 7.0.3212 (This is what is/was "Publicly Available" via the website.)
AppVeyor Host Agent: 7.0.3212 (This is what is installed via powershell cmdlet when adding a docker cloud.)

AppVeyor Server and Host Agent are running on my machine (not inside Docker). Docker Desktop is installed and running on this same machine. It was set up this way simply for evaluation purposes.

Remediation Steps: Per support's request, I updated AppVeyor Server to 7.0.3279 (link above); however, the issue remained unchanged. I still got the error message from #3869. I checked the DLLs in the HostAgent folder (vs. Server) and they were still 7.0.3212.
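A quick way to spot this kind of mismatch is to compare the version strings of the two components numerically instead of as text. The sketch below is illustrative Python; how the versions are read from the installed DLLs is left out, and the function names are hypothetical.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Turn a dotted version such as '7.0.3279' into a tuple of integers."""
    return tuple(int(part) for part in text.strip().split("."))


def find_outdated(components: dict[str, str]) -> list[str]:
    """Return the names of components older than the newest version present."""
    newest = max(parse_version(v) for v in components.values())
    return [name for name, v in components.items() if parse_version(v) < newest]


# Versions reported in this thread after updating only the server.
installed = {"AppVeyor Server": "7.0.3279", "AppVeyor Host Agent": "7.0.3212"}
print(find_outdated(installed))
```

Comparing integer tuples avoids the text-comparison pitfall where "1.9.0" would sort after "1.10.0".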

After some digging in the PowerShell scripts, I worked out the URL to download AppVeyor Host Agent 7.0.3279 and replaced the files in the HostAgent folder with the new ones.

This produced the error message when running a build:
Starting Docker job...
Index was outside the bounds of the array.

I "patched" Appveyor.BuildWorkers.Docker.DockerWorker.GetDockerVersion using ildasm/ilasm and replaced the file in the HostAgent (not Server) directory.

The error messages went away; however, the Docker build container was not able to communicate with the AppVeyor Server API.

This finally allowed me to start a build that was able to connect to and communicate with my AppVeyor Server api.

I am now able to successfully start and run a build.

Related Issues:

#3869 - caused by v7.0.3212 Appveyor.BuildWorkers.Docker.DockerWorker

#3891 - (This Issue) - caused by v7.0.3279 Appveyor.BuildWorkers.Docker.DockerWorker

Note: