JetBrains / teamcity-docker-agent

TeamCity agent docker image sources
https://hub.docker.com/r/jetbrains/teamcity-agent/
Apache License 2.0

NuGet Publish build step fails #39

Closed tg73 closed 5 years ago

tg73 commented 5 years ago

After an otherwise successful build, I'm getting failures with the NuGet Publish step. I'm using NuGet.exe version 4.7.1, and an image based on jetbrains/teamcity-agent:2018.1.1-windowsservercore-1803 (with added Windows SDK for debugging tools to support symbol server plugin). I am running the build agent image as a swarm of scale 1 (for testing), with default networking. The log looks like this:

[01:45:35]  Step 10/11: NuGet Publish (1m)
[01:45:36]  [Step 10/11] Attempt to publish symbol package. Symbol packages are not fully supported by TeamCity internal feed. For more details see https://confluence.jetbrains.com/display/TCDL/NuGet#NuGet-symbols
[01:45:36]  [Step 10/11] push: Publish package Api.Logging\bin\Release\Redacted.Api.Logging.Redist.2.0.0-alpha-00000056.symbols.nupkg (1m)
[01:45:36]  [push] NuGet command: C:\BuildAgent\tools\NuGet.CommandLine.4.7.1\tools\NuGet.exe push C:\BuildAgent\work\2c2ee6feb437097a\Api.Logging\bin\Release\Redacted.Api.Logging.Redist.2.0.0-alpha-00000056.symbols.nupkg ******* -Source http://redacted.redactedinternaldomain/nuget/internal
[01:45:36]  [push] Starting: C:\BuildAgent\temp\agentTmp\custom_script2499889723989314163.cmd
[01:45:36]  [push] in directory: C:\BuildAgent\work\2c2ee6feb437097a\Api.Logging\bin\Release
[01:45:38]  [push] Pushing Redacted.Api.Logging.Redist.2.0.0-alpha-00000056.symbols.nupkg to 'http://redacted.redactedinternaldomain/nuget/internal'...
[01:45:38]  [push]   PUT http://redacted.redactedinternaldomain/nuget/internal/
[01:45:57]  [push] An error was encountered when fetching 'PUT http://redacted.redactedinternaldomain/nuget/internal/'. The request will now be retried.
[01:45:57]  [push] An error occurred while sending the request.
[01:45:57]  [push]   The underlying connection was closed: An unexpected error occurred on a receive.
[01:45:57]  [push]   Unable to read data from the transport connection: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
[01:45:57]  [push]   A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond
[01:45:58]  [push]   PUT http://redacted.redactedinternaldomain/nuget/internal/
[01:46:17]  [push] An error was encountered when fetching 'PUT http://redacted.redactedinternaldomain/nuget/internal/'. The request will now be retried.
[01:46:17]  [push] An error occurred while sending the request.
[01:46:17]  [push]   The underlying connection was closed: An unexpected error occurred on a receive.
[01:46:17]  [push]   Unable to read data from the transport connection: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
[01:46:17]  [push]   A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond
[01:46:17]  [push]   PUT http://redacted.redactedinternaldomain/nuget/internal/
[01:46:36]  [push] The underlying connection was closed: An unexpected error occurred on a receive.
[01:46:36]  [push]   Unable to read data from the transport connection: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
[01:46:36]  [push]   A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond
[01:46:36]  [push] Process exited with code 1
[01:46:37]  [push] Process exited with code 1 (Step: )
[01:46:36]  [Step 10/11] Step NuGet Publish failed

Host redacted.redactedinternaldomain is a ProGet server running within our internal network as a standalone server (no docker involved). The actual domain name does not end in .local (I'm aware of some issues with that suffix).

Running the exact same nuget push from a powershell console on the Windows Server 2016 docker host machine works as expected.

If I obtain a powershell console on the running docker container and run the exact same nuget push explicitly (with the correct -ApiKey argument added), I get the same error. However, running nuget list -Source http://redacted.redactedinternaldomain/nuget/internal works as expected. I find this quite puzzling: from testing on a Windows 10 desktop with Fiddler, nuget list issues a set of HTTP GET requests and receives responses; nuget push issues a single HTTP PUT and receives a response. From looking at firewall logs on the docker host Windows Server (disclaimer: I am not a networking expert), several incoming UDP packets get blocked when I issue the nuget push, almost as if the firewall does not recognise the incoming UDP packets as being part of an established outbound connection. Why this would differ for push vs list I have no idea.

It would be good to know if the NuGet Publish step is working for other people under similar conditions.

dtretyakov commented 5 years ago

@tg73, this kind of problem can be caused by an issue with the MTU setting in the docker network. Related issues can be found here:

So try setting the MTU value to 1400 on your docker host machine and check whether it helps.
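For anyone wanting to try this, the check and the change could look roughly like the sketch below. It is only an illustration under some assumptions: it is run from an elevated PowerShell session, the interface alias pattern is the one used later in this thread (yours may differ), and the host name is the redacted placeholder from the log above. The 1372-byte payload is chosen because the IPv4 header (20 bytes) and the ICMP header (8 bytes) add 28 bytes, which gives a 1400-byte packet.

```powershell
# Show the current IPv4 MTU of every interface.
Get-NetIPInterface -AddressFamily IPv4 | Select-Object InterfaceAlias, NlMtu

# Probe the path with the Don't Fragment flag set (-f) and a fixed payload size (-l).
# 1372 bytes of payload + 28 bytes of IPv4/ICMP headers = a 1400-byte packet.
# The host name is the redacted placeholder from the build log.
ping -f -l 1372 redacted.redactedinternaldomain

# Lower the MTU to 1400 on the matching interface(s).
# The alias pattern is an assumption; adjust it to what the first command lists.
Get-NetIPInterface -AddressFamily IPv4 |
    Where-Object { $_.InterfaceAlias -like 'vEthernet (Ethernet)*' } |
    Set-NetIPInterface -NlMtuBytes 1400
```

If the 1372-byte probe gets replies while a 1472-byte one (a full 1500-byte packet) does not, that would be consistent with small requests such as nuget list succeeding while the much larger nuget push upload times out.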

tg73 commented 5 years ago

@dtretyakov, thanks for your expert tip - adjusting the MTU to 1400 on the container's virtual interface has fixed the problem. From my reading of the various pages you linked (and docker documentation), there is no way to configure docker to apply this MTU as a general default (specifically, this is a Windows Containers problem). In my scenario, because I'm already using a custom image, I added the following to my Dockerfile:

SHELL ["powershell", "-Command", "$ErrorActionPreference = 'Stop'; $ProgressPreference = 'SilentlyContinue';"]

CMD Get-NetIPInterface | Where-Object -FilterScript { $_.InterfaceAlias -like 'vEthernet (Ethernet)*' } | Set-NetIPInterface -NlMtuBytes 1400 ; \
    ./BuildAgent/run-agent.ps1

This ensures that any new container instance gets the correct MTU setting.
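To confirm that the setting was applied, one option is to query the MTU from the docker host once the container is up. This is a minimal sketch, and the container name is hypothetical:

```powershell
# Hypothetical container name; substitute the name or ID shown by `docker ps`.
$container = 'teamcity-agent-1'

# Print the IPv4 MTU of each interface inside the running container.
docker exec $container powershell -Command "Get-NetIPInterface -AddressFamily IPv4 | Select-Object InterfaceAlias, NlMtu"
```

With the Dockerfile change above in place, the vEthernet interface should report an NlMtu of 1400.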