NuGet / Home

Repo for NuGet Client issues
Other
1.49k stars 252 forks source link

NuGet 3.3.exe returns non-zero exit code on successful publish #1813

Closed Aaronontheweb closed 8 years ago

Aaronontheweb commented 8 years ago

All of our build systems that depend on FAKE have now begun experiencing an error during NuGet push attempts as a result of changes made to NuGet.exe itself.

Sample logs:

Pushing nuget packages
[17:45:42][Step 1/1] Pushing D:\BuildAgent\work\2de53b95e0be9250\bin\nuget\NBench.0.1.0.nupkg Attempts left: 3
[17:45:42][Step 1/1] Pushing NBench 0.1.0 to the NuGet gallery (https://www.nuget.org)...
[17:45:47][Step 1/1] Your package was pushed.
[17:45:47][Step 1/1] Pushing NBench 0.1.0 to the symbol server (https://nuget.smbsrc.net/)...
[17:46:42][Step 1/1] Pushing D:\BuildAgent\work\2de53b95e0be9250\bin\nuget\NBench.0.1.0.nupkg Attempts left: 2
[17:46:42][Step 1/1] Pushing NBench 0.1.0 to the NuGet gallery (https://www.nuget.org)...
[17:46:43][Step 1/1] Failed to process request. 'A package with id 'NBench' and version '0.1.0' already exists and cannot be modified.'. 
[17:46:43][Step 1/1] The remote server returned an error: (409) Conflict..
[17:46:43][Step 1/1] Pushing D:\BuildAgent\work\2de53b95e0be9250\bin\nuget\NBench.0.1.0.nupkg Attempts left: 1
[17:46:44][Step 1/1] Pushing NBench 0.1.0 to the NuGet gallery (https://www.nuget.org)...
[17:46:44][Step 1/1] Failed to process request. 'A package with id 'NBench' and version '0.1.0' already exists and cannot be modified.'. 
[17:46:44][Step 1/1] The remote server returned an error: (409) Conflict..
[17:46:44][Step 1/1] Pushing D:\BuildAgent\work\2de53b95e0be9250\bin\nuget\NBench.0.1.0.nupkg Attempts left: 0
[17:46:45][Step 1/1] Pushing NBench 0.1.0 to the NuGet gallery (https://www.nuget.org)...

Corresponding code

let publishNugetPackages _ = 
    let rec publishPackage url accessKey trialsLeft packageFile =
        let tracing = enableProcessTracing
        enableProcessTracing <- false
        let args p =
            match p with
            | (pack, key, "") -> sprintf "push \"%s\" %s" pack key
            | (pack, key, url) -> sprintf "push \"%s\" %s -source %s" pack key url

        tracefn "Pushing %s Attempts left: %d" (FullName packageFile) trialsLeft
        try 
            let result = ExecProcess (fun info -> 
                    info.FileName <- nugetExe
                    info.WorkingDirectory <- (Path.GetDirectoryName (FullName packageFile))
                    info.Arguments <- args (packageFile, accessKey,url)) (System.TimeSpan.FromMinutes 1.0)
            enableProcessTracing <- tracing
            if result <> 0 then failwithf "Error during NuGet symbol push. %s %s" nugetExe (args (packageFile, "key omitted",url))
        with exn -> 
            if (trialsLeft > 0) then (publishPackage url accessKey (trialsLeft-1) packageFile)
            else raise exn
    let shouldPushNugetPackages = hasBuildParam "nugetkey"
    let shouldPushSymbolsPackages = (hasBuildParam "symbolspublishurl") && (hasBuildParam "symbolskey")

    if (shouldPushNugetPackages || shouldPushSymbolsPackages) then
        printfn "Pushing nuget packages"
        if shouldPushNugetPackages then
            let normalPackages= 
                !! (nugetDir @@ "*.nupkg") 
                -- (nugetDir @@ "*.symbols.nupkg") |> Seq.sortBy(fun x -> x.ToLower())
            for package in normalPackages do
                publishPackage (getBuildParamOrDefault "nugetpublishurl" "") (getBuildParam "nugetkey") 3 package

        if shouldPushSymbolsPackages then
            let symbolPackages= !! (nugetDir @@ "*.symbols.nupkg") |> Seq.sortBy(fun x -> x.ToLower())
            for package in symbolPackages do
                publishPackage (getBuildParam "symbolspublishurl") (getBuildParam "symbolskey") 3 package

Issue appears to be that NuGet.exe no longer returns a 0 exit code upon successful publish. What was the reasoning behind this?

forki commented 8 years ago

you could use older version of nuget.exe or paket to push (and still build package with nuget.exe 3.3).

Aaronontheweb commented 8 years ago

Yeah, we have plenty of ways to work around this issue - right now we're just downgrading the version of NuGet.exe being used back to 2.* (whatever the highest minor version number was.)

But I'd really like an explanation from the NuGet.exe team - having to revert back to an older build across 12+ OSS projects is not cool.

yishaigalatzer commented 8 years ago

@Aaronontheweb this looks like an unexpected regression. NuGet 3.X has been out for a while, and we didn't push it to nuget.org in an attempt to catch exactly this kind of errors. This should definitely be fixed as soon as possible.

emgarten commented 8 years ago

I'm seeing nuget.exe push return exit code 0 for both pushing packages and symbol packages when I try.

Tracing through your script and output here is what I think I see:

  1. nupkg is pushed to nuget.org successfully
  2. the symbol package is pushed to nuget.smbsrc.net
  3. there is no success message from nuget.exe for the symbol package, and there is no error message of Error during NuGet symbol push from the script indicating that it returned a non-zero exit code. ExecProcess appears to have a timeout of 1 minute and the timestamps show it as very close to that, but not a full minute. Did something go wrong with the symbol source? What happened here?
  4. the script loops again and tries to upload the same nupkg a 2nd time and 3rd time to nuget.org. nuget.exe correctly returns an error message since it already exists.
    • Are you able to push the symbol package manually with nuget.exe 3.3.0?
    • How consistently are you seeing your script fail?
    • What happens when ExecCommand times out?
Aaronontheweb commented 8 years ago

Good catch on the process timeout - but that doesn't appear to be occurring here. NuGet.exe is exiting on its own without being forcibly killed by our build system (see below.)

Should be noted that each independent upload gets its own timeout - that 60.0 second value isn't shared between uploading the normal package and the symbol package.

All of our push operations are running inside Azure (South Central US) on standard D12 VM instances.

Did something go wrong with the symbol source? What happened here?

No idea - we do see them throw 500 errors every now and then but that always appears in the log when it occurs.

Interestingly enough though - our nightly builds (which go to MyGet.org) have worked fine. No errors at all. Only the main nuget.org / symbsrc.net sites appear to have this issue.

Are you able to push the symbol package manually with nuget.exe 3.3.0?

I wasn't able to on Saturday when I posted this issue. Ran into the same issue consistently across 3 projects all using variations of this script (but the NuGet publish code is mostly the same.)

What happens when ExecCommand times out?

https://github.com/fsharp/FAKE/blob/7543f29800c1245a04f96b5f1be063b0568fd5d6/src/app/FakeLib/ProcessHelper.fs#L89

if timeOut = TimeSpan.MaxValue then proc.WaitForExit()
    else 
        if not <| proc.WaitForExit(int timeOut.TotalMilliseconds) then 
            try 
                proc.Kill()
            with exn -> 
                traceError 
                <| sprintf "Could not kill process %s  %s after timeout." proc.StartInfo.FileName 
                       proc.StartInfo.Arguments
            failwithf "Process %s %s timed out." proc.StartInfo.FileName proc.StartInfo.Arguments
    proc.ExitCode

We'd see a failure message that reads "Process {blah} timed out," which isn't showing up in the error log, so the process doesn't appear to be timing out.

Aaronontheweb commented 8 years ago

Here's a MyGet readout for this project: http://petabridge-ci.cloudapp.net/viewLog.html?buildId=3219&buildTypeId=NBench_NightlyBuilds&tab=buildLog#_focus=38&state=38&guest=1

Versus the NuGet readout, which appears to be working today: http://petabridge-ci.cloudapp.net/viewLog.html?buildId=3363&buildTypeId=NBench_ReleaseBuilds&tab=buildLog#_state=38&focus=2023&guest=1

BTW - correction: symbol and nuget.org pushing do occur simultaneously by default. Forgot that NuGet.exe does that.

Aaronontheweb commented 8 years ago

So is it possible that the issue here is that smbsrc.net was failing without returning an error code? FAKE wasn't killing the process, but we weren't pushing the symbol package either.

yishaigalatzer commented 8 years ago

From reading the script, looks like the process pushes the package to symbol server, but never throws out any error just silently exists. The script then starts another round of pushing (which it shouldn't because the first operation succeeded).

What happened to the message after the first push to symbol server?

Aaronontheweb commented 8 years ago

@yishaigalatzer here's the full build log from where the error occurred

http://petabridge-ci.cloudapp.net/viewLog.html?buildId=3182&buildTypeId=NBench_ReleaseBuilds&tab=buildLog#_focus=2028&guest=1

Looks like we're not getting anything back from the symbol server and there's a silent failure. FAKE doesn't log any error messages by default if the ExecProcess call doesn't end with a non-zero error code. That's the responsibility of the user writing the script, which we do with the retry-wrap.

Aaronontheweb commented 8 years ago

Should be noted, if you take a look at the artifacts, these are not particularly large packages. Largest symbol package is about ~140kb.

http://petabridge-ci.cloudapp.net/viewLog.html?buildId=3182&buildTypeId=NBench_ReleaseBuilds&tab=artifacts&guest=1

We had consistent issues with this between last Thursday, 12/3, and Saturday, 12/5 - do you have any logs indicating that there were intermittent issues with the symbol servers?

yishaigalatzer commented 8 years ago

@yishaigalatzer here's the full build log from where the error occurred

http://petabridge-ci.cloudapp.net/viewLog.html?buildId=3182&buildTypeId=NBench_ReleaseBuilds&tab=buildLog#_focus=2028&guest=1

I can't get to it for some reason.

Looks like we're not getting anything back from the symbol server and there's a silent failure. FAKE doesn't log any error messages by default if the ExecProcess call doesn't end with a non-zero error code. That's the responsibility of the user writing the script, which we do with the retry-wrap.

The sounds right, the issue is that instead of just retrying pushing the symbol package, the script tries to push to nuget.org again. It could be interesting to try and fix this part (not to fail silently, we need to find a repro to figure out exactly how it failed)

We had consistent issues with this between last Thursday, 12/3, and Saturday, 12/5 - do you have any logs indicating that there were intermittent issues with the symbol servers?

We do not own symbol source, but @xavierdecoster might know how to find out

Aaronontheweb commented 8 years ago

Uploaded the log as an attachment

NBench_Release_Builds_10.txt

yishaigalatzer commented 8 years ago

@Aaronontheweb thanks. I'm heading out for the night, I (or Justin) will pick it up again in the morning. In the meantime I would modify the script to run each command on its own (if you agree with my suggestion of course)

Aaronontheweb commented 8 years ago

@yishaigalatzer I just added NBench to the guest account for our TeamCity server, so all of those links should be viewable now.

That's a good suggestion - we'll see about doing that.

xavierdecoster commented 8 years ago

@Aaronontheweb any reason you are pushing the regular nupkg to symbolsource instead of the symbols nupkg? It looks like you have it in your build artifacts (NBench.0.1.0.symbols.nupkg), but the build log indicates you are pushing NBench.0.1.0.nupkg instead.

We had consistent issues with this between last Thursday, 12/3, and Saturday, 12/5 - do you have any logs indicating that there were intermittent issues with the symbol servers?

@TripleEmcoder may provide details on server-side error handling as he owns SymbolSource and has access to the logs

Aaronontheweb commented 8 years ago

@xavierdecoster

The normal package is going to NuGet.org, and the Symbols package is going to SymbolSource:

[17:45:42][Step 1/1] Pushing NBench 0.1.0 to the NuGet gallery (https://www.nuget.org)...
[17:45:47][Step 1/1] Your package was pushed.
[17:45:47][Step 1/1] Pushing NBench 0.1.0 to the symbol server (https://nuget.smbsrc.net/)...
[17:46:42][Step 1/1] Pushing D:\BuildAgent\work\2de53b95e0be9250\bin\nuget\NBench.0.1.0.nupkg Attempts left: 2
[17:46:42][Step 1/1] Pushing NBench 0.1.0 to the NuGet gallery (https://www.nuget.org)...
[17:46:43][Step 1/1] Failed to process request. 'A package with id 'NBench' and version '0.1.0' already exists and cannot be modified.'. 
yishaigalatzer commented 8 years ago

[17:45:47][Step 1/1] Pushing NBench 0.1.0 to the symbol server (https://nuget.smbsrc.net/)... I think @xavierdecoster is referring to this line, it just reads a bit confusing IMHO.

@Aaronontheweb do you still believe the original issue opened here is valid? Or should we just try and repro the silent (no error message, but correct exit code) for failing to push a symbol package?

Aaronontheweb commented 8 years ago

@yishaigalatzer it looks to me like a silent error as a result of an intermittent error when pushing a symbol package. You can close this issue.