dotnet / arcade

Tools that provide common build infrastructure for multiple .NET Foundation projects.
MIT License

Request failed "500 Internal Server Error" #11737

Open ulisesh opened 1 year ago

ulisesh commented 1 year ago

Build

https://dev.azure.com/dnceng-public/cbb18261-c48f-4abb-8651-8cdcb5474649/_build/results?buildId=93516

Build leg reported

Test: Blazor E2E tests on Linux / Run E2E tests

Pull Request

https://github.com/dotnet/aspnetcore/pull/44834

Action required for the engineering services team

To triage this issue (First Responder / @dotnet/dnceng):

If this issue is causing build breaks across multiple builds and would benefit from being listed on the build analysis check, follow these steps:

  1. Add the label "Known Build Error"
  2. Edit this issue and add an error string in the JSON below that can help us match this issue with future build breaks. Refer to the known issues documentation for the format.
 {
    "ErrorPattern" : "An unexpected error occurred: \"https://pkgs.dev.azure.com/dnceng/public/_packaging/.*: Request failed \\\\\"500 Internal Server Error\\\\\"",
    "BuildRetry": true
 }
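The double escaping in the pattern above is easy to misread, so here is a minimal sketch of how it decodes and what it matches. It assumes the `ErrorPattern` value is evaluated as a regular expression after JSON decoding, and it uses Python's `re` module as a stand-in for whatever engine build analysis actually uses. The variable names and the `example.org` URL are illustrative only; the matching log line is the one quoted later in this thread.

```python
import json
import re

# Known-issue JSON as it appears in the issue body (kept raw so the
# backslashes reach the JSON parser unchanged).
known_issue_json = r'''
{
   "ErrorPattern" : "An unexpected error occurred: \"https://pkgs.dev.azure.com/dnceng/public/_packaging/.*: Request failed \\\\\"500 Internal Server Error\\\\\"",
   "BuildRetry": true
}
'''

known_issue = json.loads(known_issue_json)

# Assumption: the pattern is treated as a regex. After JSON decoding, the
# sequence \\" in the pattern means "a literal backslash followed by a quote",
# which is how yarn prints the nested quotes in its error message.
pattern = re.compile(known_issue["ErrorPattern"])

# Log line reported from dotnet/aspnetcore in this thread.
feed_failure = r'error An unexpected error occurred: "https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet-public-npm/npm/registry/istanbul-lib-report/-/istanbul-lib-report-3.0.0.tgz: Request failed \"500 Internal Server Error\"".'

# Hypothetical 500 from a host that is not the AzDO package feed.
other_failure = r'error An unexpected error occurred: "https://example.org/pkg.tgz: Request failed \"500 Internal Server Error\"".'

matches_feed = pattern.search(feed_failure) is not None
matches_other = pattern.search(other_failure) is not None
```

Under these assumptions the feed failure matches and the unrelated 500 does not, which is the separation the feed URL prefix in the pattern is meant to provide.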

Release Note Category

Additional information about the issue reported

No response

Report

Summary

24-Hour Hit Count | 7-Day Hit Count | 1-Month Count
0 | 0 | 0

ChadNedzlek commented 1 year ago

Could we potentially include the "https://pkgs.dev.azure.com/dnceng/public/_packaging" part of the message in the error string? Seems like we'd want to separate "AzDO package feeds are failing" from other random 500s that might occur during a build.

ulisesh commented 1 year ago

@ChadNedzlek done

ulisesh commented 1 year ago

Build retry seems to be helping, but we keep getting a couple of hits every day.

ulisesh commented 1 year ago

Unfortunately, we keep seeing some hits every day. FR should investigate more; we might need to create an IcM to get some help from AzDO.

ulisesh commented 1 year ago

It is interesting to me that the only hits we see come from the aspnetcore repo

MattGal commented 1 year ago

Sure, I'll create a fresh IcM asking for an investigation.

MattGal commented 1 year ago

> It is interesting to me that the only hits we see come from the aspnetcore repo

The reason this is specific to AspNet is likely that most .NET Core repos do not use NPM to this extent, so they have no, or far fewer, chances to hit NPM problems.

Created https://portal.microsofticm.com/imp/v3/incidents/details/359098770 to ask for an investigation.

MattGal commented 1 year ago

Replied to requests in the IcM; I left step-by-step instructions on how to get precise timestamps of what failed and such. Hopefully they actually believe us now.

MattGal commented 1 year ago

Pinged the IcM; no replies since updating it. Tossing this issue into tracking.

MattGal commented 1 year ago

Still reproing. The IcM ticket still just claims they don't have telemetry for the problem. Added a repro from last night to the IcM and pinged the ticket.

MattGal commented 1 year ago

With 0 hits in the last 7 days, I am closing the issue on our side.

dougbu commented 1 year ago

Reopening for the dotnet-public-npm feed errors we're seeing in dotnet/aspnetcore (a lot). For example,

  error An unexpected error occurred: "https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet-public-npm/npm/registry/istanbul-lib-report/-/istanbul-lib-report-3.0.0.tgz: Request failed \"500 Internal Server Error\"".
  info If you think this is a bug, please open a bug report with the information provided in "/Users/runner/work/1/s/src/SignalR/clients/ts/common/yarn-error.log".
  info Visit https://yarnpkg.com/en/docs/cli/install for documentation about this command.
  Process stalled
  Active handles:
    - Socket
    - Socket
    - Socket
    - TLSSocket
    - TLSSocket
    - TLSSocket
    - TLSSocket
    - TLSSocket
/Users/runner/work/1/s/eng/targets/Npm.Common.targets(45,5): error MSB6006: "yarn" exited with code 1. [/Users/runner/work/1/s/src/SignalR/clients/ts/common/common.npmproj]
##[error]eng/targets/Npm.Common.targets(45,5): error MSB6006: (NETCORE_ENGINEERING_TELEMETRY=Restore) "yarn" exited with code 1.

Affected aspnetcore-ci rolling builds over the last week:

dougbu commented 1 year ago

Will those rolling builds be included in the tracking information automatically❔ Or, did I just set things up to track future failures (again)❔

MattGal commented 1 year ago

Any build that matches the error string should trigger it. We may need to edit the original post and tweak the string; I've asked Ulises to take a peek and see if there's a reason it missed these.

ulisesh commented 1 year ago

I looked at #20230211.3 and couldn't figure out why build analysis couldn't find a match with this issue, but I found a rolling build where the right things happen. I'll continue investigating.

https://dev.azure.com/dnceng-public/public/_build/results?buildId=169375&view=logs&j=366a1024-3a8a-5d08-6c00-a4b17dce0d38&t=7c220d1f-55ee-5c39-b8a6-7373cb0d28ee&s=6884a131-87da-5381-61f3-d7acc3b91d76

MattGal commented 1 year ago

This is still happening, and I am being told that the linked IcM is the same as https://portal.microsofticm.com/imp/v3/incidents/details/353857134/home, one about NuPKG. This is frustrating and, I believe, inaccurate, but not something I can do anything about. I will assign this issue to @ilyas1974 to keep it moving.

markwilkie commented 1 year ago

@AlitzelMendez - any chance you could take a peek and see if the known issue is catching this now?

MattGal commented 1 year ago

> @AlitzelMendez - any chance you could take a peek and see if the known issue is catching this now?

If you just click the top link on the issue, it shows an instance happening 1 hour ago, so I'm pretty sure the system is working.

The AzDO packaging team is evidently aware of the problem, has some ideas for folks using the version of Yarn that ASP.NET uses, and has prepped a guide; @dougbu FYI

dougbu commented 1 year ago

I brought this up in our ASP.NET Build teams channel. @wtgodbe is going to try adding retries to our yarn commands. It's a bit complicated because we use the Yarn.MSBuild package and that wraps commands.

Our efforts to move to npm aren't really moving yet but are planned for this year (.NET 8 timeframe).
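For readers unfamiliar with the retry idea mentioned above, here is a minimal sketch of retrying a command that fails intermittently. It is not how Yarn.MSBuild or the aspnetcore build does it; `run_with_retries` and its `attempts`, `delay_seconds`, and `sleep` parameters are hypothetical names chosen for illustration, and the linear backoff is an assumption.

```python
import subprocess
import time


def run_with_retries(command, attempts=3, delay_seconds=5.0, sleep=time.sleep):
    """Run `command`, retrying on a non-zero exit code.

    Returns the exit code of the last attempt (0 on success).
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")

    for attempt in range(1, attempts + 1):
        result = subprocess.run(command)
        if result.returncode == 0:
            return 0
        if attempt < attempts:
            # Linear backoff: wait a little longer after each failure so a
            # briefly unhealthy feed has time to recover.
            sleep(delay_seconds * attempt)

    return result.returncode


# Hypothetical usage for a restore step that hits transient feed errors:
#   exit_code = run_with_retries(["yarn", "install"], attempts=3)
```

A wrapper like this retries on any failure, not only on 500 responses from the feed, so a genuine restore error would be retried as well before the build finally fails.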