Closed dkurepa closed 2 years ago
I spent some time figuring this one out.
I saw there were duplicate tags in the linked build:
which made me take a look at the default channels for the repo:
darc get-default-channels --source-repo https://dev.azure.com/dnceng/internal/_git/dotnet-winforms-designer --channel ".NET
Core Tooling Release"
The result showed that there were both a regex match for any release branch, and a direct default channel for the 17.3p2 branch which ended up being expanded identically:
(4263) https://dev.azure.com/dnceng/internal/_git/dotnet-winforms-designer @ release/17.3p2 -> .NET Core Tooling Release
(4219) https://dev.azure.com/dnceng/internal/_git/dotnet-winforms-designer @ -regex:release/* -> .NET Core Tooling Release
After removing the regex entry, a retry of the build was able to publish.
The error stack for the publishing build shows there's a bug in our Equals implementation for TargetFeedConfig that we should fix so that we don't fail builds that may have duplicate channel entries.
D:\a\_work\1\s\.packages\microsoft.dotnet.arcade.sdk\7.0.0-beta.22276.1\tools\SdkTasks\PublishArtifactsInManifest.proj(144,5): error : Object reference not set to an instance of an object.
D:\a\_work\1\s\.packages\microsoft.dotnet.arcade.sdk\7.0.0-beta.22276.1\tools\SdkTasks\PublishArtifactsInManifest.proj(144,5): error : at Microsoft.DotNet.Build.Tasks.Feed.Model.TargetFeedConfig.Equals(Object obj) in /_/src/Microsoft.DotNet.Build.Tasks.Feed/src/model/TargetFeedConfig.cs:line 94
D:\a\_work\1\s\.packages\microsoft.dotnet.arcade.sdk\7.0.0-beta.22276.1\tools\SdkTasks\PublishArtifactsInManifest.proj(144,5): error : at System.Collections.Generic.HashSet`1.AddIfNotPresent(T value, Int32& location)
D:\a\_work\1\s\.packages\microsoft.dotnet.arcade.sdk\7.0.0-beta.22276.1\tools\SdkTasks\PublishArtifactsInManifest.proj(144,5): error : at Microsoft.DotNet.Build.Tasks.Feed.PublishArtifactsInManifestV3.ExecuteAsync() in /_/src/Microsoft.DotNet.Build.Tasks.Feed/src/PublishArtifactsInManifestV3.cs:line 170
##[error].packages\microsoft.dotnet.arcade.sdk\7.0.0-beta.22276.1\tools\SdkTasks\PublishArtifactsInManifest.proj(144,5): error : Object reference not set to an instance of an object.
at Microsoft.DotNet.Build.Tasks.Feed.Model.TargetFeedConfig.Equals(Object obj) in /_/src/Microsoft.DotNet.Build.Tasks.Feed/src/model/TargetFeedConfig.cs:line 94
at System.Collections.Generic.HashSet`1.AddIfNotPresent(T value, Int32& location)
at Microsoft.DotNet.Build.Tasks.Feed.PublishArtifactsInManifestV3.ExecuteAsync() in /_/src/Microsoft.DotNet.Build.Tasks.Feed/src/PublishArtifactsInManifestV3.cs:line 170
yay, the regex feature still works :)
I feel like there are two issues here to address:
1) Darc, when there are multiple default channels for a branch and they are the same channel, tries to publish to the same channel twice (should there be a Distinct() here: https://github.com/dotnet/arcade-services/blob/fc238be0112abfd0f8b895b476ce7410aba743c8/src/Microsoft.DotNet.Darc/src/Darc/Operations/AddBuildToChannelOperation.cs#L113-L116) 2) Publishing, when presented with the same channel twice gets.... confused. This can only happen in the case in # 1 (I think? I honestly am pretty confused by the error message/failure case and how to test it out to repro). So if we solve the former, we get the later for free. Of course, if this comes up just by virtue of the fact that a branch has the same default channel twice as a default channel, then that's something that definitely needs to be addressed.
Oh, and we should probably distinct targetChannelsIds here: https://github.com/dotnet/arcade/blob/main/src/Microsoft.DotNet.Build.Tasks.Feed/src/PublishArtifactsInManifestV3.cs#L97
Still.... not sure why there was a nullref in the Equals method? That's super weird
That Equals method is rather large, and there are a lot of unguarded dereferences. Probably one of the strings is null and we end up with null.Equals(<stuff>)
add null checks TO EVERYTHING
First PR (to distinct in add-build-to-channel): https://github.com/dotnet/arcade-services/pull/1922
@adiaaida this has been in rollout some time, can it be closed?
oh yeah. definitely. well. I think. @riarenas did we end up having to roll back one of the changes I made to darc for this?
We haven't reverted this, but we're still using old darc everywhere while we triage https://github.com/dotnet/arcade/issues/9948
A customer reported a build failing due to a darc error. It looks like it's an issue on our end. https://dev.azure.com/dnceng/internal/_build/results?buildId=1795967&view=logs&j=ba23343f-f710-5af9-782d-5bd26b102304&t=c7a8693b-2f9c-5ea8-c909-cde9405ac2e1&l=315
Original conversation link