dotnet / dnceng

.NET Engineering Services
MIT License
24 stars 18 forks source link

Missing FORCE_QUEUE requirements should not fail the build before images are generated #793

Open riarenas opened 1 year ago

riarenas commented 1 year ago

If a PR is missing enough FORCE_QUEUE statements to make the system happy, it should still generate the images that were requested. It's fine to still fail the build, but at least let's not waste a whole costly and lengthy iteration attempt.

Release Note Category

riarenas commented 1 year ago

If we still want to fail fast when the requirements are not met, FORCE_QUEUE requirements should probably be similar to the release note requirement.

riarenas commented 10 months ago

Moving to helix-machines improvements epic. This is directly tied to that effort's business goals.

dougbu commented 10 months ago

some background here. the requirements changed a bit as I worked on https://github.com/dotnet/arcade/issues/13212 and its aftermath (#279, #295, and #319). during that work, we had a few conversations about the word "improve" but didn't end up agreeing on a complete revamp. some of the requirements, especially the basics like FORCE_QUEUE lines in commit descriptions haven't changed in ages and ages. on the other hand, more people were thinking about the requirements because either they got drawn in to the conversations or they needed to make helix-machines changes for the first time (or first time in a while).

this brings us to today. the current requirements are

  1. FORCE_QUEUE lines must be in a commit description (the logic automatically ignores commits in the target branch)
  2. PRs must force a minimum of two real and unique image generations
    1. except when validation builds won't run i.e., the only changes are to files excluded in our YAML (tools/ and children today) or the target branch's policy (which requires PRs for all but *; !/base-images/*; !*.md for main)
    2. "real" in this context means a queue exists with the given name. this is rather difficult to check outside the CreateCustomImage program. but, we could perhaps add a new option that only checks the names exist
    3. "unique" in this context means two named queues in the FORCE_QUEUE collection don't use the same custom image i.e., we'll reject FORCE_QUEUE Windows.10.Amd64 and FORCE_QUEUE Windows.10.Amd64.Open in the same PR. of course, this is also difficult to check outside CreateCustomImage

but, hey, at least case is ignored in queue names now :wink:


that said, I get it. Build C# and Test C# are just going to waste time if the pipeline will fail as soon as it hits Build Images. at least making the issue visible earlier for those who view in-progress builds — as currently happens w/ missing release notes — seems doable.

It's very important that the configuration for force_queues can be changed without pushing a new commit and restarting the entire build.

I'm less sure about this. first, the PR description is painful to read using the AzDO API from what I've seen (though I'm probably wrong). any, other suggestions beyond "use tags" (which I don't think is viable given many other potential uses for PR tags), "leave things in commit descriptions" and "move to PR descriptions"❔

/fyi my objection to "move to PR descriptions" was

the problem here is that the PR description becomes the default text of the merged commit. that adds useless information to the 'main' and production histories.

@riarenas thought including the information could be useful in the branch history and I agree that can happen. feels to me like the exception rather than the rule

dougbu commented 10 months ago

oh, I forgot: changing the PR description to add FORCE_QUEUE lines might not make the second attempt that much easier. if we have a separate fail fast stage and you catch things before Build Images starts, you're all set. however the usual case might be that you see the problem later and now need to retry failed jobs in two stages or wait 'til the full build ends (fails). it saves the Build C# and Test C# work (20 to 25 minutes lately) but could get annoying,