Azure / azure-dev

A developer CLI that reduces the time it takes for you to get started on Azure. The Azure Developer CLI (azd) provides a set of developer-friendly commands that map to key stages in your workflow - code, build, deploy, monitor, repeat.
https://aka.ms/azd
MIT License

engsys: idempotent publishing for re-runs #4361

Open weikanglim opened 1 month ago

weikanglim commented 1 month ago

Is there a general strategy for making artifact publishing idempotent across job re-runs?

Example:

In publish-cli.yml (and anywhere else), when we have a pipeline artifact being published:

  - task: 1ES.PublishPipelineArtifact@1
    inputs: 
      targetPath: release
      artifact: UploadedReleaseArtifacts

If a subsequent step, such as create-pull-request, then fails due to a flaky issue, a re-run hits this error:

[error]Artifact UploadedReleaseArtifacts already exists for build 4153919.

Here's a recent build run demonstrating this issue.

danieljurek commented 1 month ago

In this particular case, the assumption was that the GitHub token would not be rate limited. A recent test of some automation broke that assumption, but that should not happen going forward.

A few ways to solve this would be to:

  1. Use a token that's less likely to be subject to rate limits
  2. Further subdivide jobs so that artifact uploads and tasks that may depend on those artifacts are separated and the dependent jobs can be retried
  3. Remove some artifact publishing from publish-cli.yml in some cases because it's not strictly necessary (we shouldn't do this)
  4. Skip the artifact publish task on failure (assume that the previous run produced the right artifact and uploaded it)
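One reading of option 4 can be sketched as a step condition. This is a minimal sketch, not the repository's actual pipeline: it assumes the predefined variable `System.JobAttempt` is 1 on the first attempt and increments on the re-run in question, and it assumes the first attempt got far enough to upload the artifact. If the first attempt failed before publishing, a re-run under this condition would skip the upload and leave the artifact missing.

```yaml
steps:
  # Assumption: System.JobAttempt is 1 on the first attempt and higher on re-runs.
  # Publish only on the first attempt so a re-run does not hit
  # "Artifact UploadedReleaseArtifacts already exists".
  - task: 1ES.PublishPipelineArtifact@1
    condition: and(succeeded(), eq(variables['System.JobAttempt'], 1))
    inputs:
      targetPath: release
      artifact: UploadedReleaseArtifacts
```

The trade-off is that the re-run trusts whatever the earlier attempt uploaded, which matches the "assume that the previous run produced the right artifact" caveat in option 4.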

I'm closing this since it was caused by a one-off event, but if it happens again we'd probably start by ensuring that automation tests use a different token.

weikanglim commented 1 month ago

@danieljurek Is there a property for re-uploading in-place? I feel like this would be acceptable for a non-production artifact.
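Nothing in this thread confirms an overwrite-in-place property. A possible alternative, sketched here under assumptions, is to make the artifact name unique per attempt so that a re-run publishes a new artifact instead of colliding with the old one. The `-attempt` suffix is a hypothetical naming choice, and the sketch assumes `System.JobAttempt` increments on each re-run of the job.

```yaml
steps:
  # Assumption: System.JobAttempt differs between the original run and each re-run,
  # so every attempt publishes under a distinct artifact name.
  - task: 1ES.PublishPipelineArtifact@1
    inputs:
      targetPath: release
      # Hypothetical naming scheme: UploadedReleaseArtifacts-attempt1, -attempt2, ...
      artifact: UploadedReleaseArtifacts-attempt$(System.JobAttempt)
```

Any later step or pipeline that downloads the artifact by name would need to account for the suffix, so this fits non-production artifacts better than ones consumed elsewhere.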

Btw: I hit this twice again today. Here's another instance: https://dev.azure.com/azure-sdk/internal/_build/results?buildId=4172035&view=results

Another tactical thing: can we move the step that adds the PR comment earlier? (I haven't looked at the implementation here.)

danieljurek commented 1 month ago

Reopen! This is still a problem; let's do something about it.