buildkite / agent

The Buildkite Agent is an open-source toolkit written in Go for securely running build jobs on any device or network
https://buildkite.com/
MIT License

Artifact download doesn’t use most recent artifact when names collide #803

Open jmendiara opened 6 years ago

jmendiara commented 6 years ago

Take the following scenario:

$ ls logs
logfile.txt

steps:
  - command: exit 0
    artifact_paths:
      - "logs/*"
      - "logs/*"

logfile.txt is uploaded twice by the agent and stored twice in the backend. Downloading it with buildkite-agent artifact download logs/logfile.txt logs/ fails with:

Failed to download artifacts: GET https://agent.buildkite.com/v3/builds/xxx/artifacts/search?query=logs%2Flogfile.txtl: 
 400 Multiple artifacts were found for query: `logs/logfile.txt`. Try scoping by the job ID or name.

Is this the correct behaviour? Shouldn't the agent filter out duplicate files when expanding paths?

Also, I think (but am not sure) that running this in the command for the same job may result in three files with the same SHA:

buildkite-agent artifact upload "logs/logfile.txt"

Should the backend deduplicate files with the same SHA within the same job?

lox commented 6 years ago

We'll investigate!

donparapidos commented 5 years ago

Hello,

How is the investigation on this one going? I often hit the issue as described:

buildkite-agent artifact upload application-$BUILDKITE_BUILD_NUMBER.nomad

In the next step I use the artifact plugin to download application-$BUILDKITE_BUILD_NUMBER.nomad and get:


Downloading artifacts (0s)
2019-10-06 15:22:43 INFO   Searching for artifacts: "application-68.nomad"
2019-10-06 15:22:43 FATAL  Failed to download artifacts: GET https://agent.buildkite.com/v3/builds/xxxxxxx/artifacts/search?query=application-68.nomad: 400 Multiple artifacts were found for query: `application-68.nomad`. Try scoping by the job ID or name.

O1ahmad commented 4 years ago

+1. I'm also experiencing this: a generated pipeline uploads multiple instances of an artifact with identical names/paths, so downloads of that artifact name in downstream steps fail with a "400 Multiple artifacts were found for query" error:

2020-08-14 17:31:10 FATAL  Failed to download artifacts: GET https://agent.buildkite.com/v3/builds/562798f2-c9ca-4cea-9449-4c8485bf1d77/artifacts/search?query=DOCKER_DEPLOY_ENV: 400 Multiple artifacts were found for query: `DOCKER_DEPLOY_ENV`. Try scoping by the job ID or name.
🚨 Error: The command exited with status 1

pda commented 4 years ago

#1268 fixes the case where a single buildkite-agent artifact upload would upload the same file twice if it's matched by multiple paths/globs. That includes the array version of artifact_paths in YAML pipelines.
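
For readers following along, the idea of collapsing paths matched by more than one glob can be sketched roughly as below. This is only an illustration under assumptions, not the code from that change: dedupePaths and the surrounding program are hypothetical names, and the agent's real uploader may normalise paths differently.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// dedupePaths is a hypothetical helper (not the agent's API). It returns each
// path once, in first-seen order, after normalising it with filepath.Clean.
func dedupePaths(paths []string) []string {
	seen := make(map[string]struct{}, len(paths))
	var unique []string
	for _, p := range paths {
		clean := filepath.Clean(p)
		if _, dup := seen[clean]; dup {
			continue // already matched by an earlier pattern
		}
		seen[clean] = struct{}{}
		unique = append(unique, clean)
	}
	return unique
}

func main() {
	// The same glob listed twice, as in the artifact_paths example above.
	patterns := []string{"logs/*", "logs/*"}

	var matched []string
	for _, pattern := range patterns {
		files, err := filepath.Glob(pattern)
		if err != nil {
			fmt.Println("bad pattern:", pattern, err)
			return
		}
		matched = append(matched, files...)
	}

	// Each file is listed once, no matter how many patterns matched it.
	for _, p := range dedupePaths(matched) {
		fmt.Println(p)
	}
}
```

Deduplicating after expansion, rather than comparing the patterns themselves, also covers different globs that happen to match the same file.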

I suspect some people in this issue are experiencing the issue due to separate uploads using the same path, perhaps from different steps within the same build. In those cases, I'd suggest ensuring a path is only uploaded once in the build. You might want to incorporate BUILDKITE_JOB_ID (a server-generated UUID) in the filename? Admittedly that will make it harder to find afterwards.
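
A minimal sketch of that filename suggestion, assuming the job ID is simply inserted before the file extension. uniqueArtifactName is a hypothetical helper, not part of the agent; BUILDKITE_JOB_ID is the environment variable mentioned above.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// uniqueArtifactName is a hypothetical helper. It inserts jobID before the
// file extension so that uploads from different jobs no longer share a path.
// With an empty jobID the path is returned unchanged.
func uniqueArtifactName(path, jobID string) string {
	if jobID == "" {
		return path
	}
	ext := filepath.Ext(path) // "" when the final path element has no dot
	base := strings.TrimSuffix(path, ext)
	return base + "-" + jobID + ext
}

func main() {
	// BUILDKITE_JOB_ID is set by Buildkite inside a job; it is empty elsewhere.
	jobID := os.Getenv("BUILDKITE_JOB_ID")
	fmt.Println(uniqueArtifactName("logs/logfile.txt", jobID))
}
```

The downstream step then needs to know the uploading job's ID (or use a glob) to find the file, which is the trade-off noted above.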

lk86 commented 4 years ago

@0x0I and I discovered this issue largely because buildkite-agent artifact download does not work according to the documentation, and #1268 does not resolve that. Quote from the documentation:

"The buildkite-agent artifact command will find the most recent file uploaded with a matching filename, no matter which build step uploaded it. If you want to target an artifact from a particular build step use the --step argument."

As everyone in this thread has experienced, it does not download the most recent file; it only downloads files whose names are unique within a pipeline's build. If this is not the intended behavior, we can certainly work around it with the --step solution, but the documentation remains incredibly misleading.
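
To make the documented "most recent file" behaviour concrete, here is a rough sketch of what resolving a name collision by upload time could look like. Everything in it is an assumption for illustration: the Artifact type, its fields, and mostRecent are hypothetical and do not reflect the agent's or the backend's actual data model.

```go
package main

import "fmt"

// Artifact is a hypothetical record used only for this illustration.
type Artifact struct {
	Path         string
	JobID        string
	UploadedUnix int64 // upload time as Unix seconds (assumed field)
}

// mostRecent returns the newest artifact whose Path equals query.
// The boolean is false when nothing matches.
func mostRecent(artifacts []Artifact, query string) (Artifact, bool) {
	var best Artifact
	found := false
	for _, a := range artifacts {
		if a.Path != query {
			continue
		}
		if !found || a.UploadedUnix > best.UploadedUnix {
			best = a
			found = true
		}
	}
	return best, found
}

func main() {
	artifacts := []Artifact{
		{Path: "DOCKER_DEPLOY_ENV", JobID: "job-1", UploadedUnix: 1597426200},
		{Path: "DOCKER_DEPLOY_ENV", JobID: "job-2", UploadedUnix: 1597426800},
	}
	// Instead of failing on the collision, pick the later upload.
	if a, ok := mostRecent(artifacts, "DOCKER_DEPLOY_ENV"); ok {
		fmt.Println("would download the copy uploaded by", a.JobID)
	}
}
```

Under this reading, a collision would resolve to the later upload instead of returning the 400 error shown earlier in the thread.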

pda commented 4 years ago

Oh interesting — I hadn't realised the docs said that. I'll reopen the issue.