actions / download-artifact

MIT License

[bug] only 600 of 1243 artifacts are downloaded #363

Open kagahd opened 1 week ago

What happened?

We have to run hundreds of workflow jobs concurrently (although the strategy matrix is limited to max-parallel: 10). Each job produces one to four artifacts. All of these artifacts (around 1200) are then collected and aggregated into a final artifact that is uploaded. The problem is that not all 1200 artifacts are downloaded, so the aggregate is missing data. We noticed the following warning:

2024-11-16T08:23:26.9639438Z ##[warning]Workflow run 11844124814 has more than 1000 artifacts. Results will be incomplete as only the first 1000 artifacts will be returned

Nevertheless, only 600 artifacts were found, as the log file shows:

2024-11-16T08:23:26.9808235Z ##[debug]Fetching page 2 of artifact list
2024-11-16T08:23:27.0948608Z ##[debug]Fetching page 4 of artifact list
2024-11-16T08:23:27.2910549Z ##[debug]Fetching page 6 of artifact list
2024-11-16T08:23:27.4662867Z ##[debug]Fetching page 8 of artifact list
2024-11-16T08:23:27.6195180Z ##[debug]Fetching page 10 of artifact list
2024-11-16T08:23:27.7675981Z Found 600 artifact(s)
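
The debug lines only mention even page numbers, and 600 is exactly six pages of 100 (page 1 plus pages 2, 4, 6, 8 and 10). One possible explanation is a listing loop that advances its page counter twice per iteration. The Python sketch below only simulates that hypothesis; it is not taken from the action's source, and the page size of 100 and the name list_with_skipped_pages are assumptions.

```python
import math

PAGE_SIZE = 100       # assumed page size (600 found / 6 pages fetched)
MAX_ARTIFACTS = 1000  # cap named in the warning message


def list_with_skipped_pages(total_artifacts):
    """Simulate a listing loop that increments the page counter twice per pass."""
    capped = min(total_artifacts, MAX_ARTIFACTS)
    number_of_pages = math.ceil(capped / PAGE_SIZE)

    def page(n):
        # Indices of the artifacts on 1-based page n.
        start = (n - 1) * PAGE_SIZE
        return list(range(start, min(start + PAGE_SIZE, capped)))

    fetched_pages = [1]
    found = page(1)
    current = 1
    while current < number_of_pages:
        current += 1  # increment inside the loop body
        fetched_pages.append(current)
        found.extend(page(current))
        current += 1  # hypothesized second increment (e.g. from a for-loop header)
    return fetched_pages, len(found)


print(list_with_skipped_pages(1243))  # ([1, 2, 4, 6, 8, 10], 600)
```

With 1243 artifacts the simulation fetches the same pages as the log and ends up with the same count of 600, so the hypothesis is at least consistent with the observed output.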

Therefore, only 600 artifacts were downloaded:

2024-11-16T08:23:27.7678410Z ##[debug]Found 600 artifacts in run
2024-11-16T08:23:27.7681603Z No input name or pattern filtered specified, downloading all artifacts
2024-11-16T08:23:27.7682882Z Preparing to download the following artifacts:
...[600 further lines, one for each downloaded artifact]...

However, the run actually contains 1243 artifacts, as reported by the following command:

curl -L -H "Accept: application/vnd.github+json" -H "Authorization: Bearer $TOKEN"   https://api.github.com/repos/company/my-repo/actions/runs/11844124814/artifacts
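
For cross-checking, the full list can also be walked page by page through the same REST endpoint. The following is a minimal sketch, assuming the response carries total_count and an artifacts array and that the endpoint accepts per_page and page query parameters; the helper names fetch_page and list_all_artifact_names are made up for illustration.

```python
import json
import urllib.request

API = "https://api.github.com"


def fetch_page(owner, repo, run_id, token, page, per_page=100):
    """Fetch one page of a run's artifact list and return the decoded JSON."""
    url = (f"{API}/repos/{owner}/{repo}/actions/runs/{run_id}/artifacts"
           f"?per_page={per_page}&page={page}")
    req = urllib.request.Request(url, headers={
        "Accept": "application/vnd.github+json",
        "Authorization": f"Bearer {token}",
    })
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def list_all_artifact_names(get_page):
    """Visit every page in order; get_page(n) returns the JSON body of page n."""
    names = []
    page = 1
    while True:
        body = get_page(page)
        batch = body["artifacts"]
        names.extend(artifact["name"] for artifact in batch)
        # Stop on an empty page or once everything reported has been collected.
        if not batch or len(names) >= body["total_count"]:
            return names
        page += 1


# Example call (needs network access and a token in the TOKEN variable):
# import os
# names = list_all_artifact_names(
#     lambda n: fetch_page("company", "my-repo", 11844124814, os.environ["TOKEN"], n))
# print(len(names))
```

Passing the page fetcher in as a callable keeps the pagination logic separate from the HTTP call, so the loop can be checked against a fake response without touching the API.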

What did you expect to happen?

I expected download-artifact@v4 to download all artifacts, since I did not specify the name of a particular artifact. The documentation for the name input parameter states:

If unspecified, all artifacts for the run are downloaded.

That is evidently not the case. At the very least, the documentation should state that at most 1000 artifacts, and possibly only 600, are downloaded.

After seeing the following log message, I expected at least 1000 artifacts to be downloaded.

2024-11-16T08:23:26.9639438Z ##[warning]Workflow run 11844124814 has more than 1000 artifacts. Results will be incomplete as only the first 1000 artifacts will be returned

But even that did not hold, because download-artifact@v4 found and downloaded only 600 artifacts.

How can we reproduce it?

Create a workflow with a matrix job that uploads 1200 artifacts. Then add a subsequent job that downloads all artifacts with actions/download-artifact@v4, without setting name or pattern.

Anything else we need to know?

I also tried the input parameter pattern: "*report-v4-DEV" to reduce the number of artifacts downloaded at once. This failed as well: only 103 of the 250 artifacts matching the pattern were downloaded.

Warning: Workflow run 11844124814 has more than 1000 artifacts. Results will be incomplete as only the first 1000 artifacts will be returned
##[debug]Fetching page 2 of artifact list
##[debug]Fetching page 4 of artifact list
##[debug]Fetching page 6 of artifact list
##[debug]Fetching page 8 of artifact list
##[debug]Fetching page 10 of artifact list
Found 600 artifact(s)
##[debug]Found 600 artifacts in run
Filtering artifacts by pattern '*report-v4-DEV'
##[debug]Filtered from 600 to 103 artifacts
Preparing to download the following artifacts:
...[103 further lines, one for each downloaded artifact]...
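
The log suggests that the pattern is applied to the already truncated list of 600 artifacts, which would explain why fewer matches come back than exist in the run. The small sketch below illustrates the effect with made-up artifact names; Python's fnmatch stands in for whatever glob matcher the action uses, and the page size of 100 is an assumption.

```python
from fnmatch import fnmatchcase

PAGE_SIZE = 100  # assumed page size


def filter_by_pattern(names, pattern):
    """Keep only the names that match a shell-style glob pattern."""
    return [name for name in names if fnmatchcase(name, pattern)]


# Made-up names: every fourth artifact ends in "report-v4-DEV".
all_names = [
    f"job-{i:04d}-report-v4-DEV" if i % 4 == 0 else f"job-{i:04d}-other"
    for i in range(1000)
]

# Only the pages that appear in the log: page 1 plus pages 2, 4, 6, 8 and 10.
listed = [
    name
    for p in (1, 2, 4, 6, 8, 10)
    for name in all_names[(p - 1) * PAGE_SIZE:p * PAGE_SIZE]
]

full = filter_by_pattern(all_names, "*report-v4-DEV")
partial = filter_by_pattern(listed, "*report-v4-DEV")
print(len(full), len(partial))  # 250 150
```

The exact numbers depend on how the matching names are spread across pages, so this does not reproduce the 103 from the log; it only shows that filtering an incomplete list loses matches.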

What version of the action are you using?

v4

What are your runner environments?

linux

Are you on GitHub Enterprise Server? If so, what version?

No response