octokit / request-action

A GitHub Action to send arbitrary requests to GitHub's REST API
https://github.com/marketplace/actions/GitHub-API-Request
MIT License
368 stars 48 forks source link

Auto-paginated through responses spread over multiple pages #181

Open dhirschfeld opened 2 years ago

dhirschfeld commented 2 years ago

I'm trying to get tags associated with container images by hitting the below endpoint:

  route: GET /orgs/{org}/packages/container/{container}/versions

When using this action the returned data appears to be truncated to 30 items!?!

Using the gh cli correctly returns all image ids/tags:

data=$(gh api -X GET --paginate "/orgs/${org}/packages/container/${container}/versions")
echo $data | jq 'map({(.id|tostring):  .metadata.container.tags}) | add'

Am I missing something obvious (e.g. a paginate argument) or is this a bug?

action.yml ```yaml name: Get Image Tags description: 'Returns the tags associated with a container image.' inputs: repo: required: true type: string container: required: true type: string ghcr_token: required: true outputs: images: description: "A mapping of image ids to image tags." value: ${{ steps.set_output.outputs.images }} runs: using: "composite" steps: - id: parse_repo shell: bash run: | set -euox pipefail IFS='/' read owner repo <<< ${{ inputs.repo }} echo "::set-output name=org::${owner}" echo "::set-output name=container::${repo,,}/${{ inputs.container }}" - id: get_images # https://github.com/octokit/request-action uses: octokit/request-action@v2.x env: GITHUB_TOKEN: ${{ inputs.ghcr_token }} with: org: ${{ steps.parse_repo.outputs.org }} container: ${{ steps.parse_repo.outputs.container }} route: GET /orgs/{org}/packages/container/{container}/versions - id: set_output shell: bash run: | set -euox pipefail images=$(echo -e ${{ toJSON(steps.get_images.outputs.data) }} | jq -c 'map({(.id|tostring): .metadata.container.tags}) | add') echo "::set-output name=images::${images}" ```
dhirschfeld commented 2 years ago

Yup, looks like a pagination issue. In the debug logs I see this in the headers:

"link":
    <https://api.github.com/organizations/<ORG>/packages/container/<CONTAINER>/versions?page=2&per_page=30>; rel="next",
    <https://api.github.com/organizations/<ORG>/packages/container/<CONTAINER>/versions?page=2&per_page=30>; rel="last"
dhirschfeld commented 2 years ago

So, the question is how to return all results, similarly to the --paginate argument in the gh cli?

dhirschfeld commented 2 years ago

Ok, changing the UR to the below works:

  route: GET /orgs/{org}/packages/container/{container}/versions?per_page=100

...but this will break if you have >100 image versions šŸ¤¢

That's just a hack - it doesn't fix the issue, it just pushes the goalposts a bit further out.

If there really is no way to specify to return all results (similar to the CLI) I guess this issue can be turned into a feature request for that functionality.

gr2m commented 2 years ago

Sorry pagination is not supported by this action. It's a supper thin wrapper to send requests to GitHub's REST API. I'd suggest to use an alternative action in this case.

dhirschfeld commented 2 years ago

Silent truncation of data is a pretty user-hostile default which I think deserves a big warning in the README.

Warning This action only returns the first page of data, which by default is only the first 30 results. If you need more than 30 results you can append the ?per_page=100 query to your url. If you need more than 100 results you will need to use another tool to request all of the data.

gr2m commented 2 years ago

The action is a tool to utilize GitHub's REST API, the REST API documentation should be clear enough about the pagination limitation.

I do agree however that the description at https://docs.github.com/en/rest/packages#get-all-package-versions-for-a-package-owned-by-an-organization is misleading.

Returns all package versions for a package owned by an organization

Should be reworded to

Lists package versions for a package owned by an organization

That would be coherent with other List ... endpoints. It would be great if you could file an issue with the docs team

timrogers commented 2 years ago

The action is a tool to utilize GitHub's REST API, the REST API documentation should be clear enough about the pagination limitation.

I do agree however that the description at https://docs.github.com/en/rest/packages#get-all-package-versions-for-a-package-owned-by-an-organization is misleading.

Returns all package versions for a package owned by an organization

Should be reworded to

Lists package versions for a package owned by an organization

That would be coherent with other List ... endpoints. It would be great if you could file an issue with the docs team

I've created a PR internally at GitHub to fix this. Good spot!

timrogers commented 2 years ago

@gr2m Do you think it would be reasonable to add support for auto-pagination in this action? We do have the pagination plugin for Octokit, so it doesn't seem like a huge over-reach. We would have to think about whether we should paginate by default.

dhirschfeld commented 2 years ago

Do you think it would be reasonable to add a support for auto-pagination in this action?

I very much think it will be worthwhile. Even though my use-case isn't mission-critical, I still plan to move to a solution which lets me get all of the data - hopefully I won't need to!

It's all well and good targeting a *"thin wrapper", however the REST API implements a mechanism to get all of your data - just follow the links recursively. gh api is also a thin wrapper and it has implemented the pagination in a way that makes sense for that tool. Without providing a mechanism to implement pagination in this tool it's only implementing subset of the functionality.

For me, a tool designed to get data which by default will silently truncate your data and provides no mechanism to actually get all of your data doesn't make much sense.

timrogers commented 1 year ago

I think this would be a reasonable thing to add, but I don't think I'm going to have time to add it myself. I'll reopen this issue so the request is tracked.

gr2m commented 1 year ago

@gr2m Do you think it would be reasonable to add support for auto-pagination in this action

I would prefer we do not add auto-pagination.

There are tons of other features we could implement that is part of Octokit but not of this action. Before going down this path I'd suggest we update the README and link to workarounds if pagination is needed.

  1. You could create a dedicated action that does the pagination and link to that action
  2. We could link to https://github.com/actions/github-script