Closed kapilt closed 1 year ago
Thanks for generating this list. We need to do a little more auditing on these. Another important attribute for operations that we can support pagination for is if the operation has a list in the output (and preferably only one list). Sometimes APIs will paginate over maps or have multiple lists and this can be problematic for us as we aggregate the output on behalf of the user (this is unique to boto3/CLI in comparison to other SDKs). If anybody wants to start taking a crack at these please go ahead, ideally each service as a separate PR.
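To make the "exactly one list in the output" check concrete, here is a minimal sketch. It assumes the service description follows the usual botocore service JSON layout (an `operations` map whose entries reference an `output` shape, and a `shapes` map with `type` and `members`). The function names and the sample model are hypothetical and only for illustration.

```python
def list_output_members(service_model, operation_name):
    """Return the top-level output members of an operation that are lists."""
    operation = service_model["operations"][operation_name]
    output_ref = operation.get("output")
    if output_ref is None:
        return []
    shapes = service_model["shapes"]
    members = shapes[output_ref["shape"]].get("members", {})
    return [
        name
        for name, ref in members.items()
        if shapes[ref["shape"]]["type"] == "list"
    ]


def has_single_list_output(service_model, operation_name):
    """True when the output has exactly one list to aggregate across pages."""
    return len(list_output_members(service_model, operation_name)) == 1


# Hypothetical, heavily trimmed service description (not a real AWS model).
SAMPLE_MODEL = {
    "operations": {
        "ListWidgets": {"output": {"shape": "ListWidgetsOutput"}},
        "DescribeThings": {"output": {"shape": "DescribeThingsOutput"}},
        "GetTags": {"output": {"shape": "GetTagsOutput"}},
    },
    "shapes": {
        "String": {"type": "string"},
        "WidgetList": {"type": "list", "member": {"shape": "String"}},
        "TagMap": {
            "type": "map",
            "key": {"shape": "String"},
            "value": {"shape": "String"},
        },
        "ListWidgetsOutput": {
            "type": "structure",
            "members": {
                "Widgets": {"shape": "WidgetList"},
                "NextToken": {"shape": "String"},
            },
        },
        "DescribeThingsOutput": {
            "type": "structure",
            "members": {
                "Things": {"shape": "WidgetList"},
                "Owners": {"shape": "WidgetList"},
                "NextToken": {"shape": "String"},
            },
        },
        "GetTagsOutput": {
            "type": "structure",
            "members": {
                "Tags": {"shape": "TagMap"},
                "NextToken": {"shape": "String"},
            },
        },
    },
}
```

In this sample, ListWidgets would pass the check, while DescribeThings (two lists) and GetTags (a map) would be flagged for manual review rather than getting a generated paginator.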
Any ETA on when these paginators will be available?
So that there's one place to look for all of these, here's a list of other PRs I've found that add pagination definitions -
I've done one-off contributions on paginators before. I suggest a holistic approach geared towards coverage, i.e. default generators based on this heuristic, followed by API-method-specific customization as needed.
I went ahead and extended this snippet to generate paginators while taking into account the additional constraints on merged output fields. If I send a PR per service, it's on the order of 60+ PRs. Is that preferable to a single PR?
I just updated #1548 to fix a test failure. I agree that handling this at an individual service level seems prone to failure. It'd be great to have the paginator config auto-generated from the service JSON instead of being manually updated (or not, as the case may be).
@kapilt I have a script that generates paginators, constrained heavily to only generate those that I can be absolutely certain about. I just merged a PR stemming from that that added a ton. What we really need to do is to have that be part of our release automation. Right now we're relying pretty much entirely on getting them from upstream.
@JordonPhillips thanks. One thing missing from the PR (https://github.com/boto/botocore/pull/1633), as far as I can see, was the script used to generate the paginators, which ideally would be part of the source tree as well (in a scripts directory, perhaps).
The script doesn't identify methods where the token is not named NextToken, as can happen. For example, cognito-idp uses the parameter NextToken for list_users_in_group, but uses PaginationToken for list_users. I'm not sure how many other services use different naming conventions for the name of the token used for pagination.
Looks like this is still an issue:
client = boto3.client('cloudwatch', region_name='us-east-1')
paginator = client.get_paginator('get_metric_data')
Traceback (most recent call last):
...
paginator = client.get_paginator('get_metric_data')
File ".../env/lib/python3.6/site-packages/botocore/client.py", line 387, in get_paginator
if not self.can_paginate(operation_name):
File "/usr/local/google/home/cohenjon/Source/outline-electron-metrics/env/lib/python3.6/site-packages/botocore/client.py", line 420, in can_paginate
actual_operation_name = self._PY_TO_OP_NAME[operation_name]
KeyError: 'get_metric_data'
@JonathanDCohen I suspect you're on an out-of-date version of boto3. Have you tried updating? I'm on a version from March and it works fine.
>>> import boto3
>>> client = boto3.client('cloudwatch')
>>> client.get_paginator('get_metric_data')
<botocore.client.CloudWatch.Paginator.GetMetricData object at 0x7f7e9e3bdfd0>
>>> boto3.__version__
'1.9.122'
@brandond yeah that was it. All figured out :)
The kinesis.list_shards paginator has issues; the API interface is problematic in that it wants StreamName to be set on the first call, but not set when NextToken is passed. (ugh.)
Python 3.7.5 (default, Nov 1 2019, 02:16:23)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.10.2 -- An enhanced Interactive Python. Type '?' for help.
In [1]: import boto3.session
In [2]: boto3.__version__
Out[2]: '1.10.43'
In [3]: kinesis = boto3.session.Session(profile_name="xxxxxxxx").client("kinesis")
In [4]: paginator = kinesis.get_paginator("list_shards").paginate(StreamName="my-stream-name", PaginationConfig={"PageSize": 2})
In [5]: for page in paginator:
   ...:     print(page)
   ...:
{'Shards': [...], 'NextToken': '...', 'ResponseMetadata': {...}}
---------------------------------------------------------------------------
InvalidArgumentException Traceback (most recent call last)
<ipython-input-5-d3d91b85e827> in <module>
----> 1 for page in paginator:
2 print(page)
3
~/projects/mars-ntr-load-test/venv/lib/python3.7/site-packages/botocore/paginate.py in __iter__(self)
253 self._inject_starting_params(current_kwargs)
254 while True:
--> 255 response = self._make_request(current_kwargs)
256 parsed = self._extract_parsed_response(response)
257 if first_request:
~/projects/mars-ntr-load-test/venv/lib/python3.7/site-packages/botocore/paginate.py in _make_request(self, current_kwargs)
330
331 def _make_request(self, current_kwargs):
--> 332 return self._method(**current_kwargs)
333
334 def _extract_parsed_response(self, response):
~/projects/mars-ntr-load-test/venv/lib/python3.7/site-packages/botocore/client.py in _api_call(self, *args, **kwargs)
274 "%s() only accepts keyword arguments." % py_operation_name)
275 # The "self" in this scope is referring to the BaseClient.
--> 276 return self._make_api_call(operation_name, kwargs)
277
278 _api_call.__name__ = str(py_operation_name)
~/projects/mars-ntr-load-test/venv/lib/python3.7/site-packages/botocore/client.py in _make_api_call(self, operation_name, api_params)
584 error_code = parsed_response.get("Error", {}).get("Code")
585 error_class = self.exceptions.from_code(error_code)
--> 586 raise error_class(parsed_response, operation_name)
587 else:
588 return parsed_response
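Until the paginator handles this, one possible workaround is to loop by hand and drop StreamName once a NextToken is available. This is a minimal sketch, not a botocore feature: `iter_shards` is a hypothetical helper name, and I'm assuming MaxResults may still be sent alongside NextToken.

```python
def iter_shards(kinesis_client, stream_name, max_results=None):
    """Yield every shard of a stream, sending StreamName only on the first call."""
    kwargs = {"StreamName": stream_name}
    while True:
        if max_results is not None:
            # Assumption: MaxResults is accepted together with NextToken.
            kwargs["MaxResults"] = max_results
        response = kinesis_client.list_shards(**kwargs)
        yield from response.get("Shards", [])
        next_token = response.get("NextToken")
        if not next_token:
            break
        # Follow-up requests carry the token instead of the stream name.
        kwargs = {"NextToken": next_token}
```

With a real client this would be used as `for shard in iter_shards(kinesis, "my-stream-name", max_results=2): ...`, where `kinesis` is the client created in the session above.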
The script doesn't identify methods where the token is not named NextToken, as can happen. For example, cognito-idp uses the parameter NextToken for list_users_in_group, but uses PaginationToken for list_users. I'm not sure how many other services use different naming conventions for the name of the token used for pagination.
@iann0036 investigated: https://github.com/iann0036/aws-pagination-rules
I also investigated: https://github.com/aws-cloudformation/cloudformation-cli/pull/663
unmerged PRs: https://github.com/boto/botocore/pull/1470, https://github.com/boto/botocore/pull/1847, https://github.com/boto/botocore/pull/2004, https://github.com/boto/botocore/pull/2018, https://github.com/boto/botocore/pull/2104, https://github.com/boto/botocore/pull/2177
I ran into a missing paginator for a function that uses "nextToken" instead of "NextToken"... it seems like this should have been fixed by now.
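For anyone stuck on an operation without a paginator, a hand-rolled loop with a configurable token name is a reasonable stopgap. The sketch below is generic and hypothetical: `paginate_manually` is not part of boto3 or botocore, and the token and result key names are whatever the specific operation happens to use (here defaulting to the lower-case "nextToken").

```python
def paginate_manually(method, result_key, input_token="nextToken",
                      output_token="nextToken", **kwargs):
    """Yield items from result_key across pages by following the token by hand."""
    params = dict(kwargs)
    while True:
        response = method(**params)
        yield from response.get(result_key, [])
        token = response.get(output_token)
        if not token:
            return
        # Pass the token from this page into the next request.
        params[input_token] = token
```

Usage would look something like `paginate_manually(client.list_things, "things", maxResults=50)`, where `list_things` and `things` stand in for the real operation and its list member.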
I have a script that generates paginators, constrained heavily to only generate those that I can be absolutely certain about. I just merged a PR stemming from that that added a ton. What we really need to do is to have that be part of our release automation. Right now we're relying pretty much entirely on getting them from upstream.
@JordonPhillips - would you be willing to share your script so others can use it?
CONTRIBUTING.rst mentions this topic first added here:
We may choose not to accept pull requests that change the JSON service descriptions... We generate these files upstream based on our internal knowledge of the AWS services. If there is something incorrect with or missing from these files, it may be more appropriate to submit an issue so we can get the issue fixed upstream.
I see one-offs being added to the code, but there are also several PRs sitting open (see above). Are one-offs now allowed?
@jamesls - since you updated CONTRIBUTING.rst, where is the upstream location that we can modify to get this fixed across SDKs for all languages?
Service teams are now the owners of their paginator models as those model definitions are shared across AWS SDKs. We are currently tracking paginator requests here in our cross-SDK repository: https://github.com/aws/aws-sdk/issues?q=is%3Aissue+is%3Aopen+label%3Apaginator. If you'd like to see a paginator for a specific service/API, please create an issue in that repository with your use case and request.
You can also consider reaching out through AWS Support for further escalation on these types of requests. But since the paginator additions would need to happen upstream rather than in botocore directly, I'm going to close this issue. Please let us know if you have any questions or feedback regarding this.
I brought this up at the open space at PyCon: there are lots of client methods missing paginator metadata. I went ahead and coded up a simple script to identify all the missing paginators in the botocore JSON SDK metadata.
Which results in the following output