boto / botocore

The low-level, core functionality of boto3 and the AWS CLI.
Apache License 2.0

Publicize paginator's result_key_iters() #1535

Open copumpkin opened 6 years ago

copumpkin commented 6 years ago

For ages now, I've been doing the "ask botocore to paginate for me, iterate over its pages, then iterate over the items in the result" canonical code. But after implementing some paginators and realizing that the paginator config includes the result_key key, I started wondering why botocore didn't abstract the pages away from us completely. Poking around the source code, I found the undocumented but perfect result_key_iters() method, which allowed me to flatten all my page iteration and avoid looking up the (hugely inconsistent) result keys by hand.

Is there a reason this isn't part of the public documented API? I imagine part of it is because there can occasionally be multiple result keys and the ensuing API design might be less obvious, but I wouldn't let the perfect be the enemy of the good. It's already far more pleasant than the repetitive stuff we all write daily for pagination.
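The pattern described above, looping over pages and then over each page's items, can be flattened with a small generator. This is a minimal sketch, not botocore's implementation: `iter_items` is a hypothetical helper, the pages are made-up stand-ins shaped like `list_objects` responses, and it assumes the result key is a plain top-level field rather than a general JMESPath expression.

```python
from typing import Any, Dict, Iterable, Iterator


def iter_items(pages: Iterable[Dict[str, Any]], result_key: str) -> Iterator[Any]:
    """Yield every item stored under result_key, page by page, in order."""
    for page in pages:
        # A page may omit the key entirely (for example, an empty listing).
        for item in page.get(result_key, []):
            yield item


# Made-up pages standing in for paginated responses (assumption, not real data).
fake_pages = [
    {"Contents": [{"Key": "a"}, {"Key": "b"}], "IsTruncated": True},
    {"Contents": [{"Key": "c"}], "IsTruncated": False},
    {"IsTruncated": False},
]

keys = [obj["Key"] for obj in iter_items(fake_pages, "Contents")]
```

With a real client, `pages` would be something like `client.get_paginator('list_objects').paginate(Bucket='my-bucket')`; the caller still has to know the result key name, which is the inconvenience the issue is about.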

joguSD commented 6 years ago

This isn't public currently because it's specific to the botocore implementation of pagination, and it differs from how auto-pagination works in other SDKs (Ruby, Go, etc.), which only offer page iteration. Botocore is a little different here because it powers the AWS CLI, which has its own special requirements for pagination (result key aggregation).

I also think the interface for this function is a little strange. Ideally, we would support item-level iteration with an interface something like this:

# Create a reusable Paginator
paginator = client.get_paginator('list_objects')

for item in paginator.paginate(Bucket='my-bucket').items_iterator():
    # gives you each item in the default result_key (`Contents`).
    print(item)

And if there were multiple paginated result keys, items_iterator could take an optional result_key param that defaults to the first result key.

Either way, marking this as a feature request for item level pagination.
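To make the multiple-result-key case concrete, here is a rough sketch of how an optional `result_key` argument that defaults to the first key might behave. Everything in it is an assumption for illustration: `ItemPages` is a hypothetical stand-in class rather than a botocore type, the pages are made-up data, and keys are treated as plain field names instead of JMESPath expressions.

```python
from typing import Any, Dict, Iterable, Iterator, List, Optional


class ItemPages:
    """Hypothetical stand-in for a page iterator that knows its result keys."""

    def __init__(self, pages: Iterable[Dict[str, Any]], result_keys: List[str]) -> None:
        self._pages = list(pages)
        self._result_keys = list(result_keys)

    def items_iterator(self, result_key: Optional[str] = None) -> Iterator[Any]:
        # Default to the first result key, as proposed in the comment above.
        key = self._result_keys[0] if result_key is None else result_key
        # Validate eagerly so a bad key fails at call time, not on first next().
        if key not in self._result_keys:
            raise KeyError(f"unknown result key: {key!r}")
        return self._iter_key(key)

    def _iter_key(self, key: str) -> Iterator[Any]:
        for page in self._pages:
            # Pages that lack the key contribute no items.
            yield from page.get(key, [])


# Made-up pages with two result keys (assumption, not real responses).
listing = ItemPages(
    pages=[
        {"Contents": [{"Key": "logs/a"}], "CommonPrefixes": [{"Prefix": "logs/"}]},
        {"Contents": [{"Key": "logs/b"}]},
    ],
    result_keys=["Contents", "CommonPrefixes"],
)

default_items = list(listing.items_iterator())
prefix_items = list(listing.items_iterator("CommonPrefixes"))
```

Looking keys up by name rather than by position keeps callers independent of the order in which the paginator config happens to list them.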

copumpkin commented 6 years ago

Thanks for clarifying! Yes, I basically just want to not have to think about the pages when I know botocore has enough information to avoid it 😄

copumpkin commented 6 years ago

@joguSD also, your proposed interface seems fine. Basically, it seems like this would be pretty good:

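def items_iterator(self, result_key=None):
    # Sketch of a method on the page iterator; not part of botocore today.
    iterators = self.result_key_iters()
    if result_key is None:
        return iterators[0]
    # Look up the iterator by key name, since the index is meaningless to
    # anyone who doesn't have the paginator JSON in front of them.
    # Assumes each entry in result_keys is a compiled JMESPath expression.
    names = [key.expression for key in self.result_keys]
    return iterators[names.index(result_key)]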

Does it seem like a PR adding that would be useful? I don't want to do the work if you think it'll just sit in limbo for years, but it also seems pretty nice 😄

maxisentia commented 1 year ago

Hi. Any news on that?

Or can I get the result_key from a Paginator object, please?

EDIT: Oh, I found the way:

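import boto3

# Assumes a default region and credentials are already configured.
session = boto3.Session()
ec2 = session.client('ec2')
items_paginator = ec2.get_paginator('describe_network_interfaces')
items_key = items_paginator.result_keys[0].parsed['value']
print(items_key)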

'NetworkInterfaces'