PyGithub / PyGithub

Typed interactions with the GitHub API v3
https://pygithub.readthedocs.io/
GNU Lesser General Public License v3.0
7.04k stars 1.79k forks source link

PaginatedList should support __len__() #1476

Closed s-t-e-v-e-n-k closed 3 years ago

s-t-e-v-e-n-k commented 4 years ago

PaginatedList has a totalCount attribute that is useful, but it should support len() as a more Pythonic way. Currently this breaks the test suite in interesting ways.

stale[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

sfdye commented 4 years ago

still valid

jesseli2002 commented 4 years ago

Regarding the test suite - would it be reasonable to create one (or more?) repositories under the PyGithub organization, for use in testing? It might not be possible to cover every usage but I imagine having dedicated repositories for testing would make it easier to update tests

huwcbjones commented 4 years ago

@s-t-e-v-e-n-k I've taken a look at this with my initial fix being implemented len (as below), then fix the test fallout.

class PaginatedList:
      def __len__(self) -> int:
        return self.totalCount

The immediate issue I've noticed whilst investigating the test fallout is that where anyone would do something like this

issues = list(repo.get_issues())

With __len__ implemented, this would result in an extra API call because of how list initialisation works. For those that don't know, the gist is: check sequence length (len call), create list, extend list (len call). Fortunately, totalCount caches the length which is good. I was also looking into other ways to reduce the number of API requests.

The first was to see if the PaginatedList could grow, if not, return the length of the elements in PaginatedListBase. However this required changing the access modifiers for PaginatedListBase (not sure what the rational behind that logic is since PaginatedListBase seems to only be used/extended by PaginatedList).

The second was some awful hacking whereby fetching the totalCount would cache the last element so we wouldn't have to re-fetch it if/when we iterated through the items later. However, I've already decided that it isn't worth pursuing since it would end up with an overly complicated, unmaintainable mess.

TL;DR: implementing __len__ would mean an extra API call in most cases.

Any suggestions/advice on how to tackle this issue? Is it worth tackling?

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

usamasaleem1 commented 11 months ago

Still having issue. Please reopen.

EnricoMi commented 11 months ago

What kind of issues precisely?