danielnsilva / semanticscholar

Unofficial Python client library for Semantic Scholar APIs.
MIT License

`PaginatedResults` has no asynchronous iterator #93

Open dmoklaf opened 2 weeks ago

dmoklaf commented 2 weeks ago

`PaginatedResults` has a synchronous iterator, but no asynchronous one:

https://github.com/danielnsilva/semanticscholar/blob/1d94dfff5fa433037f009bd6471e4bd5f86dbc91/semanticscholar/PaginatedResults.py#L111

This makes it impossible to loop over query results asynchronously when using the semanticscholar API.

danielnsilva commented 2 weeks ago

The lack of an async iterator is actually intentional. The idea was to keep the pagination process sequential to make sure results come in the right order. A synchronous iterator helps control the flow and ensures no pages are skipped or processed out of order. If we made it async, it could mess with the order of the results.

dmoklaf commented 1 week ago

> The lack of an async iterator is actually intentional. The idea was to keep the pagination process sequential to make sure results come in the right order. A synchronous iterator helps control the flow and ensures no pages are skipped or processed out of order. If we made it async, it could mess with the order of the results.

Making a function async (here, the iterator) doesn't authorize library users to call that same function in parallel from different coroutines. If it did, most async libraries would crash or behave inconsistently (e.g., return items out of order, as you correctly suggested). Async only lets the library user reuse the main thread for other work in between calls, through other coroutines. Nothing more.

Consequently, synchronous and asynchronous APIs can be strictly equivalent in most cases.
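
To illustrate the point, here is a minimal sketch of an async iterator that still fetches pages strictly one after another. `AsyncPages`, `_fetch_next_page`, `collect` and `heartbeat` are hypothetical names made up for this example; they are not part of semanticscholar, and the network call is simulated with `asyncio.sleep(0)`. The only thing it aims to show is that a single `async for` loop sees items in page order while another coroutine gets to run in between.

```python
import asyncio


class AsyncPages:
    """Hypothetical stand-in for a paginated result set (not the library's class)."""

    def __init__(self, pages):
        self._pages = pages      # pretend these live on a remote server
        self._next_page = 0      # index of the next page to request
        self._buffer = []        # items of the current page not yet yielded

    async def _fetch_next_page(self):
        # Simulated I/O: yields control to the event loop, then returns one page.
        await asyncio.sleep(0)
        page = self._pages[self._next_page]
        self._next_page += 1
        return page

    def __aiter__(self):
        return self

    async def __anext__(self):
        # Only request a new page once the current one is exhausted,
        # so pages are fetched sequentially and in order.
        while not self._buffer:
            if self._next_page >= len(self._pages):
                raise StopAsyncIteration
            self._buffer = list(await self._fetch_next_page())
        return self._buffer.pop(0)


async def collect(results):
    items = []
    async for item in results:
        items.append(item)
    return items


async def heartbeat(log, n):
    # Unrelated coroutine that runs while the iterator is waiting on I/O.
    for i in range(n):
        log.append(i)
        await asyncio.sleep(0)


def demo():
    async def main():
        log = []
        pages = AsyncPages([[1, 2], [3], [4, 5]])
        items, _ = await asyncio.gather(collect(pages), heartbeat(log, 3))
        return items, log

    return asyncio.run(main())
```

Calling `demo()` returns the items in their original page order together with the heartbeat log, since only one consumer drives the iterator. Sharing one `AsyncPages` instance between several concurrent consumers is not something this sketch tries to support.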