fititnt / wiki_as_base-py

[MVP] Use MediaWiki Wiki page content as read-only database. Python library implementation. See https://github.com/fititnt/openstreetmap-serverless-functions/tree/main/function/wiki-as-base
The Unlicense

Implement pagination for cli interface #2

Open fititnt opened 1 year ago

fititnt commented 1 year ago

The current version only lets users select pages one by one. The next release will also detect the "Category:" prefix and make an implicit call on the user's behalf. The MediaWiki API allows 50 pages per request by default (500 for users with special limits, which we do not support yet but eventually could).

However, since a "Category:" call may sometimes load more than 50 pages, we get an error like:

{
      "error": {
        "code": "toomanyvalues",
        "info": "Too many values supplied for parameter \"pageids\". The limit is 50.",
        "limit": 50,
        "lowlimit": 50,
        "highlimit": 500,
        "docref": "See https://wiki.openstreetmap.org/w/api.php for API usage. Subscribe to the mediawiki-api-announce mailing list at <https://lists.wikimedia.org/postorius/lists/mediawiki-api-announce.lists.wikimedia.org/> for notice of API deprecations and breaking changes."
      }
}

So this issue is about making the CLI, by default, ignore any pages beyond the typical low limit, and then letting the user paginate via additional CLI requests. One limitation is that (unless cache is enabled) the next request would also ask for the categories again and then load the next portion of the pages.
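To make the idea concrete, here is a minimal sketch of page-based slicing over a list of page IDs. The names `LOW_LIMIT`, `paginate`, `total_pages`, and `pageids_param` are hypothetical and not part of wiki_as_base; the only facts assumed from the API are the limit of 50 shown in the error above and that multiple `pageids` values are joined with `|`.

```python
from typing import List, Sequence

# Assumption: 50 is the low limit reported by the "toomanyvalues" error.
LOW_LIMIT = 50


def paginate(pageids: Sequence[int], page: int = 1,
             per_page: int = LOW_LIMIT) -> List[int]:
    """Return the slice of pageids that belongs to a 1-based page number."""
    if page < 1:
        raise ValueError("page must be >= 1")
    start = (page - 1) * per_page
    return list(pageids[start:start + per_page])


def total_pages(n_items: int, per_page: int = LOW_LIMIT) -> int:
    """Number of pages needed to cover n_items (ceiling division)."""
    return -(-n_items // per_page)


def pageids_param(batch: Sequence[int]) -> str:
    """Join a batch into the pipe-separated form used for multi-value parameters."""
    return "|".join(str(pid) for pid in batch)


if __name__ == "__main__":
    ids = list(range(1, 121))  # pretend a category resolved to 120 pages
    for page in range(1, total_pages(len(ids)) + 1):
        batch = paginate(ids, page)
        print(page, len(batch), pageids_param(batch)[:20])
```

Each CLI call would then request a single page of this list, which is also why, without a cache, the category has to be resolved again on every call.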

Anyway, the implementation of "#1", even if it has a short limit by default, makes sense, because the limit for pages in one category allows up to 500 even for non-authenticated requests.
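As a rough sketch of the category side, the snippet below builds the query parameters for a `list=categorymembers` request with a short default limit that is clamped to 500. The function name `category_members_params` and the default of 50 are assumptions for illustration, not how wiki_as_base actually does it; `cmtitle`, `cmlimit`, and `cmcontinue` are the standard MediaWiki parameter names.

```python
from typing import Dict, Optional

# Assumption: 500 is the ceiling for category members, as stated above.
CATEGORY_HIGH_LIMIT = 500


def category_members_params(title: str, limit: int = 50,
                            cmcontinue: Optional[str] = None) -> Dict[str, str]:
    """Build query parameters for listing the members of one category."""
    if not title.startswith("Category:"):
        raise ValueError("title must start with 'Category:'")
    # Clamp the requested limit into the range 1..500.
    clamped = max(1, min(limit, CATEGORY_HIGH_LIMIT))
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": title,
        "cmlimit": str(clamped),
        "format": "json",
    }
    if cmcontinue:
        # Continuation token returned by a previous response, if any.
        params["cmcontinue"] = cmcontinue
    return params
```

The resulting dictionary can be passed as the query string of a GET request to the wiki's `api.php` endpoint.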

fititnt commented 1 year ago

The pagination, at least for the first level of pages in a category call, already works.