Open 0xdevalias opened 2 days ago
If anyone is interested, I ended up implementing a prototype of this as a wrapper script:
Example usage:
⇒ paginate-fetch --client='restish' --count-param='rows' --array-key='.txArray' --total-key='.txNums' 'https://api.example.com/api/v1/txs?page=1&rows=100' > out.json
Pagination ended: Reached total count (108).
⇒ jq 'length' out.json
108
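For anyone curious what a wrapper like this might do internally, here is a minimal Python sketch of the loop. It is not the actual `paginate-fetch` script: `paginate`, `fetch_page`, `fake_fetch`, and `max_pages` are hypothetical names, the keys are plain dictionary keys rather than jq-style paths, and the "API" is an in-memory fake that serves 108 records at 100 per page.

```python
from typing import Any, Callable, Dict, List, Tuple


def paginate(
    fetch_page: Callable[[int], Dict[str, Any]],
    array_key: str = "records",
    total_key: str = "totalCount",
    start_page: int = 1,
    max_pages: int = 1000,  # assumed safety limit so the loop always terminates
) -> Tuple[List[Any], str]:
    """Collect records page by page; return (records, reason_for_stopping)."""
    records: List[Any] = []
    for page in range(start_page, start_page + max_pages):
        body = fetch_page(page)
        items = body.get(array_key) or []
        if not items:
            # Stop strategy for APIs without a total: an empty container array.
            return records, "empty page"
        records.extend(items)
        total = body.get(total_key)
        if total is not None and len(records) >= total:
            # Stop strategy for APIs that report a total count.
            return records, f"reached total count ({total})"
    return records, "hit max_pages safety limit"


# Hypothetical stand-in for the real HTTP call: 108 records, 100 per page.
def fake_fetch(page: int, rows: int = 100) -> Dict[str, Any]:
    data = list(range(108))
    start = (page - 1) * rows
    return {"txNums": 108, "txArray": data[start:start + rows]}


records, reason = paginate(fake_fetch, array_key="txArray", total_key="txNums")
print(len(records), reason)
```

Passing the fetch step in as a function keeps the loop independent of the HTTP client, which is roughly what a `--client` style option would need.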
It would be cool if `restish` supported automatic pagination strategies for URL params such as `page` + `count` (or `rows`, etc) alongside its existing hypermedia pagination features:
I've been looking for a good tool that can handle this, similar to how the GitHub CLI implements pagination for its `gh api` command, but more generically applicable to any API:
I'm not sure what the most common form of these params is, but I think figuring that out and using it as the default would probably be ideal, and then allowing each 'type' of parameter to be optionally configured for flexibility.
eg. Let's say the defaults are `page` and `count`, and we call `restish` with a URL like this, which will return page 1, with up to 100 transactions on it:
The URL doesn't include any params that match the known pagination param name defaults, so no pagination would occur.
However, if we provided overrides to tell it what the param names are:
It could then automatically handle pagination; though since there is nothing that tells it when it should stop, some kind of default 'stop case' should be provided.
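As a rough illustration of the param-name matching described above, the following Python sketch checks whether a URL carries the configured pagination params and, if so, builds the URL for the next page. The function names are hypothetical, the URL is borrowed from the wrapper example, and requiring both params to be present before paginating is an assumption rather than anything `restish` does today.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def has_pagination_params(url, page_param="page", count_param="count"):
    """True only if the query string contains both configured param names."""
    names = {k for k, _ in parse_qsl(urlsplit(url).query, keep_blank_values=True)}
    return page_param in names and count_param in names


def next_page_url(url, page_param="page"):
    """Return the same URL with the page param incremented by one."""
    parts = urlsplit(url)
    query = [
        (k, str(int(v) + 1)) if k == page_param else (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
    ]
    return urlunsplit(parts._replace(query=urlencode(query)))


url = "https://api.example.com/api/v1/txs?page=1&rows=100"
print(has_pagination_params(url))                      # defaults: no `count` param found
print(has_pagination_params(url, count_param="rows"))  # override matches `rows`
print(next_page_url(url))
```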
We could extend this further to include a default param in the JSON response body for the `totalCount` of records returned, and another for the key of the 'container' array that holds them (eg. `records`). All of these would be able to be overridden as well.
Given those defaults, and this example API providing `txNums`, `txArray`, and no total, we could call it something like this:
It would know to look at `txNums` and `txArray` due to the overrides, and it would know to stop trying to fetch more once we had fetched the 'total count' of records.
There could also be another 'stop strategy' for APIs that don't include the total count, which could be as simple as stopping if the 'container array' (eg. `records`) was empty.
It could even be simplified further, so that if the `--paginate-param-*` args are defined, there would be no need to include them in the URL itself:
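One way that last idea could work, sketched in Python rather than as a real `restish` invocation: when the param names are supplied separately, the tool adds them to the URL itself and leaves any values already present untouched. `with_pagination_params` is a hypothetical helper, and the starting page of 1 and page size of 100 are assumed defaults.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_pagination_params(url, page_param, count_param, page=1, count=100):
    """Add the page/count params to the URL unless it already carries them."""
    parts = urlsplit(url)
    # Note: dict() keeps only the last value if a query key is repeated.
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    query.setdefault(page_param, str(page))
    query.setdefault(count_param, str(count))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(with_pagination_params("https://api.example.com/api/v1/txs", "page", "rows"))
```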