At the moment, users of the CommCare adaptor have to handle pagination manually.
API design
I think that by default we should autofetch, i.e., pull down all the records.
We should pull down in batches defined by limit, which users can customize but which defaults to something quite high, like 1000.
If an offset is passed (including offset: 0), we do not auto-page and assume that the user will take control.
The callback function should be invoked for each page (check this: I think this is what other chunking-style operations do?)
I wonder if we should have some kind of maxlimit, set to a million or something, and if there are more records than that limit, we throw an error and say "woah, there's too much data here, you need to do something". But I don't know; I'd rather we had a library, platform, and runtime capable of processing all that data. And batching up should reduce the overhead anyway.
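The proposed behaviour above can be sketched in plain JavaScript. This is a hypothetical simulation, not the adaptor's real API: `makeFetchPage`, `autofetch`, and `MAX_RECORDS` are illustrative names, and `fetchPage` stands in for one CommCare request returning a page plus a `meta.total_count`, roughly mirroring CommCare's response shape.

```javascript
// Hypothetical sketch of the proposed auto-paging behaviour.
// fetchPage simulates one CommCare request: a page of records plus the total.
const makeFetchPage = (records) => (limit, offset) => ({
  objects: records.slice(offset, offset + limit),
  meta: { total_count: records.length },
});

const MAX_RECORDS = 1_000_000; // the suggested "maxlimit" safety cap

function autofetch(fetchPage, { limit = 1000, offset, callback } = {}) {
  // If the user passes an offset (including 0), they take control: one page only.
  if (offset !== undefined) {
    const page = fetchPage(limit, offset);
    if (callback) callback(page.objects);
    return page.objects;
  }
  // Otherwise auto-page: pull batches of `limit` until we have everything.
  const all = [];
  let next = 0;
  while (true) {
    const page = fetchPage(limit, next);
    if (page.meta.total_count > MAX_RECORDS) {
      throw new Error('Too much data here: narrow the query or page manually');
    }
    if (callback) callback(page.objects); // invoked once per page
    all.push(...page.objects);
    next += limit;
    if (next >= page.meta.total_count) break;
  }
  return all;
}
```

Note the design choice: passing any explicit offset, even 0, opts out of auto-paging entirely, so the two modes never mix.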
How to do it today
I see two strong patterns for paginating CommCare requests (thanks to Professor @mtuchi!)
1) Use edge conditions to run the job again
state.meta.next // if state.meta.next is truthy, there is more data and we should re-run from the next offset
This is a really nice way to use the workflow to loop.
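A minimal simulation of this loop, outside OpenFn (the `runJob` function, the record set, and the state shape are illustrative assumptions): each "run" fetches one page, and the workflow engine re-triggers the job while the edge condition `state.meta.next` holds.

```javascript
const records = ['a', 'b', 'c', 'd', 'e'];
const LIMIT = 2;

// One "job run": fetch a page at state.offset and record whether more remains.
function runJob(state) {
  const page = records.slice(state.offset, state.offset + LIMIT);
  const nextOffset = state.offset + LIMIT;
  return {
    offset: nextOffset,
    data: [...state.data, ...page],
    // meta.next is truthy while another page exists, mimicking CommCare's meta.
    meta: { next: nextOffset < records.length ? nextOffset : null },
  };
}

// The workflow engine's re-run, driven by the edge condition on meta.next.
let state = { offset: 0, data: [], meta: {} };
do {
  state = runJob(state);
} while (state.meta.next);
```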
2) Get the total, generate batches, and iterate over them
// fetch no data, just the metadata
get('cases', { limit: 0, offset: 0 });

// read the total count
fn((state) => {
  const count = state.meta.total_count;
  // work out all the offsets we need to fetch one page at a time
  const offsets = [0];
  for (let i = 1000; i < count; i += 1000) {
    offsets.push(i);
  }
  state.data.offsets = offsets;
  return state;
});

// fetch one page of cases at each offset
each($.data.offsets, get('cases', { limit: 1000, offset: $.data }));