OpenFn / adaptors

The new home for OpenFn adaptors; re-usable connectors for the most common DPGs and DPI building blocks.
GNU General Public License v3.0

commcare: add auto pagination to `get` #569

Open josephjclark opened 1 month ago

josephjclark commented 1 month ago

At the moment, commcare users have to handle pagination manually.

API design

I think that by default we should autofetch, i.e. pull down all the records.

We should pull down in batches defined by limit, which users can customize but we default to something quite high like 1000.

If an offset is passed (including offset: 0), we do not auto-page and assume that users will take control.

The callback function should be invoked for each page (check this - I think this is what other chunking style operations do?)

I wonder if we should have some kind of maxlimit, set to a million or something, and if there are more records than that limit, we throw an error and say "woah, there's too much data here, you need to do something". But I don't know, I'd rather we had a library, platform and runtime capable of processing all that data. And batching up should reduce the overhead anyway.
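
The rules above could be sketched roughly as follows. This is a minimal, adaptor-independent sketch, not the commcare implementation: `getAll`, `fetchPage`, `maxRecords` and the `{ meta, objects }` response shape are all assumptions made for illustration.

```javascript
// Hypothetical sketch of the proposed defaults; not the adaptor's real code.
// `fetchPage({ limit, offset })` is an assumed function that resolves to
// `{ meta: { total_count, next }, objects: [...] }`.
async function getAll(fetchPage, options = {}, callback = () => {}) {
  const { limit = 1000, offset, maxRecords = 1000000 } = options;

  // An explicit offset (including 0) means the user is paging manually,
  // so fetch exactly one page and stop.
  if (offset !== undefined) {
    const page = await fetchPage({ limit, offset });
    callback(page);
    return page.objects;
  }

  const records = [];
  let nextOffset = 0;
  let hasMore = true;
  while (hasMore) {
    const page = await fetchPage({ limit, offset: nextOffset });

    // Optional guard: refuse to download more than maxRecords.
    if (page.meta.total_count > maxRecords) {
      throw new Error(
        `Too much data: ${page.meta.total_count} records exceeds ${maxRecords}`
      );
    }

    callback(page); // invoked once per page
    records.push(...page.objects);
    nextOffset += limit;
    hasMore = Boolean(page.meta.next);
  }
  return records;
}
```

With this shape, passing `{ offset: 0 }` returns a single page, while passing no offset keeps fetching until `meta.next` is empty.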

How to do it today

I see two strong patterns to paginate commcare (thanks to Professor @mtuchi!)

1) Use edge conditions to run the job again

state.meta.next // if state.meta.next is truthy, there is more data and we should re-run from the next offset

This is a really nice way to use the workflow to loop.
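
To re-run from the next offset, the job needs to turn `state.meta.next` into a number. Here is a minimal sketch, assuming `meta.next` is a query string such as `?limit=1000&offset=1000` and is null on the last page; `nextOffset` is a hypothetical helper, not part of the adaptor.

```javascript
// Hypothetical helper: read the offset for the next run out of meta.next.
// Assumes meta.next looks like "?limit=1000&offset=1000", or is
// null/undefined when there are no more pages.
function nextOffset(meta) {
  if (!meta || !meta.next) return null;
  // Keep only the part after "?" so full paths also work.
  const query = meta.next.slice(meta.next.indexOf('?') + 1);
  const offset = new URLSearchParams(query).get('offset');
  return offset === null ? null : Number(offset);
}
```

The step would pass this value as `offset` on the next run, and the edge condition on `state.meta.next` would stop the loop once it comes back empty.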

2) Get the total, generate batches, and iterate over them

// fetch no data
get('case', { limit: 0, offset: 0 })

// read the total count
fn((state) => {
  const count = state.meta.total_count

  // work out all the offsets we need to fetch one page at a time
  const offsets = [0]
  for (let i = 1000; i < count; i += 1000) {
    offsets.push(i)
  }
  state.data.offsets = offsets
  return state
})

// fetch one page per offset (inside each, $.data is the current offset)
each($.data.offsets, get('case', { offset: $.data, limit: 1000 }))
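
As a quick check of the batching arithmetic, here is the offset calculation as a standalone function. `buildOffsets` and its `pageSize` parameter are hypothetical names used only for this sketch.

```javascript
// Hypothetical helper: list the offset of every page needed to cover
// `total` records when fetching `pageSize` records at a time.
function buildOffsets(total, pageSize = 1000) {
  const offsets = [];
  for (let i = 0; i < total; i += pageSize) {
    offsets.push(i);
  }
  return offsets;
}

console.log(buildOffsets(2500)); // [ 0, 1000, 2000 ]
```

Unlike the job code above, this version returns an empty list when there are no records, so no request is made at all.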