ropensci / ruODK

ruODK: An R Client for the ODK Central API
https://docs.ropensci.org/ruODK/
GNU General Public License v3.0

Export multiple form submissions from the same project at the same time. #136

Closed SpikeLewis closed 2 years ago

SpikeLewis commented 2 years ago

Hi all, I was wondering if there is a shorter way to solve my issue:

I have 1000s of forms within a single project (expecting 50-100 submissions on each one over the next year). My current setup is to export each form individually to analyse the data. Is there a way to bulk download all the forms at once, other than repeating the setup and export code 1000s of times (doable, but clunky)?

I'm using project-level encryption in ODK Central.

It would be great to hear your thoughts on this, and sorry if this is a repeated issue.

Best, Spike

florianm commented 2 years ago

Not at all a problem, although not a software bug. Interesting setup! I've never seen 1000s of forms within a project. This is an interesting use case - what drives the need for so many forms? Who generates and maintains the app user access for these form definitions? Are these forms very similar in nature and could be merged into one form?

Assuming that simplifying forms isn't an option:

You could use ruODK to list all forms within a project (form_list()) and in a loop over each form ID in that list, download the data with odata_submission_get() (if unencrypted) or submission_export() (only option with encrypted forms). You could also use purrr::pmap() to apply odata_submission_get()/submission_export() over the dataframe returned by form_list(). You will still end up with (a list of) 1000s of objects in R corresponding to your 1000s of forms.
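The loop described above could be sketched roughly as follows. This is a minimal sketch, not run against a live server: it assumes ru_setup() has already been called with the server URL, credentials, project ID and the encryption passphrase (pp), and that the table returned by form_list() has a column named fid. The helper name export_all_forms() and the directory odk_exports are made up for illustration.

```r
# Sketch: export every form in one project, one ZIP per form.
# Assumption: ruODK::ru_setup(url = , un = , pw = , pid = , pp = ) was called
# beforehand, so form_list() and submission_export() pick up those defaults.

export_all_forms <- function(forms = ruODK::form_list(),
                             export_fn = ruODK::submission_export,
                             local_dir = "odk_exports") {
  # Create the download directory if it does not exist yet.
  dir.create(local_dir, showWarnings = FALSE, recursive = TRUE)

  # One export call per form ID; the result is a character vector of ZIP
  # paths named by form ID.
  vapply(forms$fid, function(fid) {
    # tryCatch so that one failing form does not abort the remaining exports;
    # a failed form is recorded as NA.
    tryCatch(
      as.character(export_fn(fid = fid, local_dir = local_dir, overwrite = TRUE)),
      error = function(e) {
        message("Export failed for ", fid, ": ", conditionMessage(e))
        NA_character_
      }
    )
  }, character(1))
}

# Usage (needs a configured ODK Central connection):
# zips <- export_all_forms()
# failed <- names(zips)[is.na(zips)]
```

Each ZIP would still need to be unzipped and read into R afterwards. The downloader is passed in as an argument only so the loop can be tried out with a stand-in function before pointing it at the real server.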

For analysis:

All the above could be done in a targets pipeline with dynamic branching. Keen to hear more about your use case.

SpikeLewis commented 2 years ago

Hi Florian, thanks for your quick response! I know the setup is crazy: we are working with 400 participants in four countries, with two default languages in each country (participants respond in whichever they choose) and two types of user (container-based sanitation users and non-users). We are giving participants phones and data in exchange for filling in weekly tasks, which are repeated over the year.

We have a front-end app to filter the ODK forms, as the participants won't be working with enumerators after their initial workshop, so it needs to be simple: they will only see 6 forms that they can fill in in their own time that week, prompted by notifications when a new form is available. The forms are pretty short, 5 to 10 minutes max per weekday. So yes, we have about 7 different forms that are very similar and are repeated 52 times! I'm sure there was a better way, but we're rolling with it now.

I generated the form access myself, with lots of clicking in Central, but I'm slowly getting my head around using the API.

The main thing I need to extract is just the points they have scored (to convert to data/talk time), so the analysis is pretty simple. Getting the data down is the main thing, for the weekly phone top-ups.

We ran a similar project (just one country, one language) using ODK Aggregate before, but thought Central would be smoother for the additional load on this one. Delighted we can use R. We are running project-level encryption, so I'll give the loops a whirl. Thank you! Spike

florianm commented 2 years ago

Hi Spike,

very interesting, thanks for the details! If there's another round of this, the forms could probably be rationalised into those seven foundational forms by adding a few questions and form logic. By the sounds of it you could probably merge the form data (bind_rows or rbind) from all repetitions of each of the seven different forms in a loop or a map/apply. Good luck! I'm closing the issue, but please do let us know here how you went and what worked. I might post the solution to the ODK forum in the Support category.
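For the merge step, here is a minimal sketch with toy data standing in for the real exports. The form ID pattern (<type>_wkNN) and the column names participant and points are assumptions for illustration only, not taken from the actual project. It uses base rbind, which requires that all repetitions of a form share the same columns; dplyr::bind_rows is more forgiving if they do not.

```r
# Hypothetical form IDs of the shape "<form type>_wk<NN>", e.g. "diary_wk01".
# Toy tables stand in for the per-form data read from the exports.
submissions <- list(
  diary_wk01 = data.frame(participant = c("p1", "p2"), points = c(5, 3)),
  diary_wk02 = data.frame(participant = c("p1", "p2"), points = c(4, 6)),
  quiz_wk01  = data.frame(participant = "p1", points = 10)
)

# Tag each table with the form it came from, so the week survives the merge.
tagged <- Map(
  function(df, fid) cbind(form_id = fid, df),
  submissions, names(submissions)
)

# Strip the "_wkNN" suffix to recover the form type, then stack the
# repetitions of each type into one table per type.
form_type <- sub("_wk[0-9]+$", "", names(tagged))
merged <- lapply(
  split(tagged, form_type),
  function(dfs) do.call(rbind, unname(dfs))
)

# Total points per participant across all repetitions of the "diary" form.
totals <- aggregate(points ~ participant, data = merged$diary, FUN = sum)
```

Keeping form_id as a column means the weekly top-up can still be computed per week after merging, by aggregating on both participant and form_id.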