IDEMSInternational / R-Instat

A statistics software package powered by R
http://r-instat.org/
GNU General Public License v3.0

ODK R Functions #8564

Open lilyclements opened 1 year ago

lilyclements commented 1 year ago

In R-Instat, we created a dialog to import from ODK back in 2016/7, but it has various issues. From a brief look, there are now several R packages available for this.

For example:
https://docs.ropensci.org/ruODK/
https://forum.getodk.org/t/simple-r-package-for-pulling-and-exporting-forms-from-an-odk-server/32603
https://github.com/rapidsurveys/odkr

We want to investigate different packages that import ODK data into R.

Try not to spend more than an hour on this. You may find the solution, or you may find some insights - even if that's to cross a few options off! I'm happy to get involved and to discuss too if you want; for example, @rdstern said that ideally a package for R-Instat would be on CRAN. We can discuss bits like this if you are unsure what CRAN is.

EVANSTATS commented 1 year ago

Hello @lilyclements and @rdstern. I have looked at different sites for information on sourcing ODK data into R, and at the moment there is no specific package available on CRAN for that. The odkr package used to be on CRAN but has been removed. Based on the sites I checked, the ruODK package is the more popular and the most recently updated on GitHub (about 9 months ago). These two packages could provide a good starting point. It's also worth noting that there are instances in which people have called the server APIs directly to source the data.

EVANSTATS commented 11 months ago

ODK are recommending here using the ruODK package to download submissions.

lilyclements commented 11 months ago

@EVANSTATS this looks very interesting - nice find. How does this work when importing your data into R with it?

rdstern commented 11 months ago

@EVANSTATS and @lilyclements we now have a plan to produce a new version of R-Instat in time for the AIMS Ghana course, which starts on 5 November. Is there a chance we can make some progress on the ODK work by then? In week 3 the students do a small group project, and one option has been for them to conduct a small survey using ODK. It would be great if they could then use R-Instat.

The eventual goal is for the dialog to be improved, but it may be, with this short time scale, that the best we can do is to be able to use the new ideas in RStudio via a script. That would be fine, as I hope that the script could then be used in R-Instat.

If the time scale isn't possible, then perhaps some (more?) progress can be made in November, when Lily will be in Ghana, mainly to work with the students.

rdstern commented 11 months ago

@EVANSTATS you have gone quiet. Our AIMS course starts on Monday. Anything to help from you?

rdstern commented 11 months ago

@EVANSTATS you are still quiet. @lilyclements I know ODK was used for the little survey, but assume you didn't use R-Instat there?

rdstern commented 11 months ago

@EVANSTATS we need a sign of life. Should we be getting another team member to take over the ODK stuff?

rdstern commented 10 months ago

@EVANSTATS I understand from @Fidel365 that you are waiting for an answer from @lilyclements to some question, but I don't see your question. It should be here, on this thread. All I see is Lily's question to you, above, on October 16th. If you have a question or are stuck then ask her again and put it here, on GitHub.

I don't understand your need for silence? Let's keep in touch.

EVANSTATS commented 10 months ago

Sorry for the silence. My email notifications were off and I had not checked this. Thank you @Fidel365 for reaching out. I have a call with @lilyclements tomorrow and we will discuss the next steps on this.

EVANSTATS commented 10 months ago

Hello @rdstern. I did discuss this with @lilyclements and @Fidel365. We have code that can source data into R from Kobo servers. The code takes three inputs - the unique form ID, the Kobo URL, and the server token. I don’t have an ODK account since they charge for it, so I haven’t tried the code on data stored in ODK. I will be checking the KoboLoader package, which was also mentioned by Sam in the email. Do you have thoughts on this beyond sourcing the data?

Below is the code we are currently using.

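    # Fetch all submissions for one form from a Kobo server (API v2).
    # formid: unique form ID; orgurl: Kobo server URL; token: optional API token.
    get_odk_form_data <- function(formid, orgurl, token = NULL) {
      api_link <- paste0(orgurl, "/api/v2/assets/", formid, "/data/?format=json")

      # Add the Authorization header only when a token is supplied
      if (is.null(token)) {
        data_inf <- httr::GET(api_link)
      } else {
        data_inf <- httr::GET(api_link, config = httr::add_headers(Authorization = paste("Token", token)))
      }

      # Stop with the HTTP error (e.g. a rejected token) instead of silently returning NULL
      httr::stop_for_status(data_inf)

      # Parse the JSON body; flatten = TRUE turns nested fields into columns
      content_json <- httr::content(data_inf, as = "text", encoding = "UTF-8")
      list_inf <- jsonlite::fromJSON(content_json, flatten = TRUE)
      list_inf$results
    }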

rdstern commented 10 months ago

@EVANSTATS many thanks for the update. I look forward to you making further progress - I am the wrong person to come to with detailed implementation questions. I hope you will soon solve licensing issues, as IDEMS has an "open by default" policy. So I am surprised you need funds to be provided - and by whom - at this early stage. Does everyone need funds to use ODK? So I look forward to your first pull request and hope it will be partnered by interesting examples. @volloholic is of the view that this is a key improvement we need for the next release.

lilyclements commented 10 months ago

@rdstern what improvements did we want from this ODK function that the current ODK function in R-Instat does not or cannot do? I think knowing the problem with the current function will help us to know if this function is enough or if more is wanted from it. Thank you!

rdstern commented 10 months ago

@lilyclements that's another question I need someone else to answer. @volloholic said the current version isn't very good, and this was an essential improved feature in the forthcoming version. I assumed IDEMS had some ODK users - even in the agroecology work led by Lucie? What a puzzle. I assume it works fine already on the AIMS surveys?

lilyclements commented 10 months ago

As a non-ODK user it's hard to know what we cannot do, particularly without sample code. This code works on the sample codes we've found. We can expand and ask Lucie (or others at IDEMS/INNODEMS) for examples so we can stress test it further.

rdstern commented 10 months ago

@EVANSTATS looking ahead we will want examples and training materials on these surveys. Here is a video from Stats4SD.

Can you also ask Lucie for examples? I understood from David (when training in a workshop where Lucie was also present) that he found the existing code to be severely lacking then.

EVANSTATS commented 10 months ago

I talked to Lucie, and we realized that R-Instat is not able to correctly import forms with repeat groups. She also shared three such forms which I have used for further checking. The R code above also doesn't handle repeat groups properly.

When importing a file with multiple sheets, the Import Dataset dialog prompts you to choose the sheets for import. I think we can apply a similar approach when importing a form with repeat groups from ODK.
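A minimal sketch of that idea in base R, under stated assumptions. It assumes the submissions have already been parsed (for example with jsonlite::fromJSON) so that each repeat group arrives as a list-column holding one small data frame per submission, and that `_id` identifies a submission. The names `split_repeat_groups()` and `parent_id` are hypothetical and not part of R-Instat or any package.

```r
# Hypothetical helper: split parsed submissions into a main table plus one
# child table per repeat group, so each can be offered like a "sheet".
split_repeat_groups <- function(submissions, id_col = "_id") {
  # Assumption: repeat groups are the list-columns of the data frame
  is_repeat <- vapply(submissions, is.list, logical(1))
  tables <- list(main = submissions[, !is_repeat, drop = FALSE])

  for (group in names(submissions)[is_repeat]) {
    # Tag every child row with the id of the submission it belongs to
    children <- Map(function(parent_id, child) {
      if (is.null(child) || NROW(child) == 0) return(NULL)
      data.frame(parent_id = parent_id, child,
                 check.names = FALSE, stringsAsFactors = FALSE)
    }, submissions[[id_col]], submissions[[group]])
    children <- Filter(Negate(is.null), unname(children))
    if (length(children) > 0) {
      tables[[group]] <- do.call(rbind, children)
    }
  }
  tables
}

# Made-up example data: two submissions with a "members" repeat group
submissions <- data.frame(`_id` = c(1, 2), village = c("A", "B"),
                          check.names = FALSE, stringsAsFactors = FALSE)
submissions$members <- list(
  data.frame(name = c("Ama", "Kofi"), age = c(30, 5), stringsAsFactors = FALSE),
  data.frame(name = "Esi", age = 41, stringsAsFactors = FALSE)
)

tables <- split_repeat_groups(submissions)
names(tables)   # the table names a dialog could list for the user to choose from
tables$members  # child rows, linked back to the main table by parent_id
```

The names of the returned list play the role of sheet names, so a dialog could show them in the same way the Import Dataset dialog lists the sheets of a workbook.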

lilyclements commented 10 months ago

@EVANSTATS I like this idea a lot! Really nice suggestion.

You also pointed out elsewhere that the error message shown when the password is incorrect is not very useful at all, which gives us two points of improvement so far.
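One possible way to make that error more useful, sketched in base R. The helper `describe_import_error()` is hypothetical, and the mapping assumes the server follows the usual HTTP conventions (401 for failed authentication, 403 for missing permission, 404 for an unknown form); the actual codes returned by a given ODK or Kobo server would need checking.

```r
# Hypothetical helper: turn an HTTP status code into a message a user can act on.
# Returns NULL when the request succeeded.
describe_import_error <- function(status_code) {
  if (status_code < 400) return(NULL)
  switch(as.character(status_code),
    "401" = "Authentication failed: check the username, password or API token.",
    "403" = "Access denied: this account does not have permission to view the form.",
    "404" = "Form not found: check the server URL and the form ID.",
    # Fallback for any other error status
    paste0("The server returned HTTP status ", status_code, ".")
  )
}

describe_import_error(401)
describe_import_error(200)
```

With httr, the status could be taken from a response using httr::status_code(response) and passed to this helper before any attempt to parse the body.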

What would be useful now is to know whether importing a form with multiple sheets works with the R function you've been working on. Looks like great progress!

rdstern commented 7 months ago

@EVANSTATS you have been quiet again? Could you get to your first pull request soon?

EVANSTATS commented 7 months ago

Hello @rdstern. We have lately had discussions on exploring and understanding the Formshare platform. It has features that set it apart from ODK, such as storing data in relational tables, supporting longitudinal studies, and general data management capabilities such as flagging double entries. We had a call with Carlos from Formshare, who walked us through how it works. I even got to test an xlsx form from a user's perspective. Ian Stride also shared his technical perspective on the platform, and we look forward to another discussion.

rdstern commented 7 months ago

@EVANSTATS many thanks for the update. But you only mention another discussion. Could you also tell @lilyclements and me when and how you are planning to move to a pull request, so we can see that something is changing? I'm hoping for a major set of changes in a new version of R-Instat within 2 months. Do you think we can include some ODK improvements? If you need to work together with more of the development team, then tell us.

I'd also like to know of examples that we could use in documentation. This could start with a very simple example, perhaps, as well as at least one that shows how to cope with more complicated questions, like multiple responses.
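As a possible starting point for such an example, here is a small sketch in base R of handling a multiple-response question. It assumes the answers arrive as one text column with the selected choices separated by spaces, which is how ODK-style forms store select_multiple questions. The helper `expand_multiple_response()` and the `crops` data are made up for illustration.

```r
# Hypothetical helper: expand a space-separated multiple-response column
# into one TRUE/FALSE indicator column per choice.
expand_multiple_response <- function(x, prefix) {
  # Treat missing answers as "nothing selected" (all indicators FALSE)
  answers <- ifelse(is.na(x), "", as.character(x))
  choices_per_row <- strsplit(answers, " ", fixed = TRUE)

  all_choices <- sort(unique(unlist(choices_per_row)))
  all_choices <- all_choices[nzchar(all_choices)]

  indicators <- lapply(all_choices, function(choice) {
    vapply(choices_per_row, function(selected) choice %in% selected, logical(1))
  })
  names(indicators) <- paste(prefix, all_choices, sep = "_")
  data.frame(indicators, check.names = FALSE)
}

# Made-up survey answers to "Which crops do you grow?"
crops <- c("maize beans", "maize", NA, "beans cassava")
crop_indicators <- expand_multiple_response(crops, prefix = "crops")
crop_indicators
colSums(crop_indicators)  # number of respondents selecting each crop
```

A documentation example could pair this with a very simple single-response form, so that readers see both the easy case and one more complicated question type.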

rdstern commented 4 months ago

@EVANSTATS now it's May 2024, and you still have to reply to my message here of 27 February. Are you still at INNODEMS? Are you ever planning to return to this issue? Thanks, Roger