bedatadriven / activityinfo-R

ActivityInfo R Language Client
https://www.activityinfo.org/support/docs/R/

Duplicate labels cause an error in getRecords() #118

Closed jamiewhths closed 7 months ago

jamiewhths commented 8 months ago

From @Ryo-N7:

When using the getRecords() + collect() workflow to pull data from a form, an error is thrown if the form contains duplicate field labels (example: https://www.activityinfo.org/app#form/c6ipg3tldcxr80y2/table).

library(activityinfo)
library(dplyr)  # assumed here for %>% and collect(); may be redundant if already attached

formid <- "c6ipg3tldcxr80y2"

## this works because it uses the field code first and falls back to the field label
casedf_all <- getRecords(form = formid, style = allColumnStyle()) %>% collect()

## these do NOT work: they use the field label first, and the duplicate labels cause an error
casedf_pret <- getRecords(form = formid, style = prettyColumnStyle()) %>% collect()
casedf_minim <- getRecords(form = formid, style = minimalColumnStyle()) %>% collect()

## this also works even though two fields are given exactly the same label;
## the second `thisfield` is automatically renamed to `thisfield.1`
casedf_qt <- queryTable(formid,
                        "Case Number" = "cwmnsr9ldcxr80y3",
                        "Case worker Name" = "caseowner.name",
                        "thisfield" = "clnld9flt8g5hi12",
                        "thisfield" = "cqzgq1xlt8g5u6e3",
                        "comments" = "cuvg9mwlt8g62954")

Previously, with queryTable(), this wasn't a problem because it automatically renames duplicated field names, as shown in the example above.

This is also only a problem with the prettyColumn and minimalColumn styles, because their defaults name the output columns by field label, whereas the allColumn style uses the field code first and only falls back to the field label when no code exists.

A traceback shows that the error comes from collect(), at the point where the function structures the data back into a tibble/data.frame. Since no default is set for handling duplicate column names (as there is in queryTable()), the function throws an error and returns nothing. This is frustrating because we then have to go into the form and manually edit the field label to get it to work.

The most likely solution is to set .name_repair somewhere, with a default that matches how queryTable() handles duplicates.
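To illustrate the kind of repair meant here, below is a minimal base R sketch. It is not the package's actual code or the eventual fix: the `labels` and `columns` values are made up for illustration, and it only assumes that duplicate labels should get the same `.1`-style suffix that queryTable() produces.

```r
# Hypothetical labels, as they might come back from a form in which
# two fields share the label "thisfield".
labels <- c("Case Number", "thisfield", "thisfield", "comments")

# make.unique() leaves the first occurrence alone and appends
# .1, .2, ... to later repeats.
repaired <- make.unique(labels)
print(repaired)

# Hypothetical column data, kept by position so that the duplicate
# labels cannot collide before they are repaired.
columns <- list(c("A-1", "A-2"), c("x", "y"), c("p", "q"), c("ok", "ok"))
names(columns) <- repaired

# check.names = FALSE keeps labels such as "Case Number" as they are;
# the names are already unique at this point.
df <- as.data.frame(columns, check.names = FALSE, stringsAsFactors = FALSE)
print(names(df))
```

With tibble-based code, a comparable effect would presumably come from passing a `.name_repair` strategy where the tibble is built, but where exactly that belongs inside collect() is for the maintainers to decide.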


nickdickinson commented 7 months ago

Did the demo db in the Introduction vignette change? It seems to have a duplicate column now, "Monthly reports" (also a sub-form). To fix the build checks, I will need to address this issue now in 4.36.

records <- getRecords("ceam1x8kq6ikcujg") |> collect()

https://www.activityinfo.org/support/templates/3w.html

nickdickinson commented 7 months ago

This is addressed in 4.36