Thanks Dieter! I'll add this to form_schema_parse (used up to odkc v7) and submit a feature request for the new form_schema (direct JSON from odkc) to include these.
Better not yet. I will try more forms and update you. This was a first try.
Part of me thinks that it could be useful to add something like this to the Central API: there may be enough use cases where a user needs these labels that the best approach would be for Central to provide that information. Feel free to create a topic in the Features category of the ODK forum if something along those lines would be helpful!
@matthew-white it would be awesome to have labels and hints included in https://odkcentral.docs.apiary.io/#reference/forms-and-submissions/'-individual-form/getting-form-schema-fields, e.g. as a nested list:
{
"name": "age",
"path": "/age",
"type": "int",
"label": {"en": "Age", "de": "Alter"},
"hint": {"en": "...", "de": "..."}
},
Crosslink: Feature request on the ODK Forum
I submitted an issue at Central (https://github.com/getodk/central/issues/172)
@dmenne
This is a good solution:
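library(ruODK)
library(xml2)
library(tidyverse)

ru_setup(svc = "https:xxxxx", un = "yy", pw = "xx", tz = Sys.timezone())

# read the form XML and collect all translated texts
f_xml <- as_xml_document(form_xml())
ff_xml <- tibble(
  # drop the leading "/data" from the id, e.g. "/data/age:label" becomes "/age:label"
  path = str_sub(xml_text(xml_find_all(f_xml, "//translation/text/@id")), 6),
  label = xml_text(xml_find_all(f_xml, "//translation/text"))
) %>%
  separate(path, sep = ":", into = c("path", "type")) %>%
  pivot_wider(names_from = type, values_from = label)

# attach the labels and hints to the form schema
fs_extended <- form_schema(flatten = FALSE) %>% left_join(ff_xml, by = "path")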
Please be aware, however, that this works only when there are multiple translations. If the form has a single language, labels are stored differently (there is no translation element in the XML tree).
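To illustrate the difference described above, here is a minimal sketch using xml2 on two hand-written form fragments. The fragments are hypothetical and heavily simplified (namespaces are omitted, and the `age` question is made up), so real forms will look more involved:

```r
library(xml2)

# Hypothetical single-language fragment: the label text sits inside <label>.
single_lang <- read_xml('
<html>
  <body>
    <input ref="/data/age"><label>Age</label></input>
  </body>
</html>')

# Hypothetical multi-language fragment: <label> only references the itext id,
# and the actual texts live under <translation> elements.
multi_lang <- read_xml('
<html>
  <head>
    <itext>
      <translation lang="en">
        <text id="/data/age:label"><value>Age</value></text>
      </translation>
      <translation lang="de">
        <text id="/data/age:label"><value>Alter</value></text>
      </translation>
    </itext>
  </head>
  <body>
    <input ref="/data/age"><label ref="jr:itext(\'/data/age:label\')"/></input>
  </body>
</html>')

# TRUE if the document contains at least one <translation> element
has_translations <- function(doc) {
  length(xml_find_all(doc, "//translation")) > 0
}

has_translations(single_lang)
has_translations(multi_lang)

# single language: read the label text directly
xml_text(xml_find_all(single_lang, "//label"))

# multiple languages: read the languages and the translated values
xml_attr(xml_find_all(multi_lang, "//translation"), "lang")
xml_text(xml_find_all(multi_lang, "//translation/text/value"))
```

A check like `has_translations()` could be used to decide which of the two extraction paths to take for a given form.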
I'll submit a suggestion soon.
I prepared this function:
library(ruODK)
library(xml2)
# the function below uses the exact function signature as form_schema()
# in that sense, you could replace any call to form_schema by form_schema_ext
# it gets in addition to the form_schema columns, the common label, and the multilanguage labels if available
# it gets also the choice list and labels, in multilanguage if existing
form_schema_ext <- function (flatten = FALSE, odata = FALSE, parse = TRUE, pid = get_default_pid(),
fid = get_default_fid(), url = get_default_url(), un = get_default_un(),
pw = get_default_pw(), odkc_version = get_default_odkc_version(),
retries = get_retries(), verbose = get_ru_verbose())
{
# gets basic schema
frm_schema <- form_schema(flatten, odata, parse, pid,
fid, url, un,
pw, odkc_version,
retries, verbose)
# gets xml representation
frm_xml <- as_xml_document(form_xml (parse, pid, fid,
url, un, pw ,
retries))
### parse translations:
all_translations <- xml_find_all(frm_xml, "//text")
# initialize dataframe
extension <- data.frame(path = character(0), label = character(0),
stringsAsFactors = FALSE)
### PART 1: parse labels:
raw_labels <- xml_find_all(frm_xml, "//label")
# iterate through labels (seq_along is safe when no labels are found)
for (i in seq_along(raw_labels)){
## path
# gets ref from parent, without leading "/data"
this_path <- sub("^/data", "",
xml_attr(xml_parent(raw_labels[i]), "ref"))
# ensure this is a valid path
if (!is.na(this_path)) {
# adds new empty row:
extension[nrow(extension)+1, ]<-rep(NA, ncol(extension))
# adds path
extension[nrow(extension), 'path'] <- this_path
## reads label
this_rawlabel <-raw_labels[i]
# first checks if it is multi-language label
multi_lang <- xml_has_attr(this_rawlabel, "ref")
if (multi_lang) {
# if multi-language, finds all translations related to this path:
id <- paste0("/data", this_path, ":label")
translations <- all_translations[xml_attr(all_translations, "id") == id]
# iterate through translations
for (j in seq_along(translations)) {
# first check that this is a regular text label. Questions in ODK can have video, image and audio "labels",
# which will be skipped. This is identified by the presence of the 'form' attribute:
is_regular_label <- !xml_has_attr(xml_find_first(translations[j],"./value"), "form")
if (is_regular_label) {
# reads the parent node to identify language:
translation_parent<- xml_parent(translations[j])
this_lang <- gsub(" ", "_", tolower(xml_attr(translation_parent, "lang")))
# decide if 'default' language or specific language
if (this_lang == "default") {
# if 'default' language, save under column 'label':
extension[nrow(extension), 'label'] <- xml_text(xml_find_first(translations[j],"./value"))
}
else {
# check if language already exists in the dataframe
if (!(paste0("label_",this_lang) %in% colnames(extension))){
# if not, create new column
extension <- cbind(extension, data.frame(new_lang = rep(NA, nrow(extension))))
colnames(extension)[ncol(extension)] <- paste0("label_",this_lang)
}
# adds the first value content of the translation
extension[nrow(extension), paste0("label_",this_lang)] <- xml_text(xml_find_first(translations[j],"./value"))
}
}
}
}
else {
# extract content
extension[nrow(extension), 'label'] <- xml_text(this_rawlabel)
}
### PART 1.1: parse choice labels
## checks existence of choice list:
choice_items<-xml_find_all(xml_parent(this_rawlabel), "./item")
if (length(choice_items)>0) {
# check if 'choices' column already exist
if (!('choices' %in% colnames(extension))){
# if not, create new column
extension <- cbind(extension, data.frame(choices = rep(NA, nrow(extension))))
}
# initialize lists
choice_values <- list()
choice_labels <- list()
# iterate through choice list:
for (jj in seq_along(choice_items)) {
#value
this_choicevalue<-xml_text(xml_find_first(choice_items[jj], "./value"))
choice_values[jj]<-this_choicevalue
# raw label
this_rawchoicelabel <- xml_find_first(choice_items[jj], "./label")
# first checks if it is multi-language choice label
multi_lang_choice <- xml_has_attr(this_rawchoicelabel, "ref")
if (multi_lang_choice) {
id_choice <- paste0("/data", this_path,"/",this_choicevalue, ":label")
choice_translations <- all_translations[xml_attr(all_translations, "id") == id_choice]
# iterate through choice translations
for (kk in seq_along(choice_translations)) {
# first check that this is a regular text label. Questions in ODK can have video, image and audio "labels",
# which will be skipped. This is identified by the presence of the 'form' attribute:
is_regular_choicelabel <- !xml_has_attr(xml_find_first(choice_translations[kk],"./value"), "form")
if (is_regular_choicelabel) {
# reads the parent node to identify language:
choice_translation_parent<- xml_parent(choice_translations[kk])
this_choicelang <- gsub(" ", "_", tolower(xml_attr(choice_translation_parent, "lang")))
# decide if 'default' language or specific language
if (this_choicelang == "default") {
# if 'default' language, save under 'choice':
choice_labels[['base']][jj] <- xml_text(xml_find_first(choice_translations[kk],"./value"))
}
else {
# check if language already exists in the dataframe
if (!(paste0("choices_",this_choicelang) %in% colnames(extension))){
# if not, create new column
extension <- cbind(extension, data.frame(new_choicelang = rep(NA, nrow(extension))))
colnames(extension)[ncol(extension)] <- paste0("choices_",this_choicelang)
}
# adds the first value content of the translation
choice_labels[[paste0("choices_",this_choicelang)]][jj] <- xml_text(xml_find_first(choice_translations[kk],"./value"))
}
}
}
}
else {
choice_labels[['base']][jj]<- xml_text(this_rawchoicelabel)
}
}
# add to the extended table:
for (this_choicelang in names(choice_labels)) {
these_choicelabels <- choice_labels[[this_choicelang]]
if (this_choicelang == "base"){
this_choicelang_colname <- "choices"
}
else {
this_choicelang_colname <-this_choicelang
}
extension[nrow(extension), this_choicelang_colname] <- list(list(list(values = unlist(choice_values),
labels = unlist(these_choicelabels))))
}
}
}
}
# join:
fs_ext <- dplyr::left_join(frm_schema, extension, by = "path")
##
return(fs_ext)
}
Building on the snippet from @dmenne, this function also extracts choice lists and handles multiple languages.
Here is an example output from a form with multi-language labels and a single-language choice list:
path | name | type | ruodk_name | label | label_english_(en) | label_french_(fr) | choices |
---|---|---|---|---|---|---|---|
/some_text | some_text | string | some_text | NA | This is a basic fill in the blank question. | (FRENCH) This is a basic fill in the blank question. | NA |
/text_image_audio_video_test | text_image_audio_video_test | string | text_image_audio_video_test | NA | This question shows how to use translations and media types. | This question shows how to use translations and media types. | NA |
/a_integer | a_integer | int | a_integer | NA | Enter a integer: | Enter a integer: | NA |
/a_decimal | a_decimal | decimal | a_decimal | NA | Enter a decimal: | Enter a decimal: | NA |
/calculate | calculate | string | calculate | NA | NA | NA | NULL |
/calculate_test_output | calculate_test_output | string | calculate_test_output | NA | The sum of the integer and decimal: | The sum of the integer and decimal: | NA |
/test_yn | test_yn | string | test_yn | NA | What do you think? | Ça va? | list(values = c(0, 1, 99), labels = c("Yes", "No", "Maybe")) |
/meta | meta | structure | meta | NA | NA | NA | NULL |
/meta/instanceID | instanceID | string | meta_instance_id | NA | NA | NA | NULL |
I haven't stress tested it, but I'll try to turn it into a pull request once I have the time.
Feel free to test and comment.
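For anyone wanting to use the `choices` column, here is a rough sketch of how a value/label pair could be turned into a lookup for decoding submitted values. The `schema` data frame below is hand-built to mimic the table above, and `choice_lookup()` is a hypothetical helper, not part of ruODK; the exact nesting of the list column returned by the function may differ and should be checked first.

```r
# Hand-built stand-in for the extended schema (assumed structure):
# each cell of `choices` is either NULL or a list with `values` and `labels`.
schema <- data.frame(
  path = c("/some_text", "/test_yn"),
  stringsAsFactors = FALSE
)
schema$choices <- list(
  NULL,
  list(values = c("0", "1", "99"), labels = c("Yes", "No", "Maybe"))
)

# Hypothetical helper: named character vector mapping choice value -> label,
# or NULL if the question at `path` has no choice list.
choice_lookup <- function(schema, path) {
  cell <- schema$choices[[which(schema$path == path)]]
  if (is.null(cell)) {
    return(NULL)
  }
  setNames(cell$labels, cell$values)
}

lookup <- choice_lookup(schema, "/test_yn")

# decode some submitted values into their labels
submitted <- c(0, 99, 1)
unname(lookup[as.character(submitted)])
```

Keeping values and labels together in one cell means a single join on `path` is enough to decode a select question.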
Nice work! This would warrant a new test form. Unit tests could run against that form, and also against the current forms without translations.
Great! I had already noticed that the function sometimes failed, but did not have the time to find out why. You saved my day!
A snippet to get labels and hints into the form_schema. It must be improved for forms where translations are present. The "guidance" field may need a second look, and an XPath expert could probably simplify the paths.
Feel free to use it or not.
Here is an example from the XML file: