ropensci / EML

Ecological Metadata Language interface for R: synthesis and integration of heterogenous data
https://docs.ropensci.org/EML

get_attributes does not generate factors table if there is only one codeDefinition #261

Open jeanetteclark opened 5 years ago

jeanetteclark commented 5 years ago

This metadata document has two attributes with an enumeratedDomain, and each codeDefinition list contains a single code/definition pair. get_attributes does not handle this case: it applies factors <- lapply(factors$codeDefinition, function(x) { as.data.frame(x, stringsAsFactors = FALSE) }) over the code = ... and definition = ... fields of the single pair, rather than over a list of codeDefinition elements. I think this is related to the auto_unbox="TRUE" option in emld (discussed on Slack).
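To illustrate the suspected mechanism without EML itself, here is a minimal base R sketch. It assumes the unboxed codeDefinition arrives as a bare named list; the second pair in `many` ("BC" / "box corer") and the `to_df` helper are made up for the example and are not from the metadata document or the package.

```r
# Several code/definition pairs: a list of codeDefinition elements.
many <- list(
  list(code = "VV", definition = "van Veen grab"),
  list(code = "BC", definition = "box corer")  # hypothetical second pair
)

# A single pair after unboxing: the codeDefinition element itself,
# no longer wrapped in an outer list (assumed shape).
single <- list(code = "VV", definition = "van Veen grab")

# Same conversion as in the lapply call quoted above.
to_df <- function(x) as.data.frame(x, stringsAsFactors = FALSE)

# lapply walks over the pairs: one row per pair, columns code and definition.
ok <- do.call(rbind, lapply(many, to_df))

# lapply walks over the fields of the lone pair instead: one single-column
# data frame per field, and neither has a code or definition column.
bad <- lapply(single, to_df)

print(ok)
print(bad)
```

If this is what happens inside get_attributes, it would explain why the factors table comes back without code and definition columns.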

Here's a reprex with an ugly solution to get around the problem:

library(EML)

t <- tempfile()
download.file("https://arcticdata.io/metacat/d1/mn/v2/object/doi%3A10.18739%2FA2FS1H", destfile = t)
doc <- read_eml(t)

atts <- get_attributes(doc$dataset$dataTable$attributeList)
atts$factors

# [1] x             attributeName
# <0 rows> (or 0-length row.names)

doc$dataset$dataTable$attributeList$attribute[[9]]$measurementScale$nominal$nonNumericDomain$enumeratedDomain$codeDefinition <- list(list(code = "VV", definition = "van Veen grab"))
doc$dataset$dataTable$attributeList$attribute[[10]]$measurementScale$nominal$nonNumericDomain$enumeratedDomain$codeDefinition <- list(list(code = "0.1", definition = "0.1 m2 van Veen grab"))

atts <- get_attributes(doc$dataset$dataTable$attributeList)
atts$factors

# code           definition attributeName
# 1   VV        van Veen grab     Gear_Code
# 2  0.1 0.1 m2 van Veen grab  Gear_Size_m2
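The same workaround could be written once instead of per attribute. The sketch below is only an illustration: `wrap_single_code_definition` is a hypothetical helper, not part of EML or emld, and it assumes a lone pair shows up as a named list with a `code` element while a proper list of pairs is unnamed.

```r
# Hypothetical helper (not part of EML or emld): wrap a lone code/definition
# pair in a list so downstream code always sees a list of pairs.
wrap_single_code_definition <- function(code_definition) {
  is_single_pair <- !is.null(names(code_definition)) &&
    "code" %in% names(code_definition)
  if (is_single_pair) list(code_definition) else code_definition
}

# A lone pair, as it appears after unboxing (assumed shape).
single <- list(code = "VV", definition = "van Veen grab")

# An already well-formed list of pairs is returned unchanged.
already_listed <- list(list(code = "0.1", definition = "0.1 m2 van Veen grab"))

fixed_single <- wrap_single_code_definition(single)
unchanged <- wrap_single_code_definition(already_listed)

# The per-pair conversion now yields one row with code and definition columns.
factors <- do.call(rbind, lapply(fixed_single, function(x) {
  as.data.frame(x, stringsAsFactors = FALSE)
}))
print(factors)
```

In the reprex, the helper would be applied to each attribute's codeDefinition element before calling get_attributes; whether a fix belongs there or in emld's unboxing is a separate question.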

It's a bit of an edge case (it doesn't make a ton of sense to have only one codeDefinition, IMO), but I thought I would report it anyway.