ropensci / EML

Ecological Metadata Language interface for R: synthesis and integration of heterogeneous data
https://docs.ropensci.org/EML

consider including an `eml_get_simple` function #303

Open jeanetteclark opened 4 years ago

jeanetteclark commented 4 years ago

eml_get is really helpful, but I often find myself needing to strip the return value down to a simple vector that is easier to work with. For example, say I need to remove two entities from a list of otherEntities:

file_names <- eml_get(doc$dataset$otherEntity, "entityName")
remove <- c("file_1.csv", "file_2.csv")

i <- which(!(file_names %in% remove))

I can provide a full MRE if needed, but because the output of eml_get still has the EML document context associated with it, the above returns unexpected results.
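
To make the failure mode concrete without a full MRE, here is a minimal sketch that does not call the EML package at all. It assumes the value returned by eml_get is a list of entity names with an extra `@context` element attached; the `file_names` list below is a hand-built stand-in for that shape, and its placeholder context content is hypothetical, not real EML output.

```r
# Hand-built stand-in for the ASSUMED shape of an eml_get() result:
# three entity names plus a trailing `@context` element (placeholder content).
file_names <- list(
  "file_1.csv",
  "file_2.csv",
  "file_3.csv",
  `@context` = list(placeholder = "context")
)
remove <- c("file_1.csv", "file_2.csv")

# %in% compares every list element, so `@context` is tested as well
# and is reported as "not in remove".
keep_raw <- which(!(file_names %in% remove))
# keep_raw is c(3, 4): index 4 points at `@context`, not at an entity

# Dropping `@context` first restricts the comparison to the entity names.
names_only <- file_names
names_only$`@context` <- NULL
keep_clean <- which(!(unlist(names_only) %in% remove))
# keep_clean is 3
```

Under that assumption, using `keep_raw` to subset a list of three otherEntities would reference a fourth position that does not exist, which is one way the original snippet can return unexpected results.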

I have a very simple helper function that drops the context and attributes from the output of eml_get:

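eml_get_simple <- function(doc, element) {
    out <- eml_get(doc, element, from = "list")
    # drop the EML document context attached to the result
    out$`@context` <- NULL
    # strip all remaining attributes (names, class)
    attributes(out) <- NULL
    # flatten to a plain vector; only sensible for value-only elements
    out <- unlist(out)
    return(out)
}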

Do we think it would be worth including this in EML? Or perhaps adding the ability to pass a simple = TRUE argument to the existing eml_get function? I use that little helper every time I work with the EML package.

cboettig commented 4 years ago

I think a PR for this would be great. Arguably it should be the default behavior for eml_get, at least for dropping @context. I'm a bit less clear on dropping the attributes and calling unlist; that seems like it could have unexpected consequences in some cases?
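
As a rough illustration of the kind of consequence meant here, the sketch below applies base R's `unlist` to two hand-built lists. Both lists are hypothetical stand-ins for complex elements and are not output from the EML package.

```r
# Hypothetical stand-in for a complex element: two records with two fields each.
people <- list(
  list(givenName = "A", surName = "B"),
  list(givenName = "C", surName = "D")
)

# unlist() flattens recursively, so the boundary between records is lost:
# two records become one character vector of four values.
flat <- unlist(people)

# Hypothetical element that mixes a number and a string.
measurement <- list(value = 1.5, unit = "m")

# unlist() coerces everything to a common type, so the number becomes "1.5".
mixed <- unlist(measurement)
```

Here `length(people)` is 2 while `length(flat)` is 4, and `mixed` is a character vector, so code that expected one value per record or a numeric `value` would need to handle these cases separately.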

jeanetteclark commented 4 years ago

Yeah, probably. The unlist especially is just nice when you are returning a value-only element (like entityName). You are right that it produces some weirdness for complex elements and probably shouldn't be included.
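
One possible middle ground, sketched below purely as an assumption and not as anything in the EML package: always drop `@context`, but only flatten to a vector when every remaining element is a single atomic value. The helper name `simplify_values` is hypothetical, and it works on an already-retrieved list so the sketch runs without EML installed; the inputs are hand-built stand-ins.

```r
# Hypothetical helper, NOT part of the EML package.
# Takes a list shaped like the assumed eml_get() result.
simplify_values <- function(x) {
  # always drop the `@context` element, if present
  x[["@context"]] <- NULL

  # TRUE for elements that are a single atomic value (e.g. one string)
  is_scalar <- vapply(
    x,
    function(el) is.atomic(el) && length(el) == 1L,
    logical(1)
  )

  if (length(x) > 0L && all(is_scalar)) {
    # value-only element: return a plain, unnamed vector
    return(unlist(x, use.names = FALSE))
  }

  # complex element: keep the list structure untouched
  x
}

# Value-only input (stand-in for something like entityName)
simple_out <- simplify_values(
  list("a.csv", "b.csv", `@context` = list(placeholder = "context"))
)

# Complex input (stand-in for a nested element)
complex_out <- simplify_values(
  list(list(givenName = "A", surName = "B"),
       `@context` = list(placeholder = "context"))
)
```

With these inputs, `simple_out` is a character vector of the two file names, while `complex_out` stays a list with its nested record intact.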