ropensci / EML

Ecological Metadata Language interface for R: synthesis and integration of heterogenous data
https://docs.ropensci.org/EML
Other
98 stars 33 forks source link

set_attributes check_and_complete_columns should be a warning not an error #272

Open atn38 opened 5 years ago

atn38 commented 5 years ago
library(EML)

atts <- data.frame(attributeName = c('length_1', 'length_2', 'length_3'),
                   attributeDefinition = c('def1', 'def2', 'def3'),
                   measurementScale = c('ratio', 'ratio', 'ratio'),
                   domain = c('numericDomain', 'numericDomain', 'numericDomain'),
                   unit = c('meter','meter', 'kilometer'),
                   extra_column = c("ID", "ID", "ID"),
                   stringsAsFactors = FALSE)

missing_values <- data.frame(attributeName = c("length_1", "length_2" ,"length_2"), code = c("1", "2", "3"), definition = c("A", "B", "C"))

atts1 <- set_attributes(atts, missingValues = missing_values)

MRE returns Error: The column names 'extra_column' in the attributes table are not recognized. This seems to be a new check. I think having extra columns (for whatever purpose, I had an ID column in there) should return a warning instead of an error. Thoughts?

jeanetteclark commented 5 years ago

Hi @atn38 - I added this check when I was working in set_attributes on your feature request. I could see arguments either way for whether this should be a warning or an error - I chose error because misspelling or mis-remembering a column name is fairly common, and the result can be invalid EML that is tricky to diagnose. Additionally, set_attributes already can spit out a lot of warning text so I was reluctant to add the potential for more. Your use case is a good argument for a warning, however. Happy to hear other's thoughts on this.

atn38 commented 5 years ago

Gotcha, I can see it both ways now. I can always remove the extra ID type columns before feeding it to set_attributes, since creating EML is often the last step in a workflow and there's little need for these columns past that. If nobody else's got strong opinions it can stay an error.