ropensci / EML

Ecological Metadata Language interface for R: synthesis and integration of heterogenous data
https://docs.ropensci.org/EML
Other
97 stars 33 forks source link

Constructors not speaking EML 2.2.0 #287

Open amoeba opened 4 years ago

amoeba commented 4 years ago

Stevan Earl noted that the constructor functions don't seem to have arguments for the EML 2.2.0 element.

As an example, if you try eml$project(), you won't see an award argument.

It looks like the args are just the EML 2.1.1 args. I checked the 2.2.0 database/JSON file and it looks good so it seems like the bug might somewhere like https://github.com/ropensci/EML/blob/master/data-raw/construct_complexTypes.R.

I do see this: https://github.com/ropensci/EML/blob/e4d8f9edc1465f98be382262ff762fb8029da11d/data-raw/construct_complexTypes.R#L6-L7 so I wonder if this ever worked. @jeanetteclark do you know?

mbjones commented 4 years ago

Maybe just the second options call on line 7 needs to be commented out so that 2.2.0 is set for the eml_db option?

cboettig commented 4 years ago

Yup, Matt's suggestion sounds good, though I'm not sure these constructors were ever quite as robust or convenient as you might want them to be.

I have an experimental project over at https://github.com/cboettig/build.eml which tries to generate constructor functions for all complex types which are actually first class R package functions, with documentation etc. That project could use some more love but I think the basics should work at least as well as the built-in templates? Ultimately that does feel like a better way to go.

(Of course this is also by scraping the XSD files, something that usually pushes me past the limits of my poor grasp of XPATH).

(emld also has a lightweight, relatively undocumented template() method for creating templates -- I think it should be up-to-date 2.2.0 but haven't checked that either!)

jeanetteclark commented 4 years ago

I can't get construct_complexTypes.R to run with the EML version set to 2.2.0.

Here is the traceback:

eml <- lapply(who, EML:::template_constructor)
Error in parse(text = f) : <text>:1:63: unexpected ':'
1: function(section = list(), para = list(), markdown = NULL, xml:
                                                                  ^ 
  4. parse(text = f) at template_constructor.R#12
  3. eval(parse(text = f)) at template_constructor.R#12
  2. FUN(X[[i]], ...) 
  1. lapply(who, EML:::template_constructor) 

which leads me to think it is due to thexml:lang elements here (not present in eml-2.1.1, not sure why). Unfortunately I really am not sure how to fix this in a good way but it seems like it needs to be handled in emld::template() maybe?

jeanetteclark commented 4 years ago

I figured out a solution that just tosses the xml:lang out but I'm not sure if that is what we want or not... See branches below:

https://github.com/jeanetteclark/EML/tree/constructor_functions https://github.com/jeanetteclark/emld/tree/constructor_fix

What I'm even less sure of is how we can support both the EML 2.2.0 and EML 2.1.1 versions of the constructor functions. Maybe an EML::eml_version() helper that will call emld::eml_version and run construct_complexTypes.R? Or we keep two rda files in the package and load one of them according to the options set by emld? Nothing very satisfying is jumping out at the moment

cboettig commented 4 years ago

@jeanetteclark thanks for this.

I think the xml:lang element in the eml-2.2.0.json files should be tossed out (probably should be removed from the .json file directly, that file just exists to assist with the constructors in the first place.

Yeah, we could probably cludge something together to let a user call eml_version and have it switch out which rda file it loads for the constructors, but it having constructors that behave differently as a side effect of another function call seems like it could lead to more trouble.

Still, I think breaking out the constructor task into a separate package like I've tried to do in build.eml might be a better way to go. Then you get first-class function constructors, like so: https://cboettig.github.io/build.eml/reference/eml.html

With better schema parsing, these docs could indicate better what is required vs optional, what is repeatable, and add some decent examples, but these first class functions are already better than the constructors tacked in here (in EML package) I think?

I suppose one could have two versions of the package, one for 2.2.0 and one for 2.1.1 if necessary.

amoeba commented 4 years ago

Just sent in two (dependent) PRs for EML (https://github.com/ropensci/EML/pull/308) and emld (https://github.com/ropensci/emld/pull/66) to fix the xml:lang parsing issue and set up constructors for EML 2.2.0 by default. There's more thinking to do here (or in another issue) but this at least gets things working. I think we might be able to sneak this into the next release if others are keen.