ropensci / EML

Ecological Metadata Language interface for R: synthesis and integration of heterogeneous data
https://docs.ropensci.org/EML

Should we provide native constructor functions for all complex types? #252

Open cboettig opened 5 years ago

cboettig commented 5 years ago

One big thing I'm on the fence about for the next release is whether we should create explicit constructor functions for all EML objects, with complete documentation, like we do in https://github.com/cboettig/schemar. Currently we provide a single constructor function, eml, which lets you use tab completion to see all valid child elements / possible slots, but the elements don't have native R documentation, e.g.:

[Screenshot, 2018-11-27: tab completion on eml]

A more complete implementation would create functions like creator, each with full documentation of its slots, including which are required, etc. Doing so might risk namespace collisions or require prefixes, though.
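To make the idea concrete, here is a minimal sketch of what one such documented constructor could look like. It is only an illustration: the function body, the roxygen text, and the particular subset of slots are assumptions for this example, not the EML package's API and not checked against the schema.

```r
#' Create a creator element (hypothetical sketch, not the EML package API)
#'
#' @param individualName Name of the person, as a list (e.g. givenName, surName).
#' @param organizationName Name of the organization.
#' @param electronicMailAddress Contact email address.
#' @param id Optional identifier for the element.
#' @return A named list containing only the slots that were supplied.
creator <- function(individualName = NULL,
                    organizationName = NULL,
                    electronicMailAddress = NULL,
                    id = NULL) {
  # Illustrative subset of slots; a real version would be generated from the schema.
  slots <- list(
    individualName = individualName,
    organizationName = organizationName,
    electronicMailAddress = electronicMailAddress,
    id = id
  )
  # Drop slots the caller left unset so the result stays compact.
  Filter(Negate(is.null), slots)
}

# Example values are made up.
me <- creator(
  individualName = list(givenName = "Jane", surName = "Doe"),
  electronicMailAddress = "jane@example.org"
)
names(me)
```

With the arguments spelled out in the signature, tab completion and ?creator would both show the available slots, which is the main gain over a single generic constructor.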

amoeba commented 5 years ago

I think this is probably a must if we're hoping that users who are less familiar with EML will use the package. Exhaustive docs would let us document not only which sub-elements are allowed but also best practices for using them.

And I don't think prefixing with eml_ would be that bad if we had to go that route to avoid masking. Off the top of my head, I don't think EML element names collide too much with any names from the base set of packages.

I'm curious what others think.

cboettig commented 5 years ago

Yeah, full docs would be quite nice -- my intent is to pull all the documentation from the schema files. Apparently I need to up my XML parsing game to pull that off. (I admit I was impressed by how easy it was to extract the documentation for all of schema.org, which is all in RDFa... just a few SPARQL queries and I had what I needed...)

I'm worried about namespace collisions, or possibly just confusion, since a lot of EML node names are super-generic. I can already spot some that clearly collide with existing names (map, title, text, unit, complex, url, ...), but maybe that's ok?
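One rough way to size up the collision risk is to ask R where a function of each name is already visible. The sketch below uses utils::find() on the names listed above plus creator; the result depends on which packages happen to be attached in the session, so treat it as a quick check rather than a definitive list.

```r
# Node names mentioned in this thread as possible collisions.
node_names <- c("map", "title", "text", "unit", "complex", "url", "creator")

# For each name, list where a function of that name is visible on the
# current search path (this depends on which packages are attached).
where_defined <- lapply(node_names, function(n) utils::find(n, mode = "function"))
names(where_defined) <- node_names

# TRUE where a new function of that name would mask an existing one.
collides <- lengths(where_defined) > 0
print(collides)
```

Names that only collide with packages that are not attached (or not installed) will show up as FALSE here, so a fuller check would need to look across the packages users are likely to load alongside EML.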

Also a bit wary of having a huge namespace. I think it may make the package noticeably laggy to load (like schemar is, with its 624 classes; I think we have over 200 in EML) and may make it harder to find some of the helper functions. Maybe it's a non-issue though.

I might attempt this in a separate package, at least for starters. Creating these constructors wouldn't depend on anything in EML; just to create the objects, it wouldn't even depend on anything in emld.
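As a sketch of why no dependencies would be needed: constructors could be generated in base R from a table of element names and slots, with an eml_ prefix applied on the way. Everything here is hypothetical (the spec, the slot lists, make_constructor, and the generated names); the slot lists are placeholders and are not taken from the EML schema.

```r
# Hypothetical spec: element name -> allowed slot names (placeholders only).
spec <- list(
  creator = c("individualName", "organizationName", "electronicMailAddress"),
  title   = c("value", "lang")
)

# Function factory: returns a constructor that accepts only the given slots.
make_constructor <- function(slots) {
  force(slots)
  function(...) {
    args <- list(...)
    unknown <- setdiff(names(args), slots)
    if (length(unknown) > 0) {
      stop("Unknown slot(s): ", paste(unknown, collapse = ", "))
    }
    # The "object" is just a named list, so nothing beyond base R is needed.
    args
  }
}

# Generate one constructor per element, prefixed to avoid masking
# existing functions such as title().
constructors <- lapply(spec, make_constructor)
names(constructors) <- paste0("eml_", names(spec))

x <- constructors$eml_title(value = "My dataset")
str(x)
```

A generated constructor that takes ... loses the tab completion of named arguments, so a real build step would more likely write out full function definitions and documentation from the schema instead.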

cboettig commented 5 years ago

@amoeba ok, I made a crude mock-up of this approach at https://github.com/cboettig/build.eml/issues/2. Thoughts welcome on whether this is a good direction to pursue further...