EDIorg / ecocomDP

A dataset design pattern and R package for ecological community data.
https://ediorg.github.io/ecocomDP/

Add full EML metadata to read_data() return object #141

Open clnsmth opened 2 years ago

clnsmth commented 2 years ago

Metadata is an integral part of ecocomDP data discovery and use.

While information-rich metadata is available to end users in the EDI and NEON data portals, it is not available in the R environment. In addition, the metadata that read_data() does return is sparse, incomplete, and formatted inconsistently between the EDI and NEON implementations.

Because EDI and NEON both publish detailed metadata in the EML standard, it should be possible to include it in the metadata field of the read_data() return object. The EDI use case can be solved by simply reading the EML from the archived L1 data package and attaching it to the read_data() return object. In NEON's case, we'll need to create EML "on-the-fly" during the L0-to-L1 mapping process, since some returned metadata values are informed by user-supplied arguments to read_data() (e.g. temporalCoverage, geographicCoverage, and taxonomicCoverage).
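
To make the EDI case concrete, here is a rough sketch of the "read and attach" step. The helper attach_eml() is hypothetical and not part of ecocomDP. The list used as a stand-in for the read_data() return object is an assumption about its shape, and the inline EML string is a made-up minimal document rather than a real data package.

```r
library(xml2)

# Hypothetical helper (not an ecocomDP function): read EML from a file path,
# URL, or literal XML string and store it in the `metadata` field.
attach_eml <- function(dataset, eml_source) {
  dataset$metadata <- xml2::read_xml(eml_source)
  dataset
}

# Assumed stand-in for a read_data() result; the real structure may differ.
dataset <- list(metadata = NULL, tables = list())

# Made-up minimal EML document, used only for illustration.
eml_string <- paste0(
  '<eml:eml xmlns:eml="https://eml.ecoinformatics.org/eml-2.2.0" ',
  'packageId="example.1.1" system="example">',
  '<dataset><title>Example dataset</title></dataset>',
  '</eml:eml>'
)

dataset <- attach_eml(dataset, eml_string)

# The attached metadata can then be queried with XPath.
title <- xml2::xml_text(xml2::xml_find_first(dataset$metadata, "//dataset/title"))
```

In practice eml_source would point at the EML file of the archived L1 data package instead of an inline string.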

Returning EML metadata as an XML document, via xml2::read_xml(), facilitates working with XML and conversion to other user-preferred representations (e.g. a native list object via emld::as_emld() or JSON-LD via emld::as_json()).
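
A minimal sketch of those conversions, assuming the xml2 and emld packages are installed and that the example EML file referenced in the emld README (extdata/example.xml) ships with the installed version. The variable names are illustrative only.

```r
library(xml2)
library(emld)

# Example EML file bundled with emld (assumed to be present in the install).
eml_file <- system.file("extdata/example.xml", package = "emld")

# Read the EML as an XML document; this is the form proposed for the
# metadata field.
eml_xml <- xml2::read_xml(eml_file)

# Convert to a native R list for users who prefer `$` access.
eml_list <- emld::as_emld(eml_xml)
eml_list$dataset$title

# Or serialize the list representation to JSON-LD.
eml_json <- emld::as_json(eml_list)
```

Keeping the XML document as the returned form leaves the choice of representation to the user, since both conversions start from it.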

@sokole, how does this sound? What am I missing?