ropensci / RNeXML

Implementing semantically rich NeXML I/O in R
https://docs.ropensci.org/RNeXML

Conversion between EML and NeXML (at least for character data) #40

Closed cboettig closed 5 years ago

cboettig commented 10 years ago

Given a CSV (or Excel?) file containing (phenotypic) character data and an associated EML metadata file describing it, can we completely serialize the CSV file and its metadata into NeXML characters nodes using RNeXML?

rvosa commented 10 years ago

Nice use case. Do you have an example EML file?

cboettig commented 10 years ago

@rvosa Not one that includes a tree off the top of my head, but it's pretty easy to find examples that have character trait data across a range of species just by searching KNB. For instance, here's a kinda cool example for carnivorous plant species: https://knb.ecoinformatics.org/knb/metacat?action=read&qformat=knb&sessionid=0&docid=knb-lter-hfr.168.3

For instance, the first CSV file includes a continuous trait (growth rate) and a discrete trait (habitat) for a range of species, with some species having multiple observations and others not.
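To make the shape of that data concrete, here is a minimal Python sketch (standard library only) of the first step any conversion would need: separating continuous from discrete trait columns and grouping repeated observations by species. The column names `species`, `growth_rate`, and `habitat`, the helper `split_traits`, and all sample values are hypothetical placeholders, not taken from the KNB dataset. Treating a column as continuous when every value parses as a number is a crude heuristic; the EML metadata would be the proper source for that distinction.

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows: names and values are placeholders, not real KNB data.
SAMPLE_CSV = """species,growth_rate,habitat
Species A,0.12,bog
Species A,0.15,bog
Species B,0.30,fen
"""


def is_number(text):
    """Return True if the text parses as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def split_traits(csv_text, taxon_column="species"):
    """Classify trait columns and group observations by taxon.

    Hypothetical helper: a column counts as continuous only if every
    value in it parses as a number; all other columns count as discrete.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    trait_columns = [c for c in rows[0] if c != taxon_column]
    continuous = [c for c in trait_columns
                  if all(is_number(r[c]) for r in rows)]
    discrete = [c for c in trait_columns if c not in continuous]

    # Keep every observation: a taxon may appear on several rows.
    observations = defaultdict(list)
    for r in rows:
        observations[r[taxon_column]].append({c: r[c] for c in trait_columns})
    return continuous, discrete, dict(observations)


continuous, discrete, observations = split_traits(SAMPLE_CSV)

# Taxa with more than one observation cannot be written as a single
# matrix row without first deciding how to combine or split them.
repeated = [taxon for taxon, obs in observations.items() if len(obs) > 1]
```

The `repeated` list is where the mapping gets awkward: a character matrix with one row per taxon has no obvious place for several observations of the same species, so a converter would have to aggregate them or treat each observation as its own row.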

One of the awesome things about KNB is that it indexes all the metadata, so it is possible to query for data matching a particular attribute or species. Unfortunately, there are no semantics around those attributes, so you just have to hope the data creator describes the trait with some of the same words that you do. I was poking around the phenoscape project website and papers this morning, and the scope and depth of expression in that project is kinda mind-blowing.

cboettig commented 5 years ago

I think EML descriptions of trait data files would generally be too imprecise to map to either a NeXML character matrix or other, richer semantic versions. Going the other way would be more straightforward, of course (generating an EML entry for NeXML).
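As a rough illustration of that easier direction, the Python sketch below builds a simplified EML-style `attributeList` from character definitions, using only the standard library. The helper `eml_attribute`, the character names, the definitions, the state meanings, and the `dimensionless` unit are all hypothetical placeholders. Mapping continuous characters to a `ratio` scale and discrete ones to `nominal` is an assumption, and the output has not been validated against the EML schema.

```python
import xml.etree.ElementTree as ET


def eml_attribute(name, definition, kind, unit=None, states=None):
    """Build one simplified EML-style <attribute> element.

    Hypothetical helper. kind is "continuous" or "discrete"; unit is
    required for continuous characters, states (code -> meaning) for
    discrete ones.
    """
    attribute = ET.Element("attribute")
    ET.SubElement(attribute, "attributeName").text = name
    ET.SubElement(attribute, "attributeDefinition").text = definition
    scale = ET.SubElement(attribute, "measurementScale")

    if kind == "continuous":
        if unit is None:
            raise ValueError("continuous characters need a unit")
        ratio = ET.SubElement(scale, "ratio")
        unit_el = ET.SubElement(ratio, "unit")
        ET.SubElement(unit_el, "standardUnit").text = unit
        domain = ET.SubElement(ratio, "numericDomain")
        ET.SubElement(domain, "numberType").text = "real"
    elif kind == "discrete":
        if not states:
            raise ValueError("discrete characters need at least one state")
        nominal = ET.SubElement(scale, "nominal")
        domain = ET.SubElement(nominal, "nonNumericDomain")
        enumerated = ET.SubElement(domain, "enumeratedDomain")
        for code, meaning in states.items():
            code_def = ET.SubElement(enumerated, "codeDefinition")
            ET.SubElement(code_def, "code").text = code
            ET.SubElement(code_def, "definition").text = meaning
    else:
        raise ValueError("unknown character kind: " + kind)
    return attribute


# Placeholder characters standing in for what a character matrix defines.
attribute_list = ET.Element("attributeList")
attribute_list.append(
    eml_attribute("growth_rate", "placeholder definition",
                  "continuous", unit="dimensionless"))
attribute_list.append(
    eml_attribute("habitat", "placeholder definition",
                  "discrete", states={"bog": "placeholder meaning"}))

xml_text = ET.tostring(attribute_list, encoding="unicode")
```

The unit is the one piece the sketch cannot derive: EML expects it for numeric attributes, so it is passed in by hand here.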