ropensci / dataspice

:hot_pepper: Create lightweight schema.org descriptions of your datasets
https://docs.ropensci.org/dataspice

Do a metadata standard/methods review #14

Closed amoeba closed 6 years ago

amoeba commented 6 years ago

It'd be great to have a survey of metadata standards for users of our package who are new to metadata.

learithe commented 6 years ago

Publishing papers on DNA/RNA sequence data typically requires uploading the data to NCBI GenBank. That in turn requires registering a "biosample", whose minimal metadata requirements vary according to sample type. Many of these requirements are defined by the Genomic Standards Consortium, which is probably relevant to this project.

In the ecology space, for metagenomic studies (e.g. environmental "eDNA" surveys or microbial ecology studies using 16S marker gene sequencing), there is a "minimum information about a marker gene sequence" (MIMARKS) metadata framework developed by the Genomic Standards Consortium (Yilmaz et al. 2011).

The Wellcome Trust Sanger Centre recently developed a tool, EpiCollect, for easily generating metadata collection forms & associated databases; people can download the forms to their phones and fill them out in the field. It's agnostic to application (see example projects here), but was originally designed to ease metadata collection in the field for epidemiologists and ecologists. See publication.

amoeba commented 6 years ago

Thanks @learithe !

I'll add to this:

magpiedin commented 6 years ago

Ditto--thanks @learithe !

The RDA list includes "MIBBI" & "Genome Metadata" but seems to be missing out on MIMARKS -- We'll ask them about it!

amoeba commented 6 years ago

magpiedin commented 6 years ago

Just getting around to adding some of those to a "Resources" section in the readme... (not sure if i botched a merge/pr? apologies if so :grimacing: )

Also -- FAIRsharing.org seems to cover a nice wide set of domain standards & looks a bit more up-to-date/maintained than the RDA metadata directory.

They have an API if folks think it would be worth trying to do something fancy with that.

amoeba commented 6 years ago

Thanks @magpiedin for your PR (merged in 2711331c824377aa876f8b7565a8970ef14542d9). Do we wanna close this Issue for now?

magpiedin commented 6 years ago

That sounds wise -- & thanks @amoeba for keeping stuff focused!

annakrystalli commented 6 years ago

Just been chatting to someone about High-Throughput Assays And Experimental Metadata, and they have their own R class, called eSets, to contain the resulting data. Just adding it here for now, but I thought it an interesting approach to be aware of.