amoeba closed this issue 6 years ago
Publishing papers on DNA/RNA sequence data typically requires uploading the data to NCBI GenBank, which in turn requires registering a "biosample". Each biosample has minimal metadata requirements that vary according to sample type. Many of these are defined by the Genomic Standards Consortium, which is probably relevant to this project.
In the ecology space, for metagenomic studies (e.g. environmental "eDNA" surveys or microbial ecology studies using 16S marker gene sequencing), there is a "minimum information about a marker gene sequence" (MIMARKS) metadata framework developed by the Genomic Standards Consortium (Yilmaz et al. 2011).
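To make the "minimal metadata requirements" idea concrete, here's a rough sketch of checking a sample record against a required-field list. The field names are only an illustrative subset written in the style of the consortium's checklists, not the official MIMARKS list, and `missing_fields` and the sample record are made up for this example.

```python
# Illustrative subset of checklist-style field names (an assumption for this
# sketch); the real required set depends on the checklist and sample type.
REQUIRED_FIELDS = {
    "collection_date",
    "lat_lon",
    "geo_loc_name",
    "env_medium",
    "target_gene",
}


def missing_fields(record, required=REQUIRED_FIELDS):
    """Return the required field names that are absent or blank in `record`."""
    return sorted(
        name for name in required
        if str(record.get(name) or "").strip() == ""
    )


# Hypothetical sample record: one field left blank, one not supplied at all.
sample = {
    "collection_date": "2017-06-01",
    "lat_lon": "64.84 N 147.72 W",
    "geo_loc_name": "",
    "target_gene": "16S rRNA",
}

print(missing_fields(sample))  # ['env_medium', 'geo_loc_name']
```

Swapping in a different `required` set per sample type would be one way to model the "vary according to sample type" part.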
The Wellcome Trust Sanger Centre recently developed a tool, EpiCollect, to easily generate metadata collection forms & associated databases which people can download to their phones to fill out in the field. It's agnostic to application (see example projects here), but was originally designed to ease metadata collection in the field for epidemiologists and ecologists. See publication
Just getting around to adding some of those to a "Resources" section in the readme... (not sure if i botched a merge/pr? apologies if so :grimacing: )
Also -- Fairsharing.org seems to cover a nice wide set of domain standards & looks a bit more up-to-date/maintained than the RDA metadata directory.
They have an API if folks think it would be worth trying to do something fancy with that.
Thanks @magpiedin for your PR (merged in 2711331c824377aa876f8b7565a8970ef14542d9). Do we wanna close this Issue for now?
That sounds wise -- & thanks @amoeba for keeping stuff focused!
Just been chatting to someone about High-Throughput Assays And Experimental Metadata and they have their own class in R, called eSets, to contain the resulting data. Just adding it here for now but thought it an interesting approach to be aware of.
It'd be great to have a survey of metadata standards for users of our package who are new to metadata.
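For anyone who hasn't seen that pattern: as I understand it, the idea is to keep the assay values and the metadata describing each sample in one object so they can't drift apart. Below is a minimal Python sketch of that idea only. `AssaySet` and its fields are hypothetical names for illustration and don't mirror the R class's actual slots or methods.

```python
from dataclasses import dataclass


@dataclass
class AssaySet:
    """Hypothetical eSet-like container: assay values plus sample metadata."""
    assay: dict            # feature name -> list of values, one per sample
    samples: list          # sample IDs, in the same order as the assay values
    sample_metadata: dict  # sample ID -> dict of descriptive fields

    def __post_init__(self):
        # Refuse to build an object whose data and metadata disagree.
        for feature, values in self.assay.items():
            if len(values) != len(self.samples):
                raise ValueError(
                    f"{feature}: expected {len(self.samples)} values, "
                    f"got {len(values)}"
                )
        missing = [s for s in self.samples if s not in self.sample_metadata]
        if missing:
            raise ValueError(f"no metadata for samples: {missing}")

    def subset(self, keep):
        """Return a new AssaySet restricted to the sample IDs in `keep`."""
        idx = [i for i, s in enumerate(self.samples) if s in keep]
        kept = [self.samples[i] for i in idx]
        return AssaySet(
            assay={f: [v[i] for i in idx] for f, v in self.assay.items()},
            samples=kept,
            sample_metadata={s: self.sample_metadata[s] for s in kept},
        )


# Made-up data, purely to show usage.
experiment = AssaySet(
    assay={"geneA": [1.0, 2.0, 3.0]},
    samples=["s1", "s2", "s3"],
    sample_metadata={
        "s1": {"tissue": "leaf"},
        "s2": {"tissue": "root"},
        "s3": {"tissue": "leaf"},
    },
)
leaf_only = experiment.subset({"s1", "s3"})
print(leaf_only.samples)  # ['s1', 's3']
```

The point of `subset` is that slicing the data slices the metadata in the same step, which seems to be the main appeal of the approach.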