biocodellc / geome-ui

Derived Data Extension #422

Open jdeck88 opened 4 years ago

jdeck88 commented 4 years ago

In connection with the Ira Moana Project, and useful for many other efforts, we need a way to generically connect Tissues with downstream analytical data such as genotype files, OTU tables, or ASV files. We are not proposing to store the raw data, but rather to provide a means of linking to the derived data files for each tissue. This then becomes a more convenient method for linking metadata with analysis pipelines, especially with regard to R package development.

Here is a list of the proposed data attributes of the derivedData Class:

Note that we're not trying to build processing of this data into GEOME; rather, we just want to collect metadata on the associated files and enable linking.

liblig commented 4 years ago

Before you action any of these, John, please check back in with me regarding a few vocab changes.

jdeck88 commented 4 years ago

Will do... I'll message you off-list to set up a meeting time.

jdeck88 commented 4 years ago

Recap from John and Libby conversation on July 8:

  1. One idea is to create these fields as tissue data properties and call them derivedGeneticDataFilename, derivedGeneticDataURI, etc., with pipe delimiters separating multiple values.

  2. Another idea is to replicate the other metadata on each row and vary only the derived data.

The second option is truer to the GEOME model.
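
To make the difference between the two options concrete, here is a rough Python sketch that expands one pipe-delimited record (option 1) into replicated rows (option 2). Only the field names derivedGeneticDataFilename and derivedGeneticDataURI come from the discussion above; the tissueID key, the example values, and the expand_pipe_delimited helper are hypothetical and are not part of GEOME.

```python
from typing import Dict, List

# Field names from option 1 above; "tissueID" and all values are hypothetical.
DERIVED_FIELDS = ["derivedGeneticDataFilename", "derivedGeneticDataURI"]


def expand_pipe_delimited(record: Dict[str, str]) -> List[Dict[str, str]]:
    """Turn one option-1 record into option-2 rows (one row per derived file)."""
    split = {f: record.get(f, "").split("|") for f in DERIVED_FIELDS}
    counts = {len(values) for values in split.values()}
    if len(counts) != 1:
        raise ValueError("derived data fields have mismatched numbers of values")
    n = counts.pop()
    rows = []
    for i in range(n):
        row = dict(record)  # replicate the other metadata on every row
        for f in DERIVED_FIELDS:
            row[f] = split[f][i].strip()  # vary only the derived data
        rows.append(row)
    return rows


tissue = {
    "tissueID": "T-001",
    "derivedGeneticDataFilename": "sample.vcf|otu_table.tsv",
    "derivedGeneticDataURI": "https://example.org/sample.vcf|https://example.org/otu_table.tsv",
}
rows = expand_pipe_delimited(tissue)
for row in rows:
    print(row)
```

The sketch also shows one weakness of option 1: the pipe-delimited fields must stay aligned with each other, which is why the helper raises an error when the value counts differ. With option 2, each row carries exactly one filename and one URI, so that alignment problem does not arise.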

jdeck88 commented 3 years ago

See also: https://tools.gbif.org/dwca-validator/extension.do?id=http://rs.gbif.org/terms/1.0/DNADerivedData

https://docs.gbif-uat.org/publishing-dna-derived-data/1.0/en/