ComPlat / chemotion_REPO

Repository for samples, reactions and related research data
https://www.chemotion-repository.net
GNU Affero General Public License v3.0

downloaded datasets miss their metadata #32

Open tilfischer opened 1 year ago

tilfischer commented 1 year ago

Downloaded analytical datasets contain a dataset_description.txt, which lists the dataset name, instrument, description, and the files in the dataset together with their checksums. The sample's metadata, however, is not part of the downloaded ZIP file and has to be downloaded manually.

As an enhancement, I suggest always adding the sample's metadata to the downloaded ZIP file of a dataset (e.g. 1H NMR spectroscopic data) by using BagIt. Besides the metadata in XML format, a rendered version, e.g. as an HTML file, would be handy for human readers. Shipping the metadata with the downloaded dataset would also link that dataset and its analytical data to the other datasets holding analytical data of the same sample, because the sample metadata already lists the related datasets as related identifiers.

If the datasets contained the metadata in XML format (and possibly rendered as HTML for better human readability), the dataset_description.txt could be omitted and the checksums could be listed in a separate text file.
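To make the proposal more concrete, here is a minimal sketch of what such a bag could look like, using only the Python standard library. It is not how the repository builds its ZIP files. The names make_bag, write_manifest and sha256_of, the metadata/sample_metadata.xml location and the choice of SHA-256 are all assumptions for illustration. Only the bagit.txt declaration, the data/ payload directory and the manifest file naming follow BagIt 1.0 (RFC 8493).

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(bag_dir: Path, manifest_name: str, files: list[Path]) -> None:
    """Write one '<checksum>  <path relative to the bag>' line per file."""
    lines = [
        f"{sha256_of(f)}  {f.relative_to(bag_dir).as_posix()}\n"
        for f in sorted(files)
    ]
    (bag_dir / manifest_name).write_text("".join(lines), encoding="utf-8")


def make_bag(bag_dir: Path, payload: dict[str, bytes], sample_metadata_xml: str) -> None:
    """Create a BagIt-style bag: analytical files under data/, metadata as a tag file."""
    data_dir = bag_dir / "data"
    data_dir.mkdir(parents=True, exist_ok=True)
    for name, content in payload.items():
        (data_dir / name).write_bytes(content)

    # Bag declaration required by BagIt 1.0.
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )

    # Hypothetical location and file name for the sample metadata;
    # the thread leaves the actual format open.
    meta_dir = bag_dir / "metadata"
    meta_dir.mkdir(exist_ok=True)
    meta_file = meta_dir / "sample_metadata.xml"
    meta_file.write_text(sample_metadata_xml, encoding="utf-8")

    # The payload manifest takes over the checksum list that
    # dataset_description.txt carries today.
    payload_files = [p for p in data_dir.rglob("*") if p.is_file()]
    write_manifest(bag_dir, "manifest-sha256.txt", payload_files)

    # The tag manifest covers the non-payload files, including the metadata.
    tag_files = [bag_dir / "bagit.txt", bag_dir / "manifest-sha256.txt", meta_file]
    write_manifest(bag_dir, "tagmanifest-sha256.txt", tag_files)
```

Calling make_bag(Path("bag"), {"1H_NMR.jdx": b"..."}, "<resource/>") would produce a directory that can be zipped as-is. A rendered HTML version of the metadata could sit next to the XML file and be listed in the tag manifest in the same way.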

nicolejung commented 1 year ago

needs discussion on format but the point is right

tilfischer commented 1 year ago

People from other NFDI4Chem repositories to contact about this are the RADAR(4Chem) and nmrXiv people. Both have already implemented BagIt or will implement it.

Edit: Connected to: https://github.com/ComPlat/chemotion_REPO/issues/10

tilfischer commented 6 months ago

nicolejung wrote: "needs discussion on format but the point is right"

BagIt is a good starting point, as RO-Crates could be added to BagIt at a later point in time; see https://www.researchobject.org/ro-crate/1.1/appendix/implementation-notes.html#adding-ro-crate-to-bagit
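As a rough illustration of that later step, the sketch below writes a bare-bones ro-crate-metadata.json into the data/ payload directory of an existing bag, which is where the linked implementation notes place the RO-Crate root. The function name write_ro_crate_metadata and its parameters are hypothetical. The root dataset here carries only a name and a file list and omits further properties the RO-Crate specification asks for, so this is not a complete crate.

```python
import json
from pathlib import Path


def write_ro_crate_metadata(bag_dir: Path, name: str, file_names: list[str]) -> Path:
    """Write a minimal RO-Crate 1.1 metadata file into the bag's data/ directory."""
    data_dir = bag_dir / "data"
    data_dir.mkdir(parents=True, exist_ok=True)

    graph = [
        {
            # Metadata file descriptor, pointing at the root dataset.
            "@id": "ro-crate-metadata.json",
            "@type": "CreativeWork",
            "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
            "about": {"@id": "./"},
        },
        {
            # Root dataset: the payload directory itself.
            "@id": "./",
            "@type": "Dataset",
            "name": name,
            "hasPart": [{"@id": f} for f in file_names],
        },
    ]
    # One File entity per payload file, with paths relative to data/.
    graph += [{"@id": f, "@type": "File"} for f in file_names]

    crate = {"@context": "https://w3id.org/ro/crate/1.1/context", "@graph": graph}
    target = data_dir / "ro-crate-metadata.json"
    target.write_text(json.dumps(crate, indent=2), encoding="utf-8")
    return target
```

Because the file lands inside data/, the bag's payload manifest has to be regenerated afterwards so that it also lists ro-crate-metadata.json.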

Best, Tillmann