Closed lucasgautheron closed 2 years ago
This sounds great, and I second the solution.
I also wonder whether we want to add something that nobody includes yet but that, I think, will become increasingly necessary: proof of ethical permission for the data collection and sharing, and a sample consent form. That will definitely not be machine-readable for now.
How about contact information for the authors? For EL1000, see table under this header. Author contact info should stay with the data, and it can change too (e.g. if someone retires).
Regarding authorship, can we use this format? https://gin.g-node.org/G-Node/Info/wiki/DOIfile#creating-a-datacite-metadata-file
On GIN, once this file has been created, the information is shown at the bottom of the repository's main page; see here for an example: https://gin.g-node.org/LAAC-LSCP/managing-storing-sharing-paper
Is your feature request related to a problem? Please describe.
Datasets always need to be documented. Documentation may include information about:
Ideally, some of the documentation should be machine-readable in order to improve discoverability. Machine-readability may also be exploited by DataLad's metadata extractors.
For instance, GIN uses the datacite scheme (written in YAML) to generate the DOI and the metadata associated with it: https://gin.g-node.org/G-Node/Info/wiki/DOIfile#creating-a-datacite-metadata-file.
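To illustrate what machine-readable metadata makes possible, here is a minimal sketch of a completeness check run on metadata that has already been loaded into a dictionary (for instance from a YAML file). The field names in `REQUIRED` and the helper `missing_fields` are assumptions made for this example, not taken from the GIN documentation; the actual required keys should be checked against the datacite page linked above.

```python
# Hypothetical list of required keys; the real scheme may differ.
REQUIRED = ("title", "authors", "description", "license")


def missing_fields(metadata, required=REQUIRED):
    """Return the required keys that are absent or empty in `metadata`."""
    return [key for key in required if not metadata.get(key)]


# Placeholder metadata, as it might look after parsing a YAML file.
metadata = {
    "title": "Example dataset",
    "authors": [{"firstname": "Jane", "lastname": "Doe"}],
    "description": "",  # present but empty, so it counts as missing
}

print(missing_fields(metadata))  # ['description', 'license']
```

A check like this could run automatically before a dataset is published, so that incomplete metadata is caught early.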
The variables can also be documented using machine-readable formats. The most obvious candidates are CSV, YAML, or XML. However, it is likely that some of this information won't fit into rigid structures. For such information, we should probably encourage people to use formats such as Markdown rather than docx.
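As a rough sketch of what a machine-readable variable description could enable, the snippet below reads a small CSV "variable dictionary" and lists the dataset columns that have no documentation. The column layout (`name`, `type`, `unit`, `description`), the variable names, and the helper functions are all hypothetical and only serve as an illustration.

```python
import csv
import io

# Hypothetical variable dictionary; in practice this would be a CSV file
# stored alongside the data.
VARIABLES_CSV = """name,type,unit,description
child_age,integer,months,Age of the child at the time of the recording
recording_duration,integer,milliseconds,Total duration of the recording
"""


def load_variable_docs(text):
    """Parse the CSV text into a mapping from variable name to its row."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["name"]: row for row in reader}


def undocumented(columns, docs):
    """Return the columns that have no entry in the variable dictionary."""
    return [column for column in columns if column not in docs]


docs = load_variable_docs(VARIABLES_CSV)
print(undocumented(["child_age", "recording_duration", "site"], docs))  # ['site']
```

Free-text information that does not fit this tabular shape would still live in a Markdown file next to it.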
Describe the solution you'd like