Open mih opened 5 years ago
I would be more than happy to help wherever I can.
Here is a dump of ideas for what I would have found useful when encountering a dataset I don't know.
The transfer alerted me to this nice old idea from @mih. I think the issue might move, but it should not be deprecated! Eventually we should arrive at such a metadata representation. I wonder, though, whether we should maybe take some specific domain/standard first to see what is missing, e.g. BIDS and neuroimaging data.
Here is a reproducible analysis made by @adswa using open source tools and public data: https://github.com/adswa/multimatch_forrest
Pretty much all steps were captured via datalad, but the presentation is suboptimal, because it is solely based on a README that does not contain much information.
We should be able to compose suitable README content from the dataset (and its metadata) for such an analysis:
Here are a few more datasets that "look" better, but the looks are also just based on a manually composed README:
Of course, not everything can be inferred and automated, but being able to generate valid and informative description snippets would substantially lower the bar for having nice READMEs.
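
To make the idea a bit more concrete, here is a minimal sketch of what composing README snippets from metadata could look like. It is plain Python and does not use any datalad API. The metadata is assumed to be already available as a simple mapping, and all key names (`name`, `description`, `authors`, `license`, `inputs`, `commands`) as well as the function `compose_readme` are hypothetical, chosen only for illustration.

```python
from typing import Any, List, Mapping


def _bullet_section(title: str, items: List[str]) -> List[str]:
    """Return a markdown section with one bullet per item, or nothing if empty."""
    if not items:
        return []
    return [f"## {title}", ""] + [f"- {item}" for item in items] + [""]


def compose_readme(meta: Mapping[str, Any]) -> str:
    """Compose README text from a plain metadata mapping.

    The keys read here are hypothetical placeholders, not an existing
    metadata schema. Missing keys simply produce no section.
    """
    lines = [f"# {meta.get('name', 'Untitled dataset')}", ""]

    description = meta.get("description")
    if description:
        lines += [description, ""]

    lines += _bullet_section("Authors", list(meta.get("authors", [])))
    lines += _bullet_section("Input data", list(meta.get("inputs", [])))

    # Recorded commands are rendered as inline code, one bullet each.
    commands = [f"`{cmd}`" for cmd in meta.get("commands", [])]
    lines += _bullet_section("Recorded analysis steps", commands)

    license_name = meta.get("license")
    if license_name:
        lines += ["## License", "", license_name, ""]

    # Normalize trailing whitespace to exactly one newline.
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    example = {
        "name": "Example analysis",
        "description": "A short description taken from the metadata.",
        "authors": ["Jane Doe"],
        "inputs": ["inputs/rawdata"],
        "commands": ["python code/analysis.py"],
        "license": "CC0",
    }
    print(compose_readme(example))
```

Anything that cannot be inferred would still need to be written by hand, but each section that can be generated is one less thing to maintain manually.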