RohanAlexander / telling_stories

Telling Stories with Data
https://rohanalexander.github.io/telling_stories/

Ben - ch 12 #48

Closed RohanAlexander closed 1 year ago

RohanAlexander commented 2 years ago

Using established repositories is indeed more formal, with some quality control (QC) done at ingress and clear egress mechanisms. This is a big deal in the life sciences.

You define personally identifying information (PII) more than once (at least twice).

You could say a bit more about “data provenance”, i.e., documenting the origin of the data and any transformation or harmonization that took place, as well as detailed versioning. To achieve full data provenance, my team has developed ORCESTRA (https://www.orcestra.ca/); the publication is available here: https://www.nature.com/articles/s41467-021-25974-w. ORCESTRA only contains datasets from the life sciences, but there is no reason it could not be used in other fields. I hope you will find it of interest.

RohanAlexander commented 1 year ago

Fixed the PII thing.

Added a paragraph about data provenance.