The use of established repositories is indeed more formal, with some QC being done at ingress and clear egress mechanisms. This is a big deal in life sciences.
You define personally identifying information (PII) multiple times (at least twice).
You could say a bit more about “data provenance”, i.e., documenting the origin of the data and any transformation/harmonization that took place, as well as detailed versioning. To achieve full data provenance, my team has developed ORCESTRA (https://www.orcestra.ca/); the publication is available here: https://www.nature.com/articles/s41467-021-25974-w. ORCESTRA only contains datasets from the life sciences, but there is no reason it could not be used in other fields. I hope you will find it of interest.