Sage-Bionetworks / dccvalidator

Metadata Validation for Data Coordinating Centers
https://sage-bionetworks.github.io/dccvalidator/

additional metadata variables #305

Open amapeters opened 4 years ago

amapeters commented 4 years ago

I would like to discuss what process to use when a data contributor has more variables than we ask for in the metadata templates.

I anticipate this will mostly be an issue with the 'individual' data. We do not want to discourage people from including more clinical and demographic variables than the minimal set requires. For AMP-AD we have told people to add any additional variables they have to the individual file (after the minimal set) and provide a data dictionary, but we have not provided instructions for how to do this (e.g. a separate PDF, CSV...). Suggestions?

This is more of an issue for the Alzheimer's data than for the neuropsychiatric data, since there are many AD diagnostic tests beyond the pathology measurements we ask for (Braak and CERAD). We do not want to hinder people from providing that data if they have collected it.

This may not be an issue for PsychENCODE, since the donors are mostly brain bank donations and Dx is based on DSM, but it may be something we should keep in mind.

For specimen and assay metadata, I think we should encourage people to tell us if there are additional variables they think should be included, and evaluate whether those variables should be made part of the template, but not let people add them as additional variables to the file.

karawoo commented 4 years ago

> For AMP-AD we have told people to add any additional variables they have to the individual file (after the minimal set) and provide a data dictionary, but we have not provided instructions for how to do this (e.g. a separate PDF, CSV...). Suggestions?

This could be another file upload in the dccvalidator app. We should decide how we want to make that information available in the portal; in the past, data dictionaries have lived as wikis on files in Synapse, which doesn't seem ideal. How we want to surface the information may dictate what format we ask contributors to share it in.
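
To make the idea concrete, here is a rough sketch of the kind of check such an upload could enable, assuming the dictionary is shared as a CSV. It is written in Python purely for illustration and is not part of dccvalidator; the template column list, the `variable` column name, and the function names are all hypothetical.

```python
import csv
import io

# Hypothetical minimal set for the individual template; the real columns may differ.
TEMPLATE_COLUMNS = ["individualID", "sex", "Braak", "CERAD"]


def read_header(csv_text):
    """Return the column names from the first row of CSV text."""
    reader = csv.reader(io.StringIO(csv_text))
    return next(reader, [])


def undocumented_columns(metadata_csv, dictionary_csv, template_columns=TEMPLATE_COLUMNS):
    """Return extra metadata columns that have no entry in the data dictionary.

    Assumes the dictionary is a CSV with a 'variable' column (hypothetical layout).
    """
    # Columns beyond the minimal set are the contributor's additional variables.
    extra = [c for c in read_header(metadata_csv) if c not in template_columns]
    documented = {row["variable"] for row in csv.DictReader(io.StringIO(dictionary_csv))}
    return [c for c in extra if c not in documented]


# Toy example: two additional variables, only one of them documented.
metadata = "individualID,sex,Braak,CERAD,MMSE,CDR\nA1,female,4,2,22,1\n"
dictionary = "variable,description\nMMSE,Mini-Mental State Examination score\n"
print(undocumented_columns(metadata, dictionary))  # ['CDR']
```

A structured format like CSV would make this sort of check possible; a PDF dictionary would not, which may be one more input into the format decision.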

> For specimen and assay metadata, I think we should encourage people to tell us if there are additional variables they think should be included, and evaluate whether those variables should be made part of the template, but not let people add them as additional variables to the file.

Why not?