ontology: study_inHouseData

AndrewSchork commented 4 years ago

construct an ontology for study_inHouseData

AndrewSchork commented 4 years ago

https://docs.google.com/spreadsheets/d/1JowLmxixDu7oDYDtG984UZNU8HOxlFOZJq8bHD8jJ4E/edit?usp=sharing

review and make suggestions

rzetterberg commented 3 years ago

When I implemented #112 I added the available values from the document you provided, so we already have support for checking that study_inHouseData can only contain the allowed values.

Here's what the part of the schema for this field looks like:

  study_inHouseData:
    description: |
      If iPSYCH data, UKBiobank, or some other in house data set that we analyze,
      is in this study then this is very important to mark.
      Consider checking PMID in external inventories.
      List of studies to watch out for is provided in the ontology doc.
      - Ontology: https://docs.google.com/spreadsheets/d/1qghudJelGssaTbe8CDAOHOk7fhpyDAwEKGkOBMqGb3M/
      - External inventories: https://docs.google.com/spreadsheets/d/1NtSyTscFL6lI5gQ_00bm0reoT6yS2tDB3SHhgM7WwSE/
    type: "string"
    enum:
      - "none"
      - "iPSYCH2012"
      - "iPSYCH2015"
      - "UKB"
      - "GEMS"

The values listed under the enum property are the only ones allowed. Also note that instead of using the value "missing", you simply omit the study_inHouseData field from the metadata file.
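To illustrate the rule, here is a minimal sketch in plain Python of what the check amounts to. It is not the pipeline's actual validator: the function name check_in_house_data and the error messages are made up for this example, and only the field name and the allowed values are taken from the schema excerpt.

```python
# Allowed values copied from the enum in the schema excerpt.
ALLOWED_IN_HOUSE_DATA = {"none", "iPSYCH2012", "iPSYCH2015", "UKB", "GEMS"}


def check_in_house_data(metadata: dict) -> list[str]:
    """Hypothetical helper: return error messages, or an empty list if valid."""
    field = "study_inHouseData"

    # An absent field plays the role of "missing", so it is not an error.
    if field not in metadata:
        return []

    value = metadata[field]

    # The schema declares the field as type "string".
    if not isinstance(value, str):
        return [f"{field} must be a string, got {type(value).__name__}"]

    # Anything outside the enum is rejected, including the literal "missing".
    if value not in ALLOWED_IN_HOUSE_DATA:
        return [f"{field} has unsupported value {value!r}"]

    return []
```

With this sketch, a metadata dict without the field passes, a dict with "UKB" passes, and a dict with "missing" is rejected because "missing" is not in the enum.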

pappewaio commented 3 years ago

I think the enum looks good, but I think we should rename the field from study_inHouseData to study_includedCohorts. If we release the pipeline, the term "in-house data" might not be appropriate. @AndrewSchork, what do you think?

AndrewSchork commented 3 years ago

Seems logical. Go for it.

rzetterberg commented 3 years ago

Alright, I'll make it happen.

rzetterberg commented 3 years ago

PR ready for review: #133

BioPsyk / cleansumstats

ontology: study_inHouseData #80