aneilbaboo closed 6 years ago
@aneilbaboo Almost all of the data from Matty's mock-up report for MTHFR has been distilled into a set of YAML and Markdown files in the repo. The way the data is organized can still be fine-tuned, but it should be enough to start those discussions and to start writing code that parses the files. I will start documenting the format so that Vishesh & Matty can look it over. Will they be able to access the Wiki for this repo?
Ah, I misunderstood. Here is the mock data in GA4GH JSON format: https://github.com/precisely/gene-panel-curation/tree/master/mock-user-data
These are the input files to the (Genetics) Analytics Service. We need the output of that service: JSON files suitable for loading into the Genetics Service - the yellow highlighted parts here:
See also the Database architecture document: https://docs.google.com/document/d/1E31Oted7_QN7bCbjnJN6k1X6eP9b-rFcI-uV8vjsxbg/edit
Specifically, we need to know: what are the SVN variant names for the various states? What are the gene names? How will the various situations be represented in the Genetics Service?
We need JSON that will contain values for the fields in the Genetics model:
[
  {
    "user_data_type_id": "{barcode-id-from-akesogen}",
    "gene": "mthfr",
    "source": "akesogen:genotyping",
    "labAnalysisId": "...", // equivalent to the variantsetId --- identifies a particular reading
    "variant": "....", // <--- need your help here
    "createdAt": ...,
    "updatedAt": ...
  },
  {
    // ... another gene genotype for the same user
  },
  // ... etc.
]
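For anyone generating these mock files by hand, here is a rough Python sketch of how one record with the fields above could be assembled and checked before serializing. It is only an illustration: the helper names (`make_genotype_record`, `missing_fields`), the mock barcode and analysis IDs, the ISO 8601 timestamp format, and the `VARIANT-NAME-TBD` placeholder are all assumptions, since the actual variant naming is still the open question here.

```python
import json
from datetime import datetime, timezone

# Field names copied from the Genetics model example above.
REQUIRED_FIELDS = (
    "user_data_type_id",
    "gene",
    "source",
    "labAnalysisId",
    "variant",
    "createdAt",
    "updatedAt",
)


def make_genotype_record(barcode_id, gene, lab_analysis_id, variant, timestamp=None):
    """Build one genotype record (hypothetical helper, not part of any service)."""
    # Assumption: timestamps are ISO 8601 strings; the thread does not specify a format.
    ts = (timestamp or datetime.now(timezone.utc)).isoformat()
    return {
        "user_data_type_id": barcode_id,
        "gene": gene.lower(),  # the example above uses a lowercase gene name
        "source": "akesogen:genotyping",
        "labAnalysisId": lab_analysis_id,
        "variant": variant,  # placeholder until the variant naming is settled
        "createdAt": ts,
        "updatedAt": ts,
    }


def missing_fields(record):
    """Return the required field names that are absent from a record."""
    return [name for name in REQUIRED_FIELDS if name not in record]


# Mock values only; none of these identifiers come from real data.
records = [
    make_genotype_record(
        "mock-barcode-001",
        "MTHFR",
        "mock-variantset-001",
        "VARIANT-NAME-TBD",
        timestamp=datetime(2000, 1, 1, tzinfo=timezone.utc),
    )
]
payload = json.dumps(records, indent=2)
```

`payload` is strict JSON (no comments or ellipses), so it can be written straight to a file once the real variant names are known.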
And here is the first iteration of the report generation JSON input: https://github.com/precisely/gene-panel-curation/blob/master/mock-user-data/report_input/MTHFR_C677T-WT_A1298C-heterozygous.json
It will be completed in the next two days (including the metadata fields described above), following design discussions tomorrow.
Initial thoughts:
Just updated the report JSON files as per format discussion with @aneilbaboo last week: https://github.com/precisely/gene-panel-curation/tree/master/mock-user-data/report_input
@aneilbaboo Regarding additional genes: GRIK3 doesn't have any phenotypes that I am aware of. Perhaps at this point the dev team can issue tickets against this repo for specific use cases, conditions, or unit tests that they need, and I can create the corresponding JSON files for them?
Issue moved to precisely/bioinformatics #9 via ZenHub
We’re going to try to stand up a report with sample user data for the current sprint (by Wednesday 21st).
We’ll need files representing a tiny subset of data that the genetics service will produce, for a handful of mock users. The goal is to provide data to exercise the mechanisms for report generation.
@visheshd ^