Closed by selinad 8 years ago
@jimmyzhen As we discussed this morning, some of the predictors (FATHMM, PROVEAN, SIFT and MutationTaster) return multiple results (one for each transcript). We need to show all the results per resource and not just one.
@wrightmw Can you confirm that we are not using the rankscore for the predictors (e.g. FATHMM, PROVEAN) but the multiple scores when they are available (see example below)?
fathmm: {
    pred: [
        "D",
        ".",
        ".",
        "D",
        "D",
        "D",
        "D",
        "D",
        "D"
    ],
    rankscore: 0.89706,
    score: [
        -2.45,
        null,
        null,
        -2.45,
        -2.57,
        -2.55,
        -2.45,
        -2.45,
        -2.46
    ]
},
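For discussion, here is a minimal Python sketch of how the per-transcript results in the example above could be paired up for display. The function names are hypothetical, and treating "." as "no prediction" is an assumption based only on this one example, not on the myvariant.info documentation.

```python
from typing import Any, List, Optional, Tuple


def as_list(value: Any) -> List[Any]:
    """Wrap a scalar in a list; return lists unchanged."""
    return value if isinstance(value, list) else [value]


def per_transcript_results(
    predictor: dict,
) -> List[Tuple[Optional[float], Optional[str]]]:
    """Pair each score with its prediction, one tuple per transcript."""
    scores = as_list(predictor.get("score"))
    preds = as_list(predictor.get("pred"))
    if len(scores) != len(preds):
        raise ValueError("score and pred lists differ in length")
    # Assumption: "." means no prediction is available for that transcript.
    return [(s, None if p == "." else p) for s, p in zip(scores, preds)]


# Data copied from the fathmm example in this ticket.
fathmm = {
    "pred": ["D", ".", ".", "D", "D", "D", "D", "D", "D"],
    "rankscore": 0.89706,
    "score": [-2.45, None, None, -2.45, -2.57, -2.55, -2.45, -2.45, -2.46],
}

rows = per_transcript_results(fathmm)
for score, pred in rows:
    print(score, pred)
```

The rankscore is deliberately ignored here, since the question above is whether to show the individual scores instead.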
Then what about predictors that have multiple scores but only a single prediction (see example below)?
polyphen2: {
    hdiv: {
        pred: "B",
        rankscore: 0.28728,
        score: [
            0.225,
            0.012,
            0.001
        ]
    },
    hvar: {
        pred: "B",
        rankscore: 0.26475,
        score: [
            0.071,
            0.008,
            0.024
        ]
    }
},
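One possible way to handle this case, sketched in Python: keep the single prediction as it is and list every score next to it, without guessing which score the prediction belongs to. This is only an illustration of the open question, not a decision. The function name and the output shape are hypothetical.

```python
from typing import Any, Dict


def summarize_predictor(predictor: Dict[str, Any]) -> Dict[str, Any]:
    """Return the single prediction alongside all available scores."""
    score = predictor.get("score")
    # Normalize a scalar score to a one-element list.
    scores = score if isinstance(score, list) else [score]
    return {"prediction": predictor.get("pred"), "scores": scores}


# Data copied from the polyphen2 example in this ticket.
polyphen2 = {
    "hdiv": {"pred": "B", "rankscore": 0.28728, "score": [0.225, 0.012, 0.001]},
    "hvar": {"pred": "B", "rankscore": 0.26475, "score": [0.071, 0.008, 0.024]},
}

summary = {name: summarize_predictor(sub) for name, sub in polyphen2.items()}
for name, result in summary.items():
    print(name, result["prediction"], result["scores"])
```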
@selinad,
In regard to your comments upon testing the instance, I have addressed the following as of today:
@jimmyzhen We have discussed the predictors data in person but I just wanted to also answer your questions in the ticket:
This spreadsheet contains the expected output for one variant (http://www.ncbi.nlm.nih.gov/clinvar/variation/55847/): CompTestSample.xlsx
*ights reviewed together with @jimmyzhen - looks great!
Included in last release (R7alpha1). Nice job and thanks for your hard work.
The most current wireframe for the Computational tab has been uploaded to Asana (Computational-tab_6-15-2016.2.pptx). This ticket will provide further specification of this tab page.
Computational Tools
There are 3 sets of computational information we need to pull in, listed below.
Protein Predictors: This information will be pulled in from myvariant.info. There are 17 fields with scores (listed on the wireframe). For each, we need the source, the value, and any call the source makes about it (e.g. pathogenic or deleterious).
Conservation Analysis: This information will come from myvariant.info as well. There are 5 fields that should be pulled in (listed on the wireframe). I believe we need to display the same fields as for the Protein Predictors, but we can confirm.
Splicing Predictors: @wrightmw and I need to figure out how to get this data. For starters, we need it to come from MaxEntScan and NNSplice.
Other Variants in Codon
For this, we need to search ClinVar for the genomic location of the variant plus 2 nt on either side of the variant. @wrightmw, do you know how to do this search? (I have also sent Steven a message.) The last column is for the ID of the variant from the source (e.g. ClinVar VariationID, CA ID, etc.). We also need to give users a way to add a variant to this table and be the source for it if necessary; this has been added to the wireframe and will involve storing a couple of curated fields. Note: Please see the paper by Steven Harrison in Asana (Using_ClinVar_Current_Protocols.pdf).
Repetitive Regions
For now, we are just going to link to the UCSC and Variation Viewer browsers using the chromosomal location of the variant and a range that encompasses 30 nt on either side of the variant. We will also link to ExAC at the chromosomal position for the variant (with the change specified).
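As a rough illustration of the ranges described in this ticket (30 nt on either side for the browser links, 2 nt on either side for the codon search), here is a small Python sketch. The function names, the 1-based inclusive coordinates, the "chrN:start-end" region format, and the example position are all assumptions for illustration. How each browser or ClinVar actually expects the range to be passed still needs to be confirmed.

```python
from typing import Tuple


def flanking_window(position: int, flank: int) -> Tuple[int, int]:
    """Return (start, end), 1-based inclusive, with `flank` nt on each side."""
    if position < 1 or flank < 0:
        raise ValueError("position must be >= 1 and flank must be >= 0")
    # Clamp the start so the window never runs off the start of the chromosome.
    return max(1, position - flank), position + flank


def region_string(chrom: str, position: int, flank: int) -> str:
    """Format the window as 'chr<chrom>:<start>-<end>' (assumed format)."""
    start, end = flanking_window(position, flank)
    return f"chr{chrom}:{start}-{end}"


# Hypothetical variant position, for illustration only.
browser_region = region_string("17", 1000, 30)  # range for the browser links
codon_window = flanking_window(1000, 2)         # range for the codon search
print(browser_region, codon_window)
```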
@wrightmw please review and fill in anything that I've missed or needs editing.