Open davemcg opened 8 years ago
OK, I'll have a look. You can always use SQL:
AND gene in ($genes)
where $genes can be created in Python with ",".join("'%s'" % g for g in genes)
Ah, that's cleaner
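For anyone who wants to script this, here is a minimal sketch of the join expression suggested above, wrapped in a helper. The name `sql_in_list` and the three-gene example list are made up for illustration. The quote-doubling step is an extra precaution not mentioned in the thread.

```python
def sql_in_list(genes):
    """Return a comma-separated, single-quoted list for a SQL IN (...) clause."""
    # Double any embedded single quote so each literal stays valid SQL.
    return ",".join("'%s'" % g.replace("'", "''") for g in genes)


# Hypothetical short list; substitute the full ACMG gene list as needed.
genes = ["BRCA1", "BRCA2", "TP53"]
where_clause = "gene IN (%s)" % sql_in_list(genes)
print(where_clause)  # gene IN ('BRCA1','BRCA2','TP53')
```

The resulting string can be assigned to a shell variable and interpolated into the query, the same way `$genes` is used above.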
acmg_genes=\'ACTA2\',\'ACTC1\',\'APC\',\'APOB\',\'BRCA1\',\'BRCA2\',\'CACNA1S\',\'COL3A1\',\'DSC2\',\'DSG2\',\'DSP\',\'FBN1\',\'GLA\',\'KCNH2\',\'KCNQ1\',\'LDLR\',\'LMNA\',\'MEN1\',\'MLH1\',\'MSH2\',\'MSH6\',\'MUTYH\',\'MYBPC3\',\'MYH11\',\'MYH7\',\'MYL2\',\'MYL3\',\'MYLK\',\'NF2\',\'PCSK9\',\'PKP2\',\'PMS2\',\'PRKAG2\',\'PTEN\',\'RB1\',\'RET\',\'RYR1\',\'RYR2\',\'SCN5A\',\'SDHAF2\',\'SDHB\',\'SDHC\',\'SDHD\',\'SMAD3\',\'STK11\',\'TGFBR1\',\'TGFBR2\',\'TMEM43\',\'TNNI3\',\'TNNT2\',\'TP53\',\'TPM1\',\'TSC1\',\'TSC2\',\'VHL\',\'WT1\'
family_id='your_fam_id'
gemini query --header -q "SELECT chrom, start, gene, clinvar_sig, impact, impact_severity, max_aaf_all, (gts).(family_id=='$family_id') FROM variants WHERE \
(gene IN ($acmg_genes) ) \
AND \
(clinvar_sig LIKE '%pathogenic%' OR impact_severity='HIGH') \
AND \
max_aaf_all < 0.005" \
--gt-filter "(gt_types).(family_id=='$family_id').(!=HOM_REF).(count>=1)" \
gemini.db | less -S
It is recommended that clinical centers check ACMG's incidental findings list (http://www.ncbi.nlm.nih.gov/clinvar/docs/acmg/). It would be useful if Gemini facilitated this kind of query, either by providing a "Built-in analysis tool" for it or by adding a column to the variants table indicating whether a gene is on the list (though only 56 genes currently have this designation).
I'm having some trouble figuring out what people actually do to parse their variants for this purpose, but I think the following query makes sense as a filter for the genetic counselor.
Pseudo command:
Here's an actual functioning command, which is trio-based since that's how most of our exomes are set up:
(I did a quick check with sort and uniq, and my database (0.18) recognizes all 56 genes.)
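The same check can be scripted. This is only a sketch, not the sort/uniq approach used above. It assumes the GEMINI database is opened directly as a SQLite file and that the `variants` table has the `gene` column used in the query. The helper name `missing_genes` and the in-memory demo table are made up for illustration.

```python
import sqlite3


def missing_genes(conn, genes):
    """Return the genes from `genes` that never appear in variants.gene."""
    # One "?" placeholder per gene keeps the values out of the SQL text.
    placeholders = ",".join("?" for _ in genes)
    sql = "SELECT DISTINCT gene FROM variants WHERE gene IN (%s)" % placeholders
    found = {row[0] for row in conn.execute(sql, list(genes))}
    return sorted(set(genes) - found)


# Demo on an in-memory stand-in for a real database file.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE variants (gene TEXT)")
conn.executemany(
    "INSERT INTO variants VALUES (?)",
    [("BRCA1",), ("TP53",), ("TP53",)],
)
print(missing_genes(conn, ["BRCA1", "TP53", "VHL"]))  # ['VHL']
```

For a real run, replace the in-memory connection with `sqlite3.connect("gemini.db")` and pass the full gene list. An empty result means every gene was found.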
If you just want the ACMG list (parsed from http://www.ncbi.nlm.nih.gov/clinvar/docs/acmg/), here it is: