ding-lab / CharGer

Characterization of Germline variants
https://ding-lab.github.io/CharGer/
GNU General Public License v3.0

I used CharGer. But I got different results. #20

Closed oghzzang closed 5 years ago

oghzzang commented 5 years ago

Hello. I'm a CharGer user, and I appreciate your helpful program.

In this paper (Cell, 2018, 173, 355-370), this BRCA2 variant (Supplement xlsx file 2A, 13:g.32890660A>G) is listed as a pathogenic, rare variant (CharGer score: 11; PM2, PM5, PS1). In my results, however, this variant's CharGer score is only 2 (PM2 only), so it is not called pathogenic or likely pathogenic.

Likewise, in your Supplementary table, the CharGer score of the ATM variant (11:g.108129749C>T) is 17 (PS1 + PVS1 + PM2), but in my results its score is only 6 (PM2 + PSC1).

Could you point out what I am doing wrong? I want to get the same results as you. I think the difference comes from "emptyRemoved_20160428_pathogenic_variants_HGVSg_VEP.vcf". This file has variants only on the following chromosomes: chr10, chr13, chr17, chr5.
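One way to confirm which chromosomes a VCF like this one covers is to count its records per chromosome. This is only a minimal sketch, assuming a plain-text, tab-separated VCF whose header lines start with "#"; the function name list_chroms is hypothetical and not part of CharGer.

```shell
# Hypothetical helper (not part of CharGer): count records per chromosome.
# Assumes an uncompressed, tab-separated VCF; header lines start with '#'.
list_chroms() {
    grep -v '^#' "$1" | cut -f1 | sort | uniq -c
}

# Example call, using the variable name from the script in this post:
# list_chroms "$mmVariants"
```

Each output line is a record count followed by a chromosome name, so chromosomes missing from the file simply do not appear.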

I added my scripts.

Thanks

Oh.


cf. My input VCF: VarScan2 VCF, annotated with VEP (v94, reference: GRCh37, ExAC nonTCGA version r1)

mmGenes=$PanCanAtlasData/20160301_Rahman_KJ_KH_gene_table_CharGer.txt
mmVariants=$PanCanAtlasData/emptyRemoved_20160428_pathogenic_variants_HGVSg_VEP.vcf
hotspot=$PanCanAtlasData/MC3.noHypers.mericUnspecified.d10.r20.v114.clusters
clinvar=$PanCanAtlasData/clinvar_alleles.single.b37.tsv.gz
rareThreshold="0.01" # 1% threshold
commonThreshold="0.05" # 5% threshold

$bin/charger --include-vcf-details \
  -f $input_dir/${Sample}.$Pair.$VC.$Class.vep.vcf \
  -o $Output_dir/${Sample}.$Pair.$VC.$Class.vep.Hg19.CharGer.rare0.01.common0.05.tsv \
  -O \
  -D \
  -g ${mmGenes} \
  -z ${mmVariants} \
  -H ${hotspot} \
  -l \
  --rare-threshold $rareThreshold \
  --common-threshold $commonThreshold \
  --mac-clinvar-tsv ${clinvar}

fernanda-rodrigues commented 5 years ago

Hi @stellaoh, I apologize for the late response to this issue. One reason your results may differ is the rare and common thresholds you are using. For the paper you're referring to, a --rare-threshold of 0.0005 (0.05%) and a --common-threshold of 0.005 (0.5%) were used. Additionally, you may be using a different version of ClinVar than the one used then. As databases are updated, these classifications may change a bit. However, I cannot be sure this is the underlying cause, since I am not looking at your files.
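As a rough illustration, the threshold variables from the script above could be set to the paper's values like this. It is only a sketch: the variable names rareThreshold and commonThreshold come from the posted script, threshold_args is a hypothetical name introduced here, and matching the thresholds alone may not reproduce the published scores.

```shell
# Thresholds reported for the paper, written as fractions rather than percentages
rareThreshold="0.0005"    # 0.05%
commonThreshold="0.005"   # 0.5%

# Hypothetical variable holding only the two threshold flags;
# every other argument stays as in the original charger command.
threshold_args="--rare-threshold $rareThreshold --common-threshold $commonThreshold"
echo "$threshold_args"
```

The output file name in the posted command also encodes the thresholds (rare0.01.common0.05), so it would need the same update to stay accurate.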

Please do not hesitate to ask for help. For more questions specific to the PanCan Germline study, you can also contact Dr. Kuan Huang (kuan-lin.huang@mssm.edu). He was the student responsible for the analysis.

fernanda-rodrigues commented 5 years ago

I am closing this issue for now. Please do not hesitate to ask any further questions you may have.