WGLab / InterVar

A bioinformatics software tool for clinical interpretation of genetic variants by the 2015 ACMG-AMP guideline

PM1 scoring #25

Open rjsicko opened 6 years ago

rjsicko commented 6 years ago

I'm trying to figure out why PM1 is scored as 1 for a variant. The variant is: (hg19) 11:36615436C>T - RAG2 NM_001243786:exon3:c.G283A:p.G95R

I checked the PM1 check function:

def check_PM1(line,Funcanno_flgs,Allels_flgs,domain_benign_dict):
    '''
    Located in a mutational hot spot and/or critical and well-established functional domain (e.g., active site of
    an enzyme) without benign variation
    '''
    PM1=0
    PM1_t1=0
    PM1_t2=0
    cls=line.split('\t')
    funcs_tmp=["missense","nonsynony"]
    line_tmp=cls[Funcanno_flgs['Func.refGene']]+" "+cls[Funcanno_flgs['ExonicFunc.refGene']]
    for fc in funcs_tmp:
        if line_tmp.find(fc)>=0 :
            PM1_t1=1;
        # need to wait to check whether in hot spot  or  functional domain/without benign variation
    if cls[Funcanno_flgs['Interpro_domain']]!= '.' :
        keys_tmp2=cls[Allels_flgs['Chr']]+"_"+cls[Funcanno_flgs['Gene']]+": "+cls[Funcanno_flgs['Interpro_domain']]
        try:
            if domain_benign_dict[keys_tmp2] =="1":
                PM1_t2=0
        except KeyError:
            PM1_t2=1
        else:
            pass

    if PM1_t1==1 and PM1_t2==1 :
        PM1=1

    return(PM1)

If I'm interpreting the code correctly, PM1_t1 checks that there is at least one exonic missense annotation; then, if 'Interpro_domain' has any annotation, PM1_t2 checks that the annotated domain is not on the list of domains with benign variants.

The example variant has the following domain in PM1_domains_with_benigns.hg19 : AG2 Galactose oxidase/kelch, beta-propeller

Interpro_domains annotated by annovar:  Galactose oxidase/kelch, beta-propeller;Kelch-type beta propeller

Is it the Kelch-type beta propeller domain that is causing PM1_t2 to be 1? If so, I'm not sure it should... https://www.ebi.ac.uk/interpro/protein/P55895

Both domains overlap almost completely. Can you provide any information on how you generate PM1_domains_with_benigns.hg19?
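To make the question concrete, here is a minimal sketch of the lookup as I understand it. The one-entry dictionary is a hypothetical stand-in for PM1_domains_with_benigns.hg19, and I am assuming its key has the same form as keys_tmp2 in check_PM1; pm1_t2_for is just a helper I wrote for illustration, not part of InterVar.

```python
# Hypothetical stand-in for PM1_domains_with_benigns.hg19 (assumed key format:
# Chr + "_" + Gene + ": " + Interpro_domain, as built for keys_tmp2 in check_PM1).
domain_benign_dict = {"11_RAG2: Galactose oxidase/kelch, beta-propeller": "1"}


def pm1_t2_for(chrom, gene, interpro_domain, benign_dict):
    """Mirror the PM1_t2 branch of check_PM1 for one annotation string."""
    if interpro_domain == ".":
        return 0  # no domain annotation: PM1_t2 keeps its initial value
    key = chrom + "_" + gene + ": " + interpro_domain
    try:
        if benign_dict[key] == "1":
            return 0  # domain is listed as having benign variants
    except KeyError:
        return 1  # key not in the benign list
    return 0


single = pm1_t2_for("11", "RAG2",
                    "Galactose oxidase/kelch, beta-propeller",
                    domain_benign_dict)
combined = pm1_t2_for("11", "RAG2",
                      "Galactose oxidase/kelch, beta-propeller;Kelch-type beta propeller",
                      domain_benign_dict)
print(single, combined)  # prints: 0 1
```

If that assumption holds, the combined annotation is looked up as one string, so it misses the benign entry even though its first component is listed.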

Thanks, Bob

quanliustc commented 6 years ago

The issue comes from ANNOVAR's domain names. For RAG2, there are three sets of names:

9 Galactose oxidase/kelch, beta-propeller
14 Galactose oxidase/kelch, beta-propeller;Kelch-type beta propeller
5 Recombination activating protein 2, PHD domain;Zinc finger, FYVE/PHD-type

When we built the PM1 dataset, the pipeline treated the first and second as different domains, but they are actually the same. So I think your point is right, and we should remove the PM1 benign for RAG2. Thanks!
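One possible way to avoid this kind of mismatch at lookup time, sketched under assumptions: split the annotation on ";" and check each component name as well as the full string. The function name domain_has_benign and the example dictionary are hypothetical and not part of InterVar; the key format is assumed to match the one built in check_PM1.

```python
def domain_has_benign(chrom, gene, interpro_domain, benign_dict):
    """Return True if the full annotation or any ';'-separated component
    is flagged "1" in benign_dict (hypothetical helper, not InterVar code)."""
    candidates = [interpro_domain] + interpro_domain.split(";")
    for name in candidates:
        name = name.strip()
        if not name or name == ".":
            continue  # skip empty pieces and the "no annotation" marker
        if benign_dict.get(chrom + "_" + gene + ": " + name) == "1":
            return True
    return False


# Hypothetical benign list containing only the single-name form.
benign = {"11_RAG2: Galactose oxidase/kelch, beta-propeller": "1"}

hit = domain_has_benign(
    "11", "RAG2",
    "Galactose oxidase/kelch, beta-propeller;Kelch-type beta propeller",
    benign)
miss = domain_has_benign(
    "11", "RAG2",
    "Recombination activating protein 2, PHD domain;Zinc finger, FYVE/PHD-type",
    benign)
print(hit, miss)  # prints: True False
```

Whether a match on any one component should be enough to treat the whole annotation as benign is a design choice; this sketch takes the conservative route and lets any match block PM1.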

rjsicko commented 6 years ago

Is PM1_domains_with_benigns.hg19 a list of domains with benign variants in reference population databases? If so, we should add the second annovar annotation (14 Galactose oxidase/kelch, beta-propeller;Kelch-type beta propeller) to that file, right?

Thanks, Bob

quanliustc commented 6 years ago

Yes. This file lists the domains with benign or common variants (MAF > 5%).

rjsicko commented 6 years ago

Then I think we need to add the second annovar annotation 14 Galactose oxidase/kelch, beta-propeller;Kelch-type beta propeller to the benign file.

Also, I think we need more checks for PM1. Currently it checks that the variant overlaps an Interpro domain annotated by annovar and then that the domain is not on the benign list, right? I think we're missing mutational hotspots. From the original guidelines publication:

PM1 Mutational hot spot and/or critical and well-established functional domain Certain protein domains are known to be critical to protein function and all missense variants identified to date in these domains have been shown to be pathogenic. These domains must also lack benign variants. In addition, mutational hotspots in less well characterized regions of genes are reported in which pathogenic variants in one or several nearby residues have been observed with greater frequency. Either evidence can be considered moderate evidence of pathogenicity.

Elaborated on in Performance of ACMG-AMP Variant-Interpretation Guidelines among Nine Laboratories in the Clinical Sequencing Exploratory Research Consortium (supplemental info):

One source of discrepancy identified that the PM1 rule (variant is located in a mutational hot spot and/or a critical and well-established functional domain), should only be invoked for missense variants, not truncations, and it should only be applied if the variant occurs in domains that are devoid of benign variation as described in more detail in the ACMG/AMP guideline. Defining ‘well-established’ also led to discordance in PM1 rule usage. The group defined a mutational hot spot as a location where there are multiple changes in the same domain that are known to be pathogenic; however, there was still disagreement regarding how many pathogenic variants constitute ‘multiple’, how well-defined the domain must be and how close other benign variants can be to the domain and variant in question

So we could add a check for known pathogenic variants from PS1.AA.change.patho.hg19, but I'm not sure how to handle checking the frequency of the pathogenic variants.
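A rough sketch of what such a hotspot check might look like, under stated assumptions: the records are hypothetical (gene, amino-acid position) pairs that would first have to be parsed out of PS1.AA.change.patho.hg19 (I have not assumed anything about that file's layout), and the window and min_count defaults are arbitrary placeholders, since the quoted text notes there is no agreed definition of 'multiple'.

```python
from collections import defaultdict


def build_patho_index(records):
    """Group known pathogenic amino-acid positions by gene.

    records: iterable of (gene, aa_position) pairs (hypothetical format).
    """
    index = defaultdict(set)
    for gene, pos in records:
        index[gene].add(pos)
    return index


def in_hotspot(index, gene, aa_pos, window=5, min_count=3):
    """Return True if at least min_count distinct pathogenic positions
    lie within +/- window residues of aa_pos in the same gene.

    window and min_count are placeholder values, not published thresholds.
    """
    nearby = [p for p in index.get(gene, ()) if abs(p - aa_pos) <= window]
    return len(nearby) >= min_count


# Made-up positions for illustration only; not real variant data.
records = [("GENE1", 93), ("GENE1", 95), ("GENE1", 98), ("GENE1", 300)]
index = build_patho_index(records)

print(in_hotspot(index, "GENE1", 95))   # prints: True  (93, 95, 98 are within 5)
print(in_hotspot(index, "GENE1", 200))  # prints: False
```

This counts distinct positions rather than how often each pathogenic variant has been observed, so it does not address the frequency question.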