WGLab / InterVar

A bioinformatics software tool for clinical interpretation of genetic variants by the 2015 ACMG-AMP guideline

Reflecting settings in config.ini #60

Open fk506cni opened 3 years ago

fk506cni commented 3 years ago

Dear authors.

I know that InterVar lets you specify which ANNOVAR DBs to use internally by editing the config.ini file, but in practice this only changes which files are downloaded. I found that when ANNOVAR performs the annotation, the DBs specified as database_names are not actually used at all.

This seems to be because the check_annovar_result function in the Intervar.py script does not read these settings. After line 535, the code is:

if inputft.lower() == 'avinput' :
    cmd="perl "+paras['table_annovar']+" "+paras['inputfile']+" "+paras['database_locat']+" -buildver "+paras['buildver']+" -remove -out "+ paras['outfile']+" -protocol refGene,esp6500siv2_all,1000g2015aug_all,avsnp147,dbnsfp33a,clinvar_20190305,gnomad_genome,dbscsnv11,dbnsfp31a_interpro,rmsk,ensGene,knownGene  -operation  g,f,f,f,f,f,f,f,f,r,g,g   -nastring ."+annovar_options
    print("%s" %cmd)
    os.system(cmd)
if inputft.lower() == 'vcf' :
    cmd="perl "+paras['table_annovar']+" "+paras['inputfile']+".avinput "+paras['database_locat']+" -buildver "+paras['buildver']+" -remove -out "+ paras['outfile']+" -protocol refGene,esp6500siv2_all,1000g2015aug_all,avsnp147,dbnsfp33a,clinvar_20190305,gnomad_genome,dbscsnv11,dbnsfp31a_interpro,rmsk,ensGene,knownGene   -operation  g,f,f,f,f,f,f,f,f,r,g,g   -nastring ."+annovar_options
    print("%s" %cmd)
    os.system(cmd)
if inputft.lower() == 'vcf_m' :
    for f in glob.iglob(paras['outfile']+"*.avinput"): 
        print("INFO: Begin to annotate sample file of %s ...." %(f))
        new_outfile=re.sub(".avinput","",f)
        cmd="perl "+paras['table_annovar']+" "+f+" "+paras['database_locat']+" -buildver "+paras['buildver']+" -remove -out "+ new_outfile +" -protocol refGene,esp6500siv2_all,1000g2015aug_all,avsnp147,dbnsfp33a,clinvar_20190305,gnomad_genome,dbscsnv11,dbnsfp31a_interpro,rmsk,ensGene,knownGene   -operation  g,f,f,f,f,f,f,f,f,r,g,g   -nastring ."+annovar_options
        print("%s" %cmd)
        os.system(cmd)

The protocols, database names, and other variables stored in "paras" are never used when building these commands. If the default DBs were downloaded the first time, InterVar appears to run as intended, with no error messages, even though it is still using the default DBs. Isn't this a bit misleading?
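To illustrate what I mean, here is a minimal sketch of how the command could be built from the config values instead of the hard-coded list quoted above. It is not a patch to InterVar. The function name build_annovar_cmd and the DEFAULT_OPERATIONS table are hypothetical, and I am assuming that database_names is stored in paras as a whitespace-separated string. The operation letters are copied from the hard-coded -protocol/-operation pair, with any unlisted DB treated as filter-based, which may be wrong for some DBs. Downstream parsing may also still expect the default column headers.

```python
import shlex

# Hypothetical table: database name -> table_annovar operation letter.
# Values mirror the hard-coded -protocol/-operation pair quoted above;
# any database not listed here is assumed to be filter-based ("f").
DEFAULT_OPERATIONS = {
    "refGene": "g",
    "ensGene": "g",
    "knownGene": "g",
    "rmsk": "r",
}


def build_annovar_cmd(paras, input_path, out_prefix, annovar_options=""):
    """Build a table_annovar command line from the values held in paras.

    Assumes paras['database_names'] is a whitespace-separated string.
    """
    names = paras["database_names"].split()
    operations = [DEFAULT_OPERATIONS.get(name, "f") for name in names]
    args = [
        "perl", paras["table_annovar"], input_path, paras["database_locat"],
        "-buildver", paras["buildver"],
        "-remove",
        "-out", out_prefix,
        "-protocol", ",".join(names),
        "-operation", ",".join(operations),
        "-nastring", ".",
    ]
    # Quote each argument so that paths containing spaces survive the shell.
    cmd = " ".join(shlex.quote(arg) for arg in args)
    extra = annovar_options.strip()
    return cmd + (" " + extra if extra else "")


if __name__ == "__main__":
    example_paras = {
        "table_annovar": "table_annovar.pl",
        "database_locat": "humandb",
        "buildver": "hg19",
        "database_names": "refGene clinvar_20190305 rmsk",
    }
    print(build_annovar_cmd(example_paras, "sample.avinput", "sample_out"))
```

With something like this, the three branches (avinput, vcf, vcf_m) would differ only in the input path and output prefix they pass in.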

Is there an official way to change the DBs used without rewriting Intervar.py? I also wonder how InterVar works as a web service: does it also use the old annotation databases?

Thank you.

quanliustc commented 3 years ago

Currently, InterVar hard-codes the ANNOVAR DBs, because the ANNOVAR output sometimes changes the column headers for updated DBs. A user-defined DB is therefore not a safe idea, as most users do not know which column should be used in the analysis. Users can replace the DB names in InterVar if they really know the DB name and its column header. Alternatively, I could add a column header correspondence list, so that the user would need to specify two things: the DB name and the name of its column header.
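A rough sketch of what such a correspondence list could look like is below. This is only an illustration and not existing InterVar code. The function name resolve_columns is hypothetical, and the header strings in the example are placeholders that have not been checked against any particular ANNOVAR DB release. The idea is to fail early, with a clear message, when the annotated file lacks a header the user asked for.

```python
def resolve_columns(header, db_to_column):
    """Map each database name to the index of its column in the header.

    header       -- list of column names from the annotated output file
    db_to_column -- user-supplied dict: database name -> column header
    Raises ValueError if any requested column header is not present.
    """
    index = {name: i for i, name in enumerate(header)}
    missing = {db: col for db, col in db_to_column.items() if col not in index}
    if missing:
        details = ", ".join(
            "%s (expected column %r)" % (db, col)
            for db, col in sorted(missing.items())
        )
        raise ValueError("Column header not found for: " + details)
    return {db: index[col] for db, col in db_to_column.items()}


if __name__ == "__main__":
    # Placeholder header line and correspondence list, for illustration only.
    header = "Chr\tStart\tEnd\tRef\tAlt\tFunc.refGene\tCLNSIG".split("\t")
    user_map = {"refGene": "Func.refGene", "clinvar_20190305": "CLNSIG"}
    print(resolve_columns(header, user_map))
```

If a DB update renames a column, the user would only need to edit the correspondence list, and a mismatch would raise an error instead of silently reading the wrong column.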

fk506cni commented 3 years ago

Thank you, Dr. Qian Liu. I understand your reasoning.