WGLab / CancerVar

Clinical interpretation of somatic mutations in cancer

difference between CancerVar and VIC? #6

Open argoat opened 3 years ago

quanliustc commented 3 years ago

There are some differences between CancerVar and VIC. VIC was my first version. CancerVar has an optimized scoring strategy, and its datasets are larger and more up to date.

argoat commented 3 years ago

Thanks. I tried both CancerVar and VIC on 10 input variants, and the run times differed greatly: CancerVar took 42 minutes while VIC took only 6 minutes. The same reference datasets were used for the ANNOVAR step. The results are pasted below.

| Chr | Start | End | Ref | Alt | Ref.Gene | ClinVar | CancerVar | CancerVar evidence | VIC | VIC evidence |
|---|---|---|---|---|---|---|---|---|---|---|
| chr2 | 47601106 | 47601106 | T | C | EPCAM | Benign/Likely_benign | 0#Benign/Likely_benign | EVS=[0, 0, 0, 1, 0, 0, 0, -1, 1, -1, 0, 0] | Uncertain significance | EVS=[0, 1, NONE, NONE, 0, 1, 1, 0, 0, NONE] |
| chr2 | 48027958 | 48027958 | G | T | MSH6 | Pathogenic | 7#Uncertain_significance | EVS=[0, 1, 0, 1, 0, 0, 0, 1, 2, 0, 1, 1] | Uncertain significance | EVS=[0, 1, NONE, NONE, 1, 2, 2, 1, 2, NONE] |
| chr2 | 48027979 | 48027979 | G | A | MSH6 | Conflicting_interpretations_of_pathogenicity | 6#Uncertain_significance | EVS=[0, 1, 0, 1, 0, 0, 1, 1, 1, -1, 1, 1] | Uncertain significance | EVS=[0, 1, NONE, NONE, 1, 0, 1, 0, 2, NONE] |
| chr2 | 48033753 | 48033753 | G | T | MSH6 | Pathogenic | 7#Uncertain_significance | EVS=[0, 1, 0, 1, 0, 0, 0, 1, 2, 0, 1, 1] | Uncertain significance | EVS=[0, 1, NONE, NONE, 1, 2, 2, 1, 2, NONE] |
| chr3 | 1.79E+08 | 1.79E+08 | A | G | PIK3CA | Pathogenic | 7#Uncertain_significance | EVS=[1, 0, 1, 1, 0, 0, 0, 1, 2, -1, 1, 1] | Potential clinical significance | EVS=[1, 1, NONE, NONE, 1, 2, 2, 0, 2, NONE] |
| chr10 | 89720852 | 89720852 | C | T | PTEN | Pathogenic | 8#Likely_pathogenic | EVS=[1, 1, 0, 1, 0, 0, 0, 1, 2, 0, 1, 1] | Potential clinical significance | EVS=[1, 1, NONE, NONE, 1, 2, 2, 1, 2, NONE] |
| chr12 | 25380283 | 25380283 | C | A | KRAS | Likely_pathogenic | 8#Likely_pathogenic | EVS=[1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1] | Potential clinical significance | EVS=[1, 1, NONE, NONE, 1, 2, 1, 2, 2, NONE] |
| chr12 | 1.33E+08 | 1.33E+08 | A | G | POLE | UNK | 6#Uncertain_significance | EVS=[1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1] | Uncertain significance | EVS=[0, 1, NONE, NONE, 1, 0, 1, 2, 2, NONE] |
| chr17 | 7574003 | 7574003 | G | A | TP53 | Pathogenic | 8#Likely_pathogenic | EVS=[1, 0, 1, 1, 0, 0, 0, 1, 2, 0, 1, 1] | Potential clinical significance | EVS=[1, 1, NONE, NONE, 1, 2, 2, 1, 2, NONE] |
| chr17 | 7578550 | 7578550 | G | T | TP53 | Uncertain_significance | 9#Likely_pathogenic | EVS=[1, 0, 1, 1, 0, 0, 0, 0, 2, 2, 1, 1] | Potential clinical significance | EVS=[1, 1, NONE, NONE, 1, 0, 2, 2, 2, NONE] |
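
To compare the two tools row by row, it can help to split each result into its parts. The following is a minimal sketch, assuming the CancerVar field has the form `<score>#<label> EVS=[...]` and the VIC field has the form `<label> EVS=[...]` with `NONE` marking unset entries, as in the results pasted above. The function names `parse_cancervar` and `parse_vic` are hypothetical helpers and are not part of either tool.

```python
import ast
import re

# Assumed shape of a CancerVar result, taken from the pasted rows:
#   "7#Uncertain_significance EVS=[0, 1, 0, ...]"
CANCERVAR_RE = re.compile(r"^(?P<score>-?\d+)#(?P<label>\S+)\s+EVS=(?P<evs>\[.*\])$")


def parse_cancervar(field):
    """Return (score, label, evidence list) for a CancerVar result string."""
    match = CANCERVAR_RE.match(field.strip())
    if match is None:
        raise ValueError("unexpected CancerVar field: %r" % field)
    evidence = ast.literal_eval(match.group("evs"))
    return int(match.group("score")), match.group("label"), evidence


def parse_vic(field):
    """Return (label, evidence list) for a VIC result string.

    NONE entries are mapped to Python None so they stay distinct from 0.
    """
    label, sep, evs = field.strip().partition(" EVS=")
    if not sep:
        raise ValueError("unexpected VIC field: %r" % field)
    tokens = evs.strip("[]").split(",")
    evidence = [None if tok.strip() == "NONE" else int(tok) for tok in tokens]
    return label, evidence
```

For example, `parse_cancervar("7#Uncertain_significance EVS=[0, 1, 0, 1, 0, 0, 0, 1, 2, 0, 1, 1]")` yields the score 7, the label, and a 12-element evidence list, which makes it straightforward to tabulate where the CancerVar and VIC calls diverge from the ClinVar column.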

Additionally, I noticed that VIC was last updated 3 months ago and CancerVar 6 months ago. Are both programs under active development? Which one is preferred?

quanliustc commented 3 years ago

That's not normal; I'm not sure about your running environment. Normally on Linux, CancerVar can process 10 variants in under 3 minutes in a single thread, so 40 minutes does not seem right. For example, I am using Python 2.7 and one Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz.

argoat commented 3 years ago

I'm using Python 2.7.5 and 32 Intel(R) Xeon(R) Silver 4110 CPU @ 2.10GHz.

Running info:
Notice: Your command of CancerVar is ['CancerVar.py', '-c', 'config.ini']
Warning: Your specified evidence file [ None ], the analysis will take your additional evidence.
INFO: The options are {'table_annovar': '/data/apps/annovar-20180426/table_annovar.pl', 'exclude_snps': 'cancervardb/ext.variants.hg19', 'annotate_variation': '/data/apps/annovar-20180426/annotate_variation.pl', 'current_version': 'CancerVar_20200119', 'evidence_file': 'None', 'public_dev': 'https://github.com/WGLab/CancerVar/releases', 'otherinfo': 'TRUE', 'database_names': 'refGene esp6500siv2_all 1000g2015aug avsnp150 dbnsfp35a clinvar_20200316 exac03 dbscsnv11 dbnsfp31a_interpro ensGene knownGene cosmic89_coding icgc21 gnomad211_genome', 'mim_pheno': 'cancervardb/mim_pheno.txt', 'cancer_pathway': 'cancervardb/cancers_genes.list_kegg.txt', 'cancers_types': 'cancervardb/cancervar.cancer.types', 'buildver': 'hg19', 'onetranscript': 'FALSE', 'mim2gene': 'cancervardb/mim2gene.txt', 'orpha': 'cancervardb/orpha.txt', 'inputfile_type': 'AVinput', 'knowngenecanonical': 'cancervardb/knownGeneCanonical.txt', 'cancers_genes': 'cancervardb/cancer_census.genes', 'convert2annovar': '/data/apps/annovar-20180426/convert2annovar.pl', 'database_locat': '/data/server_data/database/humandb', 'database_cancervar': 'cancervardb', 'lof_genes': 'cancervardb/LOF.genes.exac_me_cancers', 'cancervar_markers': 'cancervardb/cancervar.out.txt', 'outfile': 'testp1/myanno', 'disorder_cutoff': '0.01', 'mim_orpha': 'cancervardb/mim_orpha.txt', 'inputfile': 'testp1/testp1.avinput'}
Warning: the folder of /data/server_data/database/humandb is already created!
perl /data/apps/annovar-20180426/table_annovar.pl testp1/testp1.avinput /data/server_data/database/humandb -buildver hg19 -remove -out testp1/myanno -protocol refGene,ensGene,knownGene,esp6500siv2_all,1000g2015aug_all,exac03,avsnp150,dbnsfp35a,dbscsnv11,dbnsfp31a_interpro,clinvar_20200316,cosmic89_coding,icgc21,gnomad211_genome -operation g,g,g,f,f,f,f,f,f,f,f,f,f,f -nastring . --otherinfo

argoat commented 3 years ago

Most of the time was spent on the gnomad211_genome database during the ANNOVAR step.
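
One way to confirm where the time goes is to rerun the same `table_annovar.pl` command with and without the suspect database and compare wall-clock times. The sketch below rebuilds the command from the log above, keeping each protocol paired with its operation so the two lists stay aligned when an entry is dropped. The paths are copied from the log; the output prefix `testp1/timing_test` and the helper names `build_command` and `time_command` are hypothetical. This is meant only as a timing diagnostic: CancerVar itself may expect the gnomAD columns, so an annotation produced without them should not be assumed to be valid input for interpretation.

```python
import subprocess
import time

# Paths copied from the run log in this thread.
TABLE_ANNOVAR = "/data/apps/annovar-20180426/table_annovar.pl"
HUMANDB = "/data/server_data/database/humandb"
AVINPUT = "testp1/testp1.avinput"

# Protocol/operation pairs in the order shown in the logged command.
PROTOCOLS = [
    ("refGene", "g"), ("ensGene", "g"), ("knownGene", "g"),
    ("esp6500siv2_all", "f"), ("1000g2015aug_all", "f"), ("exac03", "f"),
    ("avsnp150", "f"), ("dbnsfp35a", "f"), ("dbscsnv11", "f"),
    ("dbnsfp31a_interpro", "f"), ("clinvar_20200316", "f"),
    ("cosmic89_coding", "f"), ("icgc21", "f"), ("gnomad211_genome", "f"),
]


def build_command(skip=(), out_prefix="testp1/timing_test"):
    """Build the table_annovar.pl argument list, omitting protocols in `skip`."""
    kept = [(name, op) for name, op in PROTOCOLS if name not in skip]
    return [
        "perl", TABLE_ANNOVAR, AVINPUT, HUMANDB,
        "-buildver", "hg19", "-remove", "-out", out_prefix,
        "-protocol", ",".join(name for name, _ in kept),
        "-operation", ",".join(op for _, op in kept),
        "-nastring", ".", "--otherinfo",
    ]


def time_command(cmd):
    """Run the command and return the elapsed wall-clock time in seconds."""
    start = time.time()
    subprocess.check_call(cmd)
    return time.time() - start
```

Calling `time_command(build_command())` and then `time_command(build_command(skip=("gnomad211_genome",)))` on the same machine would show how much of the 42 minutes is attributable to that one database.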