Open argoat opened 3 years ago
Thanks. I tried both CancerVar and VIC on 10 input variants. Run time varied greatly: CancerVar took 42 minutes while VIC took only 6 minutes. The same reference datasets were used for the ANNOVAR step in both runs. Results are pasted below.
Chr | Start | End | Ref | Alt | Ref.Gene | ClinVar | CancerVar and Evidence | VIC and Evidence |
---|---|---|---|---|---|---|---|---|
chr2 | 47601106 | 47601106 | T | C | EPCAM | Benign/Likely_benign | 0#Benign/Likely_benign EVS=[0, 0, 0, 1, 0, 0, 0, -1, 1, -1, 0, 0] | Uncertain significance EVS=[0, 1, NONE, NONE, 0, 1, 1, 0, 0, NONE] |
chr2 | 48027958 | 48027958 | G | T | MSH6 | Pathogenic | 7#Uncertain_significance EVS=[0, 1, 0, 1, 0, 0, 0, 1, 2, 0, 1, 1] | Uncertain significance EVS=[0, 1, NONE, NONE, 1, 2, 2, 1, 2, NONE] |
chr2 | 48027979 | 48027979 | G | A | MSH6 | Conflicting_interpretations_of_pathogenicity | 6#Uncertain_significance EVS=[0, 1, 0, 1, 0, 0, 1, 1, 1, -1, 1, 1] | Uncertain significance EVS=[0, 1, NONE, NONE, 1, 0, 1, 0, 2, NONE] |
chr2 | 48033753 | 48033753 | G | T | MSH6 | Pathogenic | 7#Uncertain_significance EVS=[0, 1, 0, 1, 0, 0, 0, 1, 2, 0, 1, 1] | Uncertain significance EVS=[0, 1, NONE, NONE, 1, 2, 2, 1, 2, NONE] |
chr3 | 1.79E+08 | 1.79E+08 | A | G | PIK3CA | Pathogenic | 7#Uncertain_significance EVS=[1, 0, 1, 1, 0, 0, 0, 1, 2, -1, 1, 1] | Potential clinical significance EVS=[1, 1, NONE, NONE, 1, 2, 2, 0, 2, NONE] |
chr10 | 89720852 | 89720852 | C | T | PTEN | Pathogenic | 8#Likely_pathogenic EVS=[1, 1, 0, 1, 0, 0, 0, 1, 2, 0, 1, 1] | Potential clinical significance EVS=[1, 1, NONE, NONE, 1, 2, 2, 1, 2, NONE] |
chr12 | 25380283 | 25380283 | C | A | KRAS | Likely_pathogenic | 8#Likely_pathogenic EVS=[1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1] | Potential clinical significance EVS=[1, 1, NONE, NONE, 1, 2, 1, 2, 2, NONE] |
chr12 | 1.33E+08 | 1.33E+08 | A | G | POLE | UNK | 6#Uncertain_significance EVS=[1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1] | Uncertain significance EVS=[0, 1, NONE, NONE, 1, 0, 1, 2, 2, NONE] |
chr17 | 7574003 | 7574003 | G | A | TP53 | Pathogenic | 8#Likely_pathogenic EVS=[1, 0, 1, 1, 0, 0, 0, 1, 2, 0, 1, 1] | Potential clinical significance EVS=[1, 1, NONE, NONE, 1, 2, 2, 1, 2, NONE] |
chr17 | 7578550 | 7578550 | G | T | TP53 | Uncertain_significance | 9#Likely_pathogenic EVS=[1, 0, 1, 1, 0, 0, 0, 0, 2, 2, 1, 1] | Potential clinical significance EVS=[1, 1, NONE, NONE, 1, 0, 2, 2, 2, NONE] |
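
To compare the two tools row by row, it can help to split each result cell in the table above into its parts. The following is a minimal sketch, not code from either tool: `Call` and `parse_cell` are hypothetical names, and it assumes the number before `#` in a CancerVar cell is a score, that VIC cells carry no such number, and that `NONE` marks an evidence item that was not evaluated. It is written for Python 3.

```python
import re
from typing import List, NamedTuple, Optional


class Call(NamedTuple):
    score: Optional[int]            # number before '#', if any (assumed to be a score)
    label: str                      # e.g. "Likely_pathogenic"
    evidence: List[Optional[int]]   # EVS vector; None where the cell says NONE


# Optional "CancerVar: " / "VIC: " column prefix, as pasted in the issue.
_PREFIX = re.compile(r"^(?:CancerVar|VIC):\s*")
# Optional "<score>#", then the label, then "EVS=[...]".
_CELL = re.compile(r"(?:(\d+)#)?(.+?)\s+EVS=\[(.*)\]")


def parse_cell(cell: str) -> Call:
    """Split one result cell into score, label, and evidence vector."""
    body = _PREFIX.sub("", cell.strip())
    match = _CELL.fullmatch(body)
    if match is None:
        raise ValueError("unrecognised cell: %r" % cell)
    score_text, label, evs_text = match.groups()
    evidence = [
        None if item.strip() == "NONE" else int(item)
        for item in evs_text.split(",")
    ]
    score = int(score_text) if score_text is not None else None
    return Call(score, label.strip(), evidence)


if __name__ == "__main__":
    cv = parse_cell("CancerVar: 8#Likely_pathogenic EVS=[1, 1, 0, 1, 0, 0, 0, 1, 2, 0, 1, 1]")
    vic = parse_cell("VIC: Potential clinical significance EVS=[1, 1, NONE, NONE, 1, 2, 2, 1, 2, NONE]")
    print(cv.score, cv.label, len(cv.evidence))
    print(vic.score, vic.label, len(vic.evidence))
```

Note that the two evidence vectors have different lengths (12 entries for CancerVar, 10 for VIC in these rows), so they cannot be compared position by position without knowing how each tool orders its criteria.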
Additionally, I noticed that VIC was last updated 3 months ago and CancerVar 6 months ago. Are both programs still under active development? Which one is preferred?
That's not normal, though I'm not sure about your running environment. Normally on Linux, CancerVar can process 10 variants in under 3 minutes in a single thread, so 40 minutes doesn't seem right. For example, I am using Python 2.7 and one Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz.
I'm using Python 2.7.5 and 32 Intel(R) Xeon(R) Silver 4110 CPU @ 2.10GHz.
Running info:
Notice: Your command of CancerVar is ['CancerVar.py', '-c', 'config.ini']
Warning: Your specified evidence file [ None ], the analysis will take your additional evidence.
INFO: The options are {'table_annovar': '/data/apps/annovar-20180426/table_annovar.pl', 'exclude_snps': 'cancervardb/ext.variants.hg19', 'annotate_variation': '/data/apps/annovar-20180426/annotate_variation.pl', 'current_version': 'CancerVar_20200119', 'evidence_file': 'None', 'public_dev': 'https://github.com/WGLab/CancerVar/releases', 'otherinfo': 'TRUE', 'database_names': 'refGene esp6500siv2_all 1000g2015aug avsnp150 dbnsfp35a clinvar_20200316 exac03 dbscsnv11 dbnsfp31a_interpro ensGene knownGene cosmic89_coding icgc21 gnomad211_genome', 'mim_pheno': 'cancervardb/mim_pheno.txt', 'cancer_pathway': 'cancervardb/cancers_genes.list_kegg.txt', 'cancers_types': 'cancervardb/cancervar.cancer.types', 'buildver': 'hg19', 'onetranscript': 'FALSE', 'mim2gene': 'cancervardb/mim2gene.txt', 'orpha': 'cancervardb/orpha.txt', 'inputfile_type': 'AVinput', 'knowngenecanonical': 'cancervardb/knownGeneCanonical.txt', 'cancers_genes': 'cancervardb/cancer_census.genes', 'convert2annovar': '/data/apps/annovar-20180426/convert2annovar.pl', 'database_locat': '/data/server_data/database/humandb', 'database_cancervar': 'cancervardb', 'lof_genes': 'cancervardb/LOF.genes.exac_me_cancers', 'cancervar_markers': 'cancervardb/cancervar.out.txt', 'outfile': 'testp1/myanno', 'disorder_cutoff': '0.01', 'mim_orpha': 'cancervardb/mim_orpha.txt', 'inputfile': 'testp1/testp1.avinput'}
Warning: the folder of /data/server_data/database/humandb is already created!
perl /data/apps/annovar-20180426/table_annovar.pl testp1/testp1.avinput /data/server_data/database/humandb -buildver hg19 -remove -out testp1/myanno -protocol refGene,ensGene,knownGene,esp6500siv2_all,1000g2015aug_all,exac03,avsnp150,dbnsfp35a,dbscsnv11,dbnsfp31a_interpro,clinvar_20200316,cosmic89_coding,icgc21,gnomad211_genome -operation g,g,g,f,f,f,f,f,f,f,f,f,f,f -nastring . --otherinfo
Most of the time was spent on the gnomad211_genome database during the ANNOVAR step.
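
One way to check this observation is to rerun the ANNOVAR step with one database left out and compare wall-clock times. The sketch below only builds the argument list; it reuses the paths, flags, and protocol/operation pairs from the `table_annovar.pl` command logged above, while `PROTOCOLS`, `build_command`, and the `out_prefix` value in the usage example are hypothetical names introduced for illustration. It is written for Python 3 and does not run ANNOVAR itself.

```python
from typing import Iterable, List, Sequence, Tuple

# Protocol/operation pairs copied from the logged table_annovar.pl command.
PROTOCOLS: List[Tuple[str, str]] = [
    ("refGene", "g"), ("ensGene", "g"), ("knownGene", "g"),
    ("esp6500siv2_all", "f"), ("1000g2015aug_all", "f"), ("exac03", "f"),
    ("avsnp150", "f"), ("dbnsfp35a", "f"), ("dbscsnv11", "f"),
    ("dbnsfp31a_interpro", "f"), ("clinvar_20200316", "f"),
    ("cosmic89_coding", "f"), ("icgc21", "f"), ("gnomad211_genome", "f"),
]


def build_command(
    protocols: Sequence[Tuple[str, str]],
    skip: Iterable[str] = (),
    out_prefix: str = "testp1/myanno",
) -> List[str]:
    """Return the table_annovar.pl argument list without the skipped databases.

    Protocol names and operations are filtered together so the two
    comma-separated lists stay the same length.
    """
    skipped = set(skip)
    kept = [(name, op) for name, op in protocols if name not in skipped]
    return [
        "perl", "/data/apps/annovar-20180426/table_annovar.pl",
        "testp1/testp1.avinput", "/data/server_data/database/humandb",
        "-buildver", "hg19", "-remove", "-out", out_prefix,
        "-protocol", ",".join(name for name, _ in kept),
        "-operation", ",".join(op for _, op in kept),
        "-nastring", ".", "--otherinfo",
    ]


if __name__ == "__main__":
    # Write to a separate (hypothetical) prefix so the full run is not overwritten.
    cmd = build_command(PROTOCOLS, skip={"gnomad211_genome"},
                        out_prefix="testp1/myanno_nognomad")
    print(" ".join(cmd))
```

The resulting list could be passed to `subprocess.run` and timed with `time.perf_counter()` for both the full and the reduced protocol set. This is meant for diagnosis only: CancerVar may expect the gnomAD columns in the ANNOVAR output, so the reduced run is not a substitute for the full annotation.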
There are some differences between CancerVar and VIC. VIC was my first version. CancerVar uses an optimized scoring strategy, and its datasets are larger and more up to date.