difference between CancerVar and VIC?

quanliustc commented 3 years ago

There are some differences between CancerVar and VIC. VIC was my first version. CancerVar optimizes the scoring strategy, and its datasets are larger and more up to date.

argoat commented 3 years ago

Thanks. I tried both CancerVar and VIC on 10 input variants, and the running time varied greatly: CancerVar took 42 minutes while VIC took only 6 minutes. The same reference datasets were used for the ANNOVAR step in both runs. Results are pasted below.

Chr	Start	End	Ref	Alt	Ref.Gene	ClinVar	CancerVar and Evidence	VIC and Evidence
chr2	47601106	47601106	T	C	EPCAM	clinvar: Benign/Likely_benign	CancerVar: 0#Benign/Likely_benign EVS=[0, 0, 0, 1, 0, 0, 0, -1, 1, -1, 0, 0]	VIC: Uncertain significance EVS=[0, 1, NONE, NONE, 0, 1, 1, 0, 0, NONE]
chr2	48027958	48027958	G	T	MSH6	clinvar: Pathogenic	CancerVar: 7#Uncertain_significance EVS=[0, 1, 0, 1, 0, 0, 0, 1, 2, 0, 1, 1]	VIC: Uncertain significance EVS=[0, 1, NONE, NONE, 1, 2, 2, 1, 2, NONE]
chr2	48027979	48027979	G	A	MSH6	clinvar: Conflicting_interpretations_of_pathogenicity	CancerVar: 6#Uncertain_significance EVS=[0, 1, 0, 1, 0, 0, 1, 1, 1, -1, 1, 1]	VIC: Uncertain significance EVS=[0, 1, NONE, NONE, 1, 0, 1, 0, 2, NONE]
chr2	48033753	48033753	G	T	MSH6	clinvar: Pathogenic	CancerVar: 7#Uncertain_significance EVS=[0, 1, 0, 1, 0, 0, 0, 1, 2, 0, 1, 1]	VIC: Uncertain significance EVS=[0, 1, NONE, NONE, 1, 2, 2, 1, 2, NONE]
chr3	1.79E+08	1.79E+08	A	G	PIK3CA	clinvar: Pathogenic	CancerVar: 7#Uncertain_significance EVS=[1, 0, 1, 1, 0, 0, 0, 1, 2, -1, 1, 1]	VIC: Potential clinical significance EVS=[1, 1, NONE, NONE, 1, 2, 2, 0, 2, NONE]
chr10	89720852	89720852	C	T	PTEN	clinvar: Pathogenic	CancerVar: 8#Likely_pathogenic EVS=[1, 1, 0, 1, 0, 0, 0, 1, 2, 0, 1, 1]	VIC: Potential clinical significance EVS=[1, 1, NONE, NONE, 1, 2, 2, 1, 2, NONE]
chr12	25380283	25380283	C	A	KRAS	clinvar: Likely_pathogenic	CancerVar: 8#Likely_pathogenic EVS=[1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1]	VIC: Potential clinical significance EVS=[1, 1, NONE, NONE, 1, 2, 1, 2, 2, NONE]
chr12	1.33E+08	1.33E+08	A	G	POLE	clinvar: UNK	CancerVar: 6#Uncertain_significance EVS=[1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1]	VIC: Uncertain significance EVS=[0, 1, NONE, NONE, 1, 0, 1, 2, 2, NONE]
chr17	7574003	7574003	G	A	TP53	clinvar: Pathogenic	CancerVar: 8#Likely_pathogenic EVS=[1, 0, 1, 1, 0, 0, 0, 1, 2, 0, 1, 1]	VIC: Potential clinical significance EVS=[1, 1, NONE, NONE, 1, 2, 2, 1, 2, NONE]
chr17	7578550	7578550	G	T	TP53	clinvar: Uncertain_significance	CancerVar: 9#Likely_pathogenic EVS=[1, 0, 1, 1, 0, 0, 0, 0, 2, 2, 1, 1]	VIC: Potential clinical significance EVS=[1, 1, NONE, NONE, 1, 0, 2, 2, 2, NONE]
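
For anyone who wants to compare the two tools programmatically, here is a minimal sketch (not part of CancerVar or VIC) that parses one tab-separated row in the format pasted above. It assumes Python 3 and that the columns appear in exactly this order; `parse_row` and `SAMPLE_ROW` are names made up for illustration. One thing the parsing makes easy to check: in the rows pasted here, the number before the `#` in the CancerVar column equals the sum of its EVS list.

```python
import ast
import re

# Cell format seen above: "CancerVar: 7#Uncertain_significance EVS=[0, 1, ...]"
CANCERVAR_RE = re.compile(r"CancerVar:\s*(-?\d+)#(\S+)\s+EVS=(\[.*?\])")
# Cell format seen above: "VIC: Uncertain significance EVS=[0, 1, NONE, ...]"
VIC_RE = re.compile(r"VIC:\s*(.+?)\s+EVS=\[(.*?)\]")

# One row copied from the pasted results (tabs written as \t).
SAMPLE_ROW = (
    "chr2\t48027958\t48027958\tG\tT\tMSH6\tclinvar: Pathogenic\t"
    "CancerVar: 7#Uncertain_significance "
    "EVS=[0, 1, 0, 1, 0, 0, 0, 1, 2, 0, 1, 1]\t"
    "VIC: Uncertain significance "
    "EVS=[0, 1, NONE, NONE, 1, 2, 2, 1, 2, NONE]"
)


def parse_row(line):
    """Split one result row into a dict of plain Python values."""
    fields = line.rstrip("\n").split("\t")
    chrom, start, end, ref, alt, gene, clinvar, cancervar, vic = fields

    cv = CANCERVAR_RE.search(cancervar)
    vc = VIC_RE.search(vic)
    if cv is None or vc is None:
        raise ValueError("row does not match the expected format")

    # VIC marks unevaluated criteria as NONE; map those to Python None.
    vic_evs = [
        None if item.strip() == "NONE" else int(item)
        for item in vc.group(2).split(",")
    ]

    return {
        "chrom": chrom,
        "start": start,  # kept as text: some pasted rows show 1.79E+08
        "gene": gene,
        "clinvar": clinvar.split(":", 1)[1].strip(),
        "cancervar_score": int(cv.group(1)),
        "cancervar_label": cv.group(2),
        "cancervar_evs": ast.literal_eval(cv.group(3)),
        "vic_label": vc.group(1),
        "vic_evs": vic_evs,
    }


row = parse_row(SAMPLE_ROW)
print(row["gene"], row["clinvar"], row["cancervar_label"], row["vic_label"])
```

Positions are left as strings on purpose, because two of the pasted rows have coordinates in scientific notation and cannot be converted back to exact integers.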

Additionally, I noticed that VIC was last updated 3 months ago and CancerVar 6 months ago. Are both programs still under active development? Which one is preferred?

quanliustc commented 3 years ago

That's not normal. I'm not sure about your running environment, but on Linux CancerVar normally processes 10 variants in under 3 minutes on a single thread, so 40 minutes doesn't seem right. For example, I am using Python 2.7 and one Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz.

argoat commented 3 years ago

I'm using Python 2.7.5 and 32 Intel(R) Xeon(R) Silver 4110 CPU @ 2.10GHz.

Running info:
Notice: Your command of CancerVar is ['CancerVar.py', '-c', 'config.ini']
Warning: Your specified evidence file [ None ], the analysis will take your additional evidence.
INFO: The options are {'table_annovar': '/data/apps/annovar-20180426/table_annovar.pl', 'exclude_snps': 'cancervardb/ext.variants.hg19', 'annotate_variation': '/data/apps/annovar-20180426/annotate_variation.pl', 'current_version': 'CancerVar_20200119', 'evidence_file': 'None', 'public_dev': 'https://github.com/WGLab/CancerVar/releases', 'otherinfo': 'TRUE', 'database_names': 'refGene esp6500siv2_all 1000g2015aug avsnp150 dbnsfp35a clinvar_20200316 exac03 dbscsnv11 dbnsfp31a_interpro ensGene knownGene cosmic89_coding icgc21 gnomad211_genome', 'mim_pheno': 'cancervardb/mim_pheno.txt', 'cancer_pathway': 'cancervardb/cancers_genes.list_kegg.txt', 'cancers_types': 'cancervardb/cancervar.cancer.types', 'buildver': 'hg19', 'onetranscript': 'FALSE', 'mim2gene': 'cancervardb/mim2gene.txt', 'orpha': 'cancervardb/orpha.txt', 'inputfile_type': 'AVinput', 'knowngenecanonical': 'cancervardb/knownGeneCanonical.txt', 'cancers_genes': 'cancervardb/cancer_census.genes', 'convert2annovar': '/data/apps/annovar-20180426/convert2annovar.pl', 'database_locat': '/data/server_data/database/humandb', 'database_cancervar': 'cancervardb', 'lof_genes': 'cancervardb/LOF.genes.exac_me_cancers', 'cancervar_markers': 'cancervardb/cancervar.out.txt', 'outfile': 'testp1/myanno', 'disorder_cutoff': '0.01', 'mim_orpha': 'cancervardb/mim_orpha.txt', 'inputfile': 'testp1/testp1.avinput'}
Warning: the folder of /data/server_data/database/humandb is already created!
perl /data/apps/annovar-20180426/table_annovar.pl testp1/testp1.avinput /data/server_data/database/humandb -buildver hg19 -remove -out testp1/myanno -protocol refGene,ensGene,knownGene,esp6500siv2_all,1000g2015aug_all,exac03,avsnp150,dbnsfp35a,dbscsnv11,dbnsfp31a_interpro,clinvar_20200316,cosmic89_coding,icgc21,gnomad211_genome -operation g,g,g,f,f,f,f,f,f,f,f,f,f,f -nastring . --otherinfo

argoat commented 3 years ago

Most of the time is spent on the gnomad211_genome database during the ANNOVAR step.
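
One way to confirm which database is the bottleneck is to run `table_annovar.pl` once per protocol and time each run. The following is a rough sketch under several assumptions: it is written for Python 3 (CancerVar itself was run with Python 2.7 here), it reuses the paths, protocols and operations from the command logged earlier in this thread, and the `testp1/timing_<protocol>` output prefix and all function names are hypothetical. It has not been run against a real ANNOVAR installation, and running each protocol separately is assumed to behave the same as the combined run.

```python
import subprocess
import time

# Paths and flags copied from the table_annovar.pl command in the log.
TABLE_ANNOVAR = "/data/apps/annovar-20180426/table_annovar.pl"
INPUT_FILE = "testp1/testp1.avinput"
HUMANDB = "/data/server_data/database/humandb"

PROTOCOLS = (
    "refGene,ensGene,knownGene,esp6500siv2_all,1000g2015aug_all,exac03,"
    "avsnp150,dbnsfp35a,dbscsnv11,dbnsfp31a_interpro,clinvar_20200316,"
    "cosmic89_coding,icgc21,gnomad211_genome"
).split(",")
OPERATIONS = "g,g,g,f,f,f,f,f,f,f,f,f,f,f".split(",")


def build_command(protocol, operation):
    """Build a single-database table_annovar.pl command as an argv list."""
    # Hypothetical output prefix so runs do not overwrite testp1/myanno.
    out_prefix = "testp1/timing_" + protocol
    return [
        "perl", TABLE_ANNOVAR, INPUT_FILE, HUMANDB,
        "-buildver", "hg19", "-remove", "-out", out_prefix,
        "-protocol", protocol, "-operation", operation,
        "-nastring", ".", "--otherinfo",
    ]


def time_each_database(runner, clock=time.perf_counter):
    """Run one command per protocol and return {protocol: seconds}."""
    timings = {}
    for protocol, operation in zip(PROTOCOLS, OPERATIONS):
        command = build_command(protocol, operation)
        started = clock()
        runner(command)
        timings[protocol] = clock() - started
    return timings


def run_with_subprocess(command):
    """Runner that actually executes ANNOVAR; raises if the command fails."""
    subprocess.run(command, check=True)


def report(timings):
    """Print databases from slowest to fastest."""
    for protocol, seconds in sorted(timings.items(), key=lambda kv: -kv[1]):
        print("%-22s %8.1f s" % (protocol, seconds))
```

To use it, call `report(time_each_database(run_with_subprocess))` on the machine where ANNOVAR is installed. The runner is passed in as a parameter so the command construction can be checked without ANNOVAR present.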

WGLab / CancerVar

difference between CancerVar and VIC? #6