dnaase / Bis-tools

A collection of tools to deal with Bisulfite-seq/NOMe-seq (SNP/Methylation calling: BisSNP; HMM segmentation: DMNTools; Visualization/Clustering: alignWigToBed; )
30 stars 23 forks source link

Bis-SNP fails to detect 'CpG' on opposite strand for C with CR context #10

Open iromeo opened 4 years ago

iromeo commented 4 years ago

Bis-SNP fails to detect 'CpG' on the opposite strand for C with CR context.

Could be seen in the test data provided in your tutorial.

E.g. 7025922 is cytosine in CR context. But 7025923 position isn't marked as SNP, although we could notice from BAM file, that it is a CpG on the opposite strand, i.e. GC in read, moreover it is 100% methylated AFAIU.

$ grep "7025922" -A1 -B1 cpg.raw.vcf
chr11   7025714 .   G   .   70.87   PASS    CS=-;Context=C,CG;DP=13;MQ0=0;NS=1;REF=0    GT:BQ:BRC6:CM:CP:CU:DP:DP4:GP:GQ:SS 0/0:35.5,NaN:12,0,0,1,0,0:12:CG:0:.:1,12,0,0:0,71,491:71:5
chr11   7025922 .   C   .   116.01  PASS    CS=+;Context=CR;DP=30;MQ0=0;NS=1;REF=C,CH   GT:BQ:BRC6:CM:CP:CU:DP:DP4:GP:GQ:SS 0/0:33.8,30.0:0,2,0,28,0,0:0:CR:2:.:0,28,2,0:0,116,1113:99:5
chr11   7026469 .   C   .   70.87   PASS    CS=+;Context=C,CG;DP=17;MQ0=0;NS=1;REF=0    GT:BQ:BRC6:CM:CP:CU:DP:DP4:GP:GQ:SS 0/0:33.6,35.0:2,1,0,12,0,0:2:CG:1:.:4,12,1,0:0,71,564:71:5

Even missing in 'all' SNPs, i.e with emmited all sites, w/o threshold. Also reads with CG seems pass MAPQ and base quality thresholds, so looks like bug:

$ grep "7025922" -A1 -B1 all.raw.vcf
chr11   7025921 .   G   .   43.74   PASS    CS=-;Context=C,CH;DP=32;MQ0=0;NS=1;REF=0    GT:BQ:BRC6:CM:CP:CU:DP:DP4:GP:GQ:SS 0/0:29.5,33.4:0,28,0,4,0,0:0:CH:28:.:4,0,0,28:0,44,172:44:5
chr11   7025922 .   C   .   116.01  PASS    CS=+;Context=CR;DP=30;MQ0=0;NS=1;REF=C,CH   GT:BQ:BRC6:CM:CP:CU:DP:DP4:GP:GQ:SS 0/0:33.8,30.0:0,2,0,28,0,0:0:CR:2:.:0,28,2,0:0,116,1113:99:5
chr11   7025924 .   A   .   37.63   PASS    DP=29;MQ0=0;NS=1    GT:BQ:BRC6:CM:CP:CU:DP:DP4:GP:GQ:SS 0/0:32.1,NaN:0,0,2,0,0,27:.:.:.:.:2,27,0,0:0,38,108:38:5

image

SNP called using:

java -Xmx4g -jar BisSNP-1.0.1.jar -T BisulfiteGenotyper -R ref/hg18_unmasked.plusContam.fa -I normalMerge_chr11-7M-9M.nodups.withhead.bam -D dbsnp_135.hg18.sort.vcf -vfn1 cpg.raw.vcf -vfn2 snp.raw.vcf -L chr11:7000000-7100000 &&  perl Utils/vcf2bed.pl cpg.raw.vcf CG &&  perl Utils/vcf2bed.pl cpg.raw.vcf CH

java -Xmx4g -jar BisSNP-1.0.1.jar -T BisulfiteGenotyper -R ref/hg18_unmasked.plusContam.fa -I normalMerge_chr11-7M-9M.nodups.withhead.bam -D dbsnp_135.hg18.sort.vcf -vfn1 all.raw.vcf -out_modes EMIT_ALL_SITES -L chr11:7000000-7100000 &&  perl Utils/vcf2bed.pl all.raw.vcf

BED/VCF files attached, see tracks.zip

weishwu commented 3 years ago

Sorry I don't know the answer to your question. I'd like to look into it if I can make bis-snp work. Currently I can't get through the recalibration step due to this error:

##### ERROR MESSAGE: Invalid command line: Malformed walker argument: Could not find walker with name: BisulfiteCountCovariates

And other users have posted the same error here, but the author seems to have abandoned this site and there is no response to any issue for several years. Could you tell me how you ran the recalibration?

Thanks,