Closed liuyang2006 closed 2 years ago
So your BAM file is a single-end file, and you sorted it by position prior to methylation extraction? This should work for single-end files, but is not recommended for paired-end data. Having said that, is there a chance that the BAM file has been perturbed somehow? Which version of Bismark are you using?
If you wanted to find out whether the sorting process somehow messed up the file, you could try to use the (deduplicated) BAM file produced by bismark
(or deduplicate_bismark
) itself.
Just FYI, if you just want to get a coverage file (and bedGraph file), --genome_folder /fastscratch/chenx/share_yang
is not needed.
Dear Felix, My BAM file is single-end file. The BAM file is not pertubed anymore. Version of bismark is:
bismark --version
Bismark - Bisulfite Mapper and Methylation Caller.
Bismark Version: v0.23.1dev
Copyright 2010-21 Felix Krueger, Babraham Bioinformatics
www.bioinformatics.babraham.ac.uk/projects/
https://github.com/FelixKrueger/Bismark
Even I remove options --genome_folder /fastscratch/chenx/share_yang
, use command bismark_methylation_extractor -s -o bismark_methylation_extractor_mm9 --gzip --bedGraph 542VTF_AzAc_12.sorted.bam
, I still got same error message:
Now reading in Bismark result file 542VTF_AzAc_12.sorted.bam
skipping SAM header line: @HD VN:1.4 SO:coordinate
skipping SAM header line: @SQ SN:chr1 LN:197195432
skipping SAM header line: @SQ SN:chr2 LN:181748087
skipping SAM header line: @SQ SN:chrX LN:166650296
skipping SAM header line: @SQ SN:chr3 LN:159599783
skipping SAM header line: @SQ SN:chr4 LN:155630120
skipping SAM header line: @SQ SN:chr5 LN:152537259
skipping SAM header line: @SQ SN:chr7 LN:152524553
skipping SAM header line: @SQ SN:chr6 LN:149517037
skipping SAM header line: @SQ SN:chr8 LN:131738871
skipping SAM header line: @SQ SN:chr10 LN:129993255
skipping SAM header line: @SQ SN:chr14 LN:125194864
skipping SAM header line: @SQ SN:chr9 LN:124076172
skipping SAM header line: @SQ SN:chr11 LN:121843856
skipping SAM header line: @SQ SN:chr12 LN:121257530
skipping SAM header line: @SQ SN:chr13 LN:120284312
skipping SAM header line: @SQ SN:chr15 LN:103494974
skipping SAM header line: @SQ SN:chr16 LN:98319150
skipping SAM header line: @SQ SN:chr17 LN:95272651
skipping SAM header line: @SQ SN:chr18 LN:90772031
skipping SAM header line: @SQ SN:chr19 LN:61342430
skipping SAM header line: @SQ SN:chrY LN:15902555
skipping SAM header line: @SQ SN:chrM LN:16299
skipping SAM header line: @PG ID:samtools PN:samtools VN:1.14 CL:/projects/li-lab/yang/anaconda3/envs/py39/bin/samtools view -h 542VTF_AzAc_12.sorted.bam
Use of uninitialized value $read_conversion in string eq at /projects/li-lab/yang/tools/Bismark/bismark_methylation_extractor line 1595, <IN> line 25.
Use of uninitialized value $read_conversion in string eq at /projects/li-lab/yang/tools/Bismark/bismark_methylation_extractor line 1598, <IN> line 25.
Use of uninitialized value $read_conversion in string eq at /projects/li-lab/yang/tools/Bismark/bismark_methylation_extractor line 1601, <IN> line 25.
Use of uninitialized value $read_conversion in string eq at /projects/li-lab/yang/tools/Bismark/bismark_methylation_extractor line 1604, <IN> line 25.
Use of uninitialized value $read_conversion in concatenation (.) or string at /projects/li-lab/yang/tools/Bismark/bismark_methylation_extractor line 1608, <IN> line 25.
Use of uninitialized value $genome_conversion in concatenation (.) or string at /projects/li-lab/yang/tools/Bismark/bismark_methylation_extractor line 1608, <IN> line 25.
Unexpected combination of read and genome conversion: '' / ''
Hmm, the file header doesn't seem to include the Bismark @PG header lines, so I wouldn't rule out that something else has happened as well.
Could you show the first lines of that file, e.g. with: samtools view -h 542VTF_AzAc_12.sorted.bam | head -30
.
Alternatively, would you be able to send me a snippet of that file, e.g. via email? You could use the following command to generate a small subset:
samtools view -h 542VTF_AzAc_12.sorted.bam | head -1000 | samtools view -bS - > small_test.bam
Please check this file: https://storage.googleapis.com/jax-nanopore-01-project-data/related_project/small_test.bam
Thanks for the file. Here are the first lines:
@PG ID:samtools PN:samtools VN:1.14 CL:samtools view -h 542VTF_AzAc_12.sorted.bam
@PG ID:samtools.1 PN:samtools PP:samtools VN:1.14 CL:samtools view -bS -
@PG ID:samtools.2 PN:samtools PP:samtools.1 VN:1.11 CL:samtools view -h small_test.bam
HWI-ST986:391:C723TACXX:1:1103:15119:49651 1:N:0:TGACCA 16 chr1 3010874 255 51M * 0 0 TCAAAACCAAAATAACTCCTACGAAACCTAAAACAAAAACCTCTTAAACCG CCCFFFFFHHHHFIJJJJHIJJJJJIJJJJIJJJJJHIIJJJJJJHIJJJJ NM:i:17 MD:Z:T1AA6AA1AA5A2A1A3A1AA2A9AAA3 XM:Z:Z..hhh.........x..hh.x...h.xZ.x.....hh.hh......xz..
As I had guessed, this BAM file is missing the read conversion (XR:Z:
) and genome conversion (XG:Z:
) optional fields, which are required to figure out where the read came from. So yes, whatever process interfered with the Bismark BAM file rendered it useless for the methylation extraction.
An easy solution would of course be to just use the (deduplicated) Bismark BAM file for the methylation extraction/coverage file generation, and use the sorted BAM file for whatever else you want to use it for.
Redo analysis on raw fastq files solved the problem!
Excellent!
Dear Felix, When I extract methylation call from BAM files, it always failed with perl error:
Command:
Error:
Would you please help me to figure out the problem? Thank you!