FelixKrueger / Bismark

A tool to map bisulfite converted sequence reads and determine cytosine methylation states
http://felixkrueger.github.io/Bismark/
GNU General Public License v3.0

Subprocesses succeeded but the whole process failed #612

AlisaGU closed this issue 1 year ago

AlisaGU commented 1 year ago

Hi Felix, after running for 2 days and 17 hours, my job failed. (T_T) Unfortunately, I carelessly deleted the log file.

I set --parallel to 2, and both of these processes appear to have succeeded (their report files were written). Can I directly merge the two BAM files into one and treat it as the final BAM?

Here are the files generated by Bismark: [image]

$ cat YPX55266-S-1/YPX55266-S-1_R1.fq.gz.temp.1_bismark_bt2_PE_report.txt
Bismark report for: YPX55266-S-1_R1.fq.gz.temp.1 and YPX55266-S-1_R2.fq.gz.temp.1 (version: v0.24.1)
Bismark was run with Bowtie 2 against the bisulfite genome of /public/share/acvrjv2n2r/data/dani/ with the specified options: -q -N 1 --score-min L,0,-0.2 -p 30 --reorder --ignore-quals --no-mixed --no-discordant --dovetail --maxins 500
Option '--directional' specified (default mode): alignments to complementary strands (CTOT, CTOB) were ignored (i.e. not performed)

Final Alignment report
======================
Sequence pairs analysed in total:   193542524
Number of paired-end alignments with a unique best hit: 134399365
Mapping efficiency: 69.4% 
Sequence pairs with no alignments under any condition:  34350450
Sequence pairs did not map uniquely:    24792709
Sequence pairs which were discarded because genomic sequence could not be extracted:    168

Number of sequence pairs with unique best (first) alignment came from the bowtie output:
CT/GA/CT:   67330780    ((converted) top strand)
GA/CT/CT:   0   (complementary to (converted) top strand)
GA/CT/GA:   0   (complementary to (converted) bottom strand)
CT/GA/GA:   67068417    ((converted) bottom strand)

Number of alignments to (merely theoretical) complementary strands being rejected in total: 0

Final Cytosine Methylation Report
=================================
Total number of C's analysed:   8534317167

Total methylated C's in CpG context:    600247485
Total methylated C's in CHG context:    10505543
Total methylated C's in CHH context:    30368232
Total methylated C's in Unknown context:    33758

Total unmethylated C's in CpG context:  89019071
Total unmethylated C's in CHG context:  1909589044
Total unmethylated C's in CHH context:  5894587792
Total unmethylated C's in Unknown context:  1778132

C methylated in CpG context:    87.1%
C methylated in CHG context:    0.5%
C methylated in CHH context:    0.5%
C methylated in Unknown context (CN or CHN):    1.9%
$ cat YPX55266-S-1/YPX55266-S-1_R1.fq.gz.temp.2_bismark_bt2_PE_report.txt
Bismark report for: YPX55266-S-1_R1.fq.gz.temp.2 and YPX55266-S-1_R2.fq.gz.temp.2 (version: v0.24.1)
Bismark was run with Bowtie 2 against the bisulfite genome of /public/share/acvrjv2n2r/data/dani/ with the specified options: -q -N 1 --score-min L,0,-0.2 -p 30 --reorder --ignore-quals --no-mixed --no-discordant --dovetail --maxins 500
Option '--directional' specified (default mode): alignments to complementary strands (CTOT, CTOB) were ignored (i.e. not performed)

Final Alignment report
======================
Sequence pairs analysed in total:   193542523
Number of paired-end alignments with a unique best hit: 134398921
Mapping efficiency: 69.4% 
Sequence pairs with no alignments under any condition:  34348308
Sequence pairs did not map uniquely:    24795294
Sequence pairs which were discarded because genomic sequence could not be extracted:    181

Number of sequence pairs with unique best (first) alignment came from the bowtie output:
CT/GA/CT:   67332713    ((converted) top strand)
GA/CT/CT:   0   (complementary to (converted) top strand)
GA/CT/GA:   0   (complementary to (converted) bottom strand)
CT/GA/GA:   67066027    ((converted) bottom strand)

Number of alignments to (merely theoretical) complementary strands being rejected in total: 0

Final Cytosine Methylation Report
=================================
Total number of C's analysed:   8534222425

Total methylated C's in CpG context:    600145589
Total methylated C's in CHG context:    10509176
Total methylated C's in CHH context:    30350412
Total methylated C's in Unknown context:    34047

Total unmethylated C's in CpG context:  89023809
Total unmethylated C's in CHG context:  1909483586
Total unmethylated C's in CHH context:  5894709853
Total unmethylated C's in Unknown context:  1780322

C methylated in CpG context:    87.1%
C methylated in CHG context:    0.5%
C methylated in CHH context:    0.5%
C methylated in Unknown context (CN or CHN):    1.9%
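
For reference, the two per-chunk reports above can be combined arithmetically to estimate what a single merged report would say. The sketch below is only an illustration: the helper names (`parse_report`, `combine_reports`, `percent`) are hypothetical and not part of Bismark, and it assumes that mapping efficiency is unique hits divided by pairs analysed and that the CpG percentage is methylated / (methylated + unmethylated), which reproduces the 69.4% and 87.1% printed in each report.

```python
import re

# A few count lines copied from the two per-chunk reports above.
CHUNK_1 = """\
Sequence pairs analysed in total:   193542524
Number of paired-end alignments with a unique best hit: 134399365
Total methylated C's in CpG context:    600247485
Total unmethylated C's in CpG context:  89019071
"""
CHUNK_2 = """\
Sequence pairs analysed in total:   193542523
Number of paired-end alignments with a unique best hit: 134398921
Total methylated C's in CpG context:    600145589
Total unmethylated C's in CpG context:  89023809
"""


def parse_report(text):
    """Collect every 'label: <integer>' line of a report into a dict."""
    counts = {}
    for line in text.splitlines():
        match = re.match(r"^(.+?):\s+(\d+)\s*$", line)
        if match:
            counts[match.group(1).strip()] = int(match.group(2))
    return counts


def combine_reports(reports):
    """Sum the counts of several parsed reports, label by label."""
    total = {}
    for report in reports:
        for label, value in report.items():
            total[label] = total.get(label, 0) + value
    return total


def percent(part, whole):
    """Percentage rounded to one decimal place, as in the report."""
    return round(100 * part / whole, 1)


if __name__ == "__main__":
    both = combine_reports([parse_report(CHUNK_1), parse_report(CHUNK_2)])
    pairs = both["Sequence pairs analysed in total"]
    unique = both["Number of paired-end alignments with a unique best hit"]
    meth = both["Total methylated C's in CpG context"]
    unmeth = both["Total unmethylated C's in CpG context"]
    print("Pairs analysed:", pairs)
    print("Mapping efficiency (%):", percent(unique, pairs))
    print("CpG methylation (%):", percent(meth, meth + unmeth))
```

With the numbers above this gives 387085047 pairs in total, a mapping efficiency of 69.4% and 87.1% CpG methylation, so the two chunks are consistent with each other.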
FelixKrueger commented 1 year ago

Hmm, this is frustrating, it would have been really interesting to learn why the run didn't complete the merging process...

But generally I would agree that the runs should be complete, because writing out the report file is pretty much the last step in a run. Fingers crossed that the BAM files are not truncated... If it were me, I would start the merging (using samtools cat) and then continue with deduplication and extraction.
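
The check-then-merge route described above could be sketched roughly as follows. This is an illustration under stated assumptions, not Bismark's own merging code: the function names are hypothetical, the BAM file names are placeholders (the real temp file names are not shown in this thread), and the `merged.deduplicated.bam` output name is assumed from Bismark's usual naming. The truncation test only looks for the 28-byte BGZF end-of-file block defined in the SAM/BAM specification, so a file can pass it and still be damaged elsewhere; `samtools quickcheck` performs a comparable check.

```python
import shlex
from pathlib import Path

# Empty BGZF block that terminates every complete BAM file (28 bytes).
BGZF_EOF = bytes.fromhex(
    "1f8b0804 00000000 00ff 0600 42430200 1b00 0300 00000000 00000000"
)


def has_bgzf_eof(path):
    """Return True if the file ends with the BGZF end-of-file block."""
    p = Path(path)
    if p.stat().st_size < len(BGZF_EOF):
        return False
    with p.open("rb") as handle:
        handle.seek(-len(BGZF_EOF), 2)  # 2 = seek relative to end of file
        return handle.read() == BGZF_EOF


def merge_plan(parts, merged="merged.bam"):
    """Build (not run) the commands: concatenate, deduplicate, extract."""
    stem = merged[:-4] if merged.endswith(".bam") else merged
    return [
        ["samtools", "cat", "-o", merged, *parts],
        ["deduplicate_bismark", "--bam", merged],
        # Assumed output name of the deduplication step.
        ["bismark_methylation_extractor", "--gzip", f"{stem}.deduplicated.bam"],
    ]


if __name__ == "__main__":
    # Placeholder names; substitute the real per-chunk BAM files.
    parts = ["chunk1.bam", "chunk2.bam"]
    for command in merge_plan(parts):
        print(shlex.join(command))
```

In practice one would call `has_bgzf_eof` on each per-chunk BAM first, and only then run each command in order, for example with `subprocess.run(command, check=True)`, so that a failure in one step stops the rest.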

If you have the nerve, I would re-run the mapping just to see if it completes this time, or to find out what went wrong :P

AlisaGU commented 1 year ago

I decided to rerun this process... (T_T)

"it would have been really interesting to learn why the run didn't complete the merging process..."

I will come back with an update once the rerun has finished.

AlisaGU commented 1 year ago

Hi, I'm back! No errors were reported during the rerun!

FelixKrueger commented 1 year ago

Fantastic, that's the best solution! Even though now we will never find out what went wrong...

AlisaGU commented 1 year ago

Hmm, I deleted my script after submitting it to the Slurm system. That's probably the reason.