hidvegin closed this issue 2 years ago
Yes, I would agree:
(ERR): bowtie2-align died with signal 9 (KILL)
Can you try dropping --parallel 4 and try again?
Thank you for your answer. I dropped --multicore 4 and tried again. The job now seems to be working, although it is a little slow without multicore.
Great to hear that it is working in general. Now you could monitor the core/RAM usage and adjust --parallel according to the hardware resources you have available, if desired.
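As a rough illustration of matching --parallel (the thread also calls it --multicore) to the available hardware, the sketch below picks the largest value that fits both memory and cores. The helper name suggest_parallel and the default of 4 cores per instance are assumptions made for this example, not Bismark settings; the per-instance memory figure is something you would measure yourself from a run without --parallel.

```shell
# Hypothetical helper, not part of Bismark: suggest a --parallel value.
# $1: RAM available to the job (GB)
# $2: peak RAM of one run without --parallel (GB, measured by you)
# $3: cores available to the job
# $4: cores used by one instance (assumption, defaults to 4)
suggest_parallel() (
    avail_gb=$1
    per_instance_gb=$2
    cores=$3
    cores_per_instance=${4:-4}

    by_mem=$(( $avail_gb / $per_instance_gb ))   # instances that fit in RAM
    by_cpu=$(( $cores / $cores_per_instance ))   # instances that fit on the cores

    # Take the smaller of the two limits, but never go below 1.
    n=$by_mem
    if [ "$by_cpu" -lt "$n" ]; then n=$by_cpu; fi
    if [ "$n" -lt 1 ]; then n=1; fi
    echo "$n"
)

# Example: 64 GB RAM, 14 GB per instance, 32 cores
suggest_parallel 64 14 32
```

If the suggested value is 1, running without --parallel, as was done above, is the safe choice.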
Dear @FelixKrueger!
I have two or three different BAM files from the same library. Should I use --multiple with the deduplicate_bismark script, or should I merge the BAM files with samtools before deduplication?
Another question: I would like to analyse the CpG, CHG and CHH contexts separately in SeqMonk, so I would like to generate separate CpG, CHG and CHH coverage files with bismark_methylation_extractor. I used these parameters:
bismark_methylation_extractor --gzip --bedGraph --comprehensive --buffer_size 10G K1_aligned.deduplicated.bam
I got three txt.gz files (CpG, CHG and CHH), but only one coverage file, for CpGs. What should I change in the parameters to get three separate coverage files for the CpG, CHG and CHH contexts?
For the merging, either should work. In fact, deduplicate_bismark does use samtools cat internally when you use the --multiple option.
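The two merging routes could look roughly like this. The file names are hypothetical, and the run wrapper only prints each command unless DRY_RUN=0 is set, so the sketch can be tried without Bismark or samtools installed.

```shell
# Dry-run wrapper: prints the command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical file names for two BAM files from the same library.
lane1=sample_L1_bismark_bt2_pe.bam
lane2=sample_L2_bismark_bt2_pe.bam

# Option A: let deduplicate_bismark treat both files as one sample.
run deduplicate_bismark --multiple --bam "$lane1" "$lane2"

# Option B: concatenate first, then deduplicate the merged file.
run samtools cat -o sample_merged_pe.bam "$lane1" "$lane2"
run deduplicate_bismark --bam sample_merged_pe.bam
```

Both routes concatenate the files without sorting them by position.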
Regarding the second question: you can simply run bismark2bedGraph three times, only feeding it the context of interest. Since you have already generated the CpG coverage file, just move to the folder containing the CHG and CHH files from the methylation extractor and run:
bismark2bedGraph --CX -o CHH_context.cov.gz CHH*
bismark2bedGraph --CX -o CHG_context.cov.gz CHG*
As a word of warning, there might be a LOT of cytosines in CHH context, so you might struggle somewhat with the downstream analysis. But it's certainly worth a shot!
Thank you for your answer and help.
I have got a new error in the Bismark deduplication step. I used this command for deduplication:
deduplicate_bismark --bam home/fk8jybr/output/bismark/align/BGI/11/11.11_R1_paired_bismark_bt2_pe.bam -o 11 -output_dir /home/fk8jybr/output/bismark/deduplication/BGI/11
I got this error:
Output will be written into the directory: /big/home/fk8jybr/output/bismark/deduplication/BGI/11/
Output filename was given as: 11
Neither -s (single-end) nor -p (paired-end) selected for deduplication. Trying to extract this information for each file separately from the @PG line of the SAM/BAM file
Processing single-end Bismark output file(s) (SAM format):
home/fk8jybr/output/bismark/align/BGI/11/11.11_R1_paired_bismark_bt2_pe.bam
If there are several alignments to a single position in the genome the first alignment will be chosen. Since the input files are not in any way sorted this is a near-enough random selection of reads.
Checking file >>home/fk8jybr/output/bismark/align/BGI/11/11.11_R1_paired_bismark_bt2_pe.bam<< for signs of file truncation...
Captured error message: '[E::hts_open_format] Failed to open file "home/fk8jybr/output/bismark/align/BGI/11/11.11_R1_paired_bismark_bt2_pe.bam" : No such file or directory'
[ERROR] The file appears to be truncated, please ensure that there were no errors while copying the file!!! Exiting...
I think the file is not corrupted. Bismark finished the alignment step correctly. I do not know why the file would be truncated. What should I do about this error?
[E::hts_open_format] Failed to open file "home/fk8jybr/output/bismark/align/BGI/11/11.11_R1_paired_bismark_bt2_pe.bam" : No such file or directory'
This would indicate that you spelled the name of the file, or the path to it, incorrectly. Is that a possibility, e.g. did you forget the / before home/...?
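A quick pre-flight check along these lines can catch a mistyped or relative path before the deduplication starts. The function name check_bam_path is made up for this sketch; it uses only standard shell tests and does not inspect the BAM content.

```shell
# Hypothetical pre-flight check: warn about relative paths, fail if unreadable.
check_bam_path() {
    case "$1" in
        /*) ;;  # absolute path, nothing to report
        *)  echo "note: '$1' is a relative path, resolved against $(pwd)" >&2 ;;
    esac
    if [ ! -r "$1" ]; then
        echo "error: cannot read '$1'" >&2
        return 1
    fi
}

# Usage (path as typed in the failing command, missing the leading /):
#   check_bam_path home/fk8jybr/output/bismark/align/BGI/11/11.11_R1_paired_bismark_bt2_pe.bam \
#       && deduplicate_bismark --bam ...
```

Run from any directory other than /, the relative path above would trigger both the note and the error, which points at the missing leading slash rather than at a truncated file.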
Thank you @FelixKrueger for your answer and help. Yes, this was the problem. I forgot the / before the home path.
I got another error in the bedGraph step. I used this command:
/home/fk8jybr/Bismark-0.23.1/bismark2bedGraph -o CpG_context.cov.gz --dir /home/fk8jybr/output/bismark/bedGraph/BGI/11 --buffer_size 70% /home/fk8jybr/output/bismark/methylation_extratction/BGI/11/CpG_context_11.deduplicated.txt.gz
The error was this:
Using these input files: /home/fk8jybr/output/bismark/methylation_extratction/BGI/11/CpG_context_11.deduplicated.txt.gz
Summary of parameters for bismark2bedGraph conversion:
======================================================
bedGraph output: CpG_context.cov.gz
output directory: >/big/home/fk8jybr/output/bismark/bedGraph/BGI/11/<
remove whitespaces: no
CX context: no (CpG context only, default)
No-header selected: no
Sorting method: Unix sort-based (smaller memory footprint, but slower)
Sort buffer size: 70%
Coverage threshold: 1
=============================================================================
Methylation information will now be written into a bedGraph and coverage file
=============================================================================
Using the following files as Input:
/big/home/fk8jybr/output/bismark/methylation_extratction/BGI/11/CpG_context_11.deduplicated.txt.gz
Writing bedGraph to file: CpG_context.cov.gz
Also writing out a coverage file including counts methylated and unmethylated residues to file: CpG_context.cov.gz.bismark.cov.gz
Changed directory to /big/home/fk8jybr/output/bismark/bedGraph/BGI/11/
Now writing methylation information for file >>CpG_context_11.deduplicated.txt.gz<< to individual files for each chromosome
Finished writing out individual chromosome files for CpG_context_11.deduplicated.txt.gz
Collecting temporary chromosome file information... Processing the following input file(s):
CpG_context_11.deduplicated.txt.gz.chr7B.methXtractor.temp
CpG_context_11.deduplicated.txt.gz.chr1B.methXtractor.temp
CpG_context_11.deduplicated.txt.gz.chr5A.methXtractor.temp
CpG_context_11.deduplicated.txt.gz.chr6D.methXtractor.temp
CpG_context_11.deduplicated.txt.gz.chr3B.methXtractor.temp
CpG_context_11.deduplicated.txt.gz.chr6B.methXtractor.temp
CpG_context_11.deduplicated.txt.gz.chr2B.methXtractor.temp
CpG_context_11.deduplicated.txt.gz.chr2A.methXtractor.temp
CpG_context_11.deduplicated.txt.gz.chr4A.methXtractor.temp
CpG_context_11.deduplicated.txt.gz.chr4D.methXtractor.temp
CpG_context_11.deduplicated.txt.gz.chr7D.methXtractor.temp
CpG_context_11.deduplicated.txt.gz.chr1A.methXtractor.temp
CpG_context_11.deduplicated.txt.gz.chr4B.methXtractor.temp
CpG_context_11.deduplicated.txt.gz.chr3A.methXtractor.temp
CpG_context_11.deduplicated.txt.gz.chr5B.methXtractor.temp
CpG_context_11.deduplicated.txt.gz.chr6A.methXtractor.temp
CpG_context_11.deduplicated.txt.gz.chr3D.methXtractor.temp
CpG_context_11.deduplicated.txt.gz.chr2D.methXtractor.temp
CpG_context_11.deduplicated.txt.gz.chrUn.methXtractor.temp
CpG_context_11.deduplicated.txt.gz.chr5D.methXtractor.temp
CpG_context_11.deduplicated.txt.gz.chr7A.methXtractor.temp
CpG_context_11.deduplicated.txt.gz.chr1D.methXtractor.temp
Sorting input file CpG_context_11.deduplicated.txt.gz.chr1A.methXtractor.temp by positions (using -S of 70%)
Successfully deleted the temporary input file CpG_context_11.deduplicated.txt.gz.chr1A.methXtractor.temp
Sorting input file CpG_context_11.deduplicated.txt.gz.chr1B.methXtractor.temp by positions (using -S of 70%)
Missing alphabetical methylation call at /home/fk8jybr/Bismark-0.23.1/bismark2bedGraph line 562, <$ifh> line 60377208.
main::validate_methylation_call('+', undef) called at /home/fk8jybr/Bismark-0.23.1/bismark2bedGraph line 466
What did I miss? How can I resolve this error?
Hmm, the error message seems to indicate that the methylation call information for at least one line of the file CpG_context_11.deduplicated.txt.gz.chr1B.methXtractor.temp is truncated... Did you get any errors earlier, during the methylation call procedure? You could try to parse the input file to see if there is indeed something wrong.
Alternatively, instead of calling bismark2bedGraph on its own you could invoke it straight away during the methylation extraction using the flag --bed for the methylation extractor.
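One way to parse the input file, as suggested above, is to check that every line has the expected shape. This sketch assumes the usual methylation extractor layout of five tab-separated columns (read ID, strand as + or -, chromosome, position, single-letter methylation call) with an optional version header on the first line; the function name check_meth_calls is invented for the example.

```shell
# Hypothetical validator for methylation extractor output (.txt or .txt.gz).
# Prints every malformed line and returns non-zero if any were found.
check_meth_calls() {
    case "$1" in
        *.gz) gzip -dc "$1" ;;
        *)    cat "$1" ;;
    esac | awk -F '\t' '
        # Skip the version header that the extractor writes on line 1.
        NR == 1 && /^Bismark methylation extractor/ { next }
        # Flag lines with a missing column, bad strand, or bad call letter.
        NF != 5 || $2 !~ /^[+-]$/ || $5 !~ /^[zZxXhHuU]$/ {
            printf "line %d looks malformed: %s\n", NR, $0
            bad++
        }
        END { exit (bad > 0) }
    '
}

# Usage:
#   check_meth_calls CpG_context_11.deduplicated.txt.gz
```

A line that stops after the position column, which is what the (+, undef) arguments in the error above suggest, would be reported by this check.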
Thank you @FelixKrueger for your answer. There does seem to have been some kind of error in the CpG_context_11.deduplicated.txt.gz.chr1B.methXtractor.temp file. The error is not in the original file generated by the methylation call procedure. I use an HPC cluster, which seems to have caused the I/O error. The administrator restarted the HPC cluster and I re-ran the bedGraph process. The error now appears to be gone.
I would like to generate the CpG, CHG and CHH contexts as well, so I should use bismark2bedGraph instead of --bed in the methylation extractor.
Yes, if you wanted that you could run bismark2bedGraph on the CpG, CHH and CHG output from the methylation extractor (you should already have the CpG output):
bismark2bedGraph --gzip --CX -o CHG_out.cov.gz CHG*
bismark2bedGraph --gzip --CX -o CHH_out.cov.gz CHH*
I would like to align 150 bp paired-end reads with Bismark v0.23.1. I use an HPC cluster with SLURM. I got this error in the alignment process:
I think this is a memory issue, for example Bowtie running out of memory, but I am not sure. Is it a memory issue?
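To follow up on a suspected memory problem under SLURM, something like the sketch below may help. The function names are hypothetical. The first only searches a log for the signal 9 message quoted in this thread; the second calls sacct with standard accounting fields and assumes job accounting is enabled on the cluster. A SIGKILL is often, though not always, the result of a job exceeding its memory limit.

```shell
# Hypothetical helper: succeed if the log contains the Bowtie 2 kill message.
killed_by_signal9() {
    grep -qF 'died with signal 9 (KILL)' "$1"
}

# Hypothetical helper: compare requested vs. peak memory for a finished job.
# Requires SLURM accounting; $1 is the job ID.
show_job_memory() {
    sacct -j "$1" --format=JobID,State,ReqMem,MaxRSS,Elapsed
}

# Usage:
#   killed_by_signal9 slurm-12345.out && show_job_memory 12345
```

If MaxRSS is close to ReqMem, requesting more memory or lowering --parallel would be the first things to try.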