databio / pepatac

A modular, containerized pipeline for ATAC-seq data processing
http://pepatac.databio.org
BSD 2-Clause "Simplified" License

signal 11 error with Bowtie2 hg38 alignment #300

Open oliviacwhite opened 5 days ago

oliviacwhite commented 5 days ago

I am trying to run PEPATAC using conda. I am currently at the looper run examples/test_project/test_config_refgenie.yaml step, which runs fine until it reaches the bowtie2 alignment to the hg38 genome, where it fails with a signal 11 error (output attached below). I run this command as a Slurm job, so I don't think it is a memory availability issue. My only guess is that something is wrong with the bowtie2_index, which I pulled from refgenie using refgenie pull hg38/bowtie2_index. The reason I suspect this is that every time I pulled it, the download reached 100% and then reported that it was killed. The index looks like it downloaded correctly, since the .bt2 files are in the directory, but something must be going wrong that I don't understand. Given the signal 11 error, the only thing I can think to try is completely removing refgenie and PEPATAC and starting over, but I am worried about running into the same issues. Any help is greatly appreciated!

bowtie2 -p 4 --very-sensitive -X 2000 --rg-id test1 -x /path/to/refgenie_genomes/alias/hg38/bowtie2_index/default/2230c535660fb4774114bfa966a62f823fdb6d21acf138d4 -1 /path/to/pepatac/pepatac_test/results_pipeline/test1/prealignments/test1_rCRSd_unmap_R1.fq.gz -2 /path/to/pepatac/pepatac_test/results_pipeline/test1/prealignments/test1_rCRSd_unmap_R2.fq.gz | samtools view -bS - -@ 1 | samtools sort - -@ 1 -T /path/to/pepatac/pepatac_test/results_pipeline/test1/aligned_hg38/tmph3t1gkb8 -o /path/to/pepatac/pepatac_test/results_pipeline/test1/aligned_hg38/test1_temp.bam (2971903,2971904,2971905)

(ERR): bowtie2-align died with signal 11 (SEGV) 
[main_samview] fail to read the header from "-".
samtools sort: failed to read header from "-"

Command completed. Elapsed time: 0:00:00. Running peak memory: 0.103GB.
PID: 2971905; Command: samtools; Return code: 1; Memory used: 0.0GB
PID: 2971903; Command: bowtie2; Return code: 1; Memory used: 0.009GB
PID: 2971904; Command: samtools; Return code: 1; Memory used: 0.002GB

Child process 2971879 (perl) was already terminated. Starting cleanup: 3 files; 3 conditional files for cleanup

Cleaning up flagged intermediate files. . .

Conditional flag found: []

These conditional files were left in place:

Pipeline failed at: (11-18 12:51:19) elapsed: 0.0 TIME

Total time: 0:00:06
Failure reason: Subprocess returned nonzero result. Check above output for details

Traceback (most recent call last):
  File "/path/to/pepatac/pipelines/pepatac.py", line 2779, in <module>
    sys.exit(main())
  File "/path/to/pepatac/pipelines/pepatac.py", line 1112, in main
    pm.run([cmd, cmd2], rmdup_bam, follow=check_alignment_genome)
  File "/path/to/miniconda3/envs/pepatac/lib/python3.9/site-packages/pypiper/manager.py", line 1036, in run
    list_ret, maxmem = self.callprint(
  File "/path/to/miniconda3/envs/pepatac/lib/python3.9/site-packages/pypiper/manager.py", line 1316, in callprint
    self._triage_error(SubprocessError(msg), nofail)
  File "/path/to/miniconda3/envs/pepatac/lib/python3.9/site-packages/pypiper/manager.py", line 2539, in _triage_error
    self.fail_pipeline(e)
  File "/path/to/miniconda3/envs/pepatac/lib/python3.9/site-packages/pypiper/manager.py", line 2009, in fail_pipeline
    raise exc
pypiper.exceptions.SubprocessError: Subprocess returned nonzero result. Check above output for details

donaldcampbelljr commented 5 days ago

Based on the error message:

(ERR): bowtie2-align died with signal 11 (SEGV) 
[main_samview] fail to read the header from "-".
samtools sort: failed to read header from "-"

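This leads me to believe that the problem is in the step preceding | samtools view -bS - -@ 1 | samtools sort - -@ 1 -T, that is, the bowtie2 command itself. The two samtools errors are most likely just a consequence of bowtie2 crashing before it wrote any output for them to read.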

You could run the first part of the command manually and see if it offers further insight:

bowtie2 -p 4  --very-sensitive  -X 2000  --rg-id test1 -x /path/to/refgenie_genomes/alias/hg38/bowtie2_index/default/2230c535660fb4774114bfa966a62f823fdb6d21acf138d4 -1 /path/to/pepatac/pepatac_test/results_pipeline/test1/prealignments/test1_rCRSd_unmap_R1.fq.gz -2 /path/to/pepatac/pepatac_test/results_pipeline/test1/prealignments/test1_rCRSd_unmap_R2.fq.gz
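If it is easier to script that check, here is a minimal sketch (not something PEPATAC provides) that runs a single stage of the pipe on its own, discards its standard output, and reports how it exited. The helper names `describe_returncode`, `run_stage`, and `main` are hypothetical. One caveat: the `bowtie2` launcher is a wrapper script around `bowtie2-align`, so a crash in the aligner may show up as return code 1 plus the `(ERR)` line on stderr rather than as a raw signal.

```python
import signal
import subprocess
import sys


def describe_returncode(returncode):
    """Translate a subprocess return code into a readable description."""
    if returncode == 0:
        return "exited normally"
    if returncode < 0:
        # subprocess reports death-by-signal as the negated signal number.
        number = -returncode
        try:
            name = signal.Signals(number).name
        except ValueError:
            name = "unknown signal"
        return f"killed by signal {number} ({name})"
    return f"exited with error code {returncode}"


def run_stage(cmd):
    """Run one stage of a shell pipeline by itself.

    Standard output is discarded (for an aligner it would be the SAM
    stream); standard error is captured so the stage's own message is not
    mixed with errors from downstream tools.
    """
    result = subprocess.run(
        cmd,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.PIPE,
        text=True,
        check=False,
    )
    return result.returncode, result.stderr


def main(argv):
    """Run the command given as a list of arguments and summarize its exit."""
    code, stderr = run_stage(argv)
    print(stderr, file=sys.stderr, end="")
    print(describe_returncode(code))
    return code
```

To use it, call `main([...])` with the bowtie2 command above split into a list of arguments (`"bowtie2", "-p", "4", "--very-sensitive", ...`). Whatever it prints on stderr comes from the alignment step alone, which makes it easier to tell an index problem from an input-file problem.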

I do not believe you need to reinstall PEPATAC. However, given that the above command does rely on genome assets via refgenie, it may be worth clearing them and re-pulling them using the instructions here: https://pepatac.databio.org/en/latest/run-conda/
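Before clearing anything, a quick sanity check of the index directory can tell you whether a re-pull is likely to help. The sketch below is only an illustration with a hypothetical helper name, `check_bowtie2_index`. It relies on the fact that a standard bowtie2 index consists of six files sharing one prefix, and it flags any that are missing or empty. Passing this check does not prove the files are intact, since a truncated file would still pass.

```python
from pathlib import Path

# Suffixes that bowtie2-build writes for a standard ("small") index.
# Large indexes use the same names with a ".bt2l" extension instead.
SMALL_INDEX_SUFFIXES = (
    ".1.bt2",
    ".2.bt2",
    ".3.bt2",
    ".4.bt2",
    ".rev.1.bt2",
    ".rev.2.bt2",
)


def check_bowtie2_index(prefix, large=False):
    """Return a list of problems found for the index at `prefix`.

    `prefix` is the value passed to `bowtie2 -x`, i.e. the path without
    any of the index suffixes. An empty list means all six files exist
    and are non-empty; it does not guarantee they are complete.
    """
    problems = []
    for suffix in SMALL_INDEX_SUFFIXES:
        if large:
            suffix += "l"
        path = Path(str(prefix) + suffix)
        if not path.is_file():
            problems.append(f"missing: {path}")
        elif path.stat().st_size == 0:
            problems.append(f"empty: {path}")
    return problems
```

Here `prefix` would be the `-x` path from the failing command. If the list comes back non-empty, or if it is empty but bowtie2 still crashes, re-pulling with refgenie pull hg38/bowtie2_index is the next thing to try. As a stricter check, `bowtie2-inspect -s` on the same prefix should print a summary of the index if it is readable.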