cancerit / telomerecat

Telomerecat: The telomere computational analysis tool
GNU General Public License v3.0

Telomerecat hangs #23

Closed: keiranmraine closed this issue 3 years ago

keiranmraine commented 3 years ago

Hi Keiran,

I re-sorted my BAM file using samtools sort -o 115234_sorted.bam 115234.bam and re-indexed the resulting *_sorted.bam, in case there was something wrong with my lossless BAM file.

This produces a 115234_sorted_telbam.bam of 11.2 GB.

However, the telbam-to-length step hangs and produces no result, even after 12 hours. With the Python 2 version, this step took about 30 minutes:

/etc/profile.d/hpc-login.sh: line 37: export: `234': not a valid identifier
[E::idx_find_and_load] Could not retrieve index file for '/scratch/wtccc/psb7/telbams_output_p3/115234_sorted_telbam.bam'
[E::idx_find_and_load] Could not retrieve index file for '/scratch/wtccc/psb7/telbams_output_p3/115234_sorted_telbam.bam'
[E::idx_find_and_load] Could not retrieve index file for '/scratch/wtccc/psb7/telbams_output_p3/115234_sorted_telbam.bam'
[E::idx_find_and_load] Could not retrieve index file for '/scratch/wtccc/psb7/telbams_output_p3/115234_sorted_telbam.bam'
[E::idx_find_and_load] Could not retrieve index file for '/scratch/wtccc/psb7/telbams_output_p3/115234_sorted_telbam.bam'
[E::idx_find_and_load] Could not retrieve index file for '/scratch/wtccc/psb7/telbams_output_p3/115234_sorted_telbam.bam'
[E::idx_find_and_load] Could not retrieve index file for '/scratch/wtccc/psb7/telbams_output_p3/115234_sorted_telbam.bam'
[E::bgzf_read] Read block operation failed with error 2 after 0 of 4 bytes
[E::idx_find_and_load] Could not retrieve index file for '/scratch/wtccc/psb7/telbams_output_p3/115234_sorted_telbam.bam'
[E::bgzf_read] Read block operation failed with error 2 after 0 of 4 bytes
[E::idx_find_and_load] Could not retrieve index file for '/scratch/wtccc/psb7/telbams_output_p3/115234_sorted_telbam.bam'
[E::bgzf_read] Read block operation failed with error 2 after 0 of 4 bytes
Process Task-3:2:
Traceback (most recent call last):
  File "/home/p/psb7/miniconda3/envs/Telomere/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "parabam/core.pyx", line 291, in parabam.core.Task.run
  File "parabam/core.pyx", line 311, in parabam.core.Task.__generate_results
  File "parabam/command/core.pyx", line 50, in parabam.command.core.Task.process_task_set
  File "pysam/libcalignmentfile.pyx", line 2187, in pysam.libcalignmentfile.IteratorRowAll.__next__
OSError: truncated file
=>> PBS: job killed: walltime 43206 exceeded limit 43200

Originally posted by @duran72 in https://github.com/cancerit/telomerecat/issues/20#issuecomment-787808939
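The log above reports both a missing index and a "truncated file" error, so it may be worth ruling out a damaged telbam before looking at the hang itself. The following is a minimal sketch using only the Python standard library, assuming that "truncated" here means the file lacks the 28-byte empty BGZF block that a complete BAM file ends with, and that any index sits next to the BAM as a .bai file. The helper names has_bgzf_eof and find_index are hypothetical and are not part of telomerecat, parabam or pysam.

```python
import os

# The 28-byte empty BGZF block that terminates a complete BAM file
# (the end-of-file marker defined in the SAM/BAM specification).
BGZF_EOF = bytes.fromhex(
    "1f8b0804" "00000000" "00ff0600" "42430200"
    "1b000300" "00000000" "00000000"
)


def has_bgzf_eof(path):
    """Return True if the file ends with the BGZF end-of-file marker."""
    if os.path.getsize(path) < len(BGZF_EOF):
        return False
    with open(path, "rb") as handle:
        # Read only the final 28 bytes; the file itself may be many GB.
        handle.seek(-len(BGZF_EOF), os.SEEK_END)
        return handle.read() == BGZF_EOF


def find_index(bam_path):
    """Return the path of a .bai index beside the BAM, or None if absent.

    Assumption: the index is named either <name>.bam.bai or <name>.bai.
    """
    candidates = (bam_path + ".bai", os.path.splitext(bam_path)[0] + ".bai")
    for candidate in candidates:
        if os.path.exists(candidate):
            return candidate
    return None
```

If has_bgzf_eof returns False for the telbam, the file was probably cut short while being written and would need regenerating. A missing marker is only one possible cause of the error, so a True result does not prove the file is intact.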

keiranmraine commented 3 years ago

Likely linked to #15

keiranmraine commented 3 years ago

I'm fairly sure this is caused by multiprocessing using the fork start method. The Python documentation warns:

The fork start method should be considered unsafe as it can lead to crashes of the subprocess. See bpo-33725.

I've tried switching to spawn, but the parabam library does not use pysam in a thread-safe way, so this isn't possible (and it adds additional complication when running under Docker). I'll give forkserver a try, but it may run into the same issues with pysam.
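For readers unfamiliar with the start methods mentioned above, here is a minimal sketch of how one can be selected explicitly with the standard library's multiprocessing.get_context. It is not parabam's code: pick_start_method, count_bytes and run_pool are hypothetical names, and the worker simply reads a file's size. It illustrates one assumption about what makes spawn or forkserver workable, namely that each worker opens its own file handle and does not rely on state inherited from the parent through fork.

```python
import multiprocessing as mp


def pick_start_method(preferred=("forkserver", "spawn", "fork")):
    """Return the first start method in `preferred` that this platform supports."""
    available = mp.get_all_start_methods()
    for method in preferred:
        if method in available:
            return method
    raise RuntimeError(f"none of {preferred} is available; platform has {available}")


def count_bytes(path):
    """Worker: open the file inside the child, not via an inherited handle."""
    with open(path, "rb") as handle:
        return len(handle.read())


def run_pool(paths, method=None, workers=2):
    """Map count_bytes over `paths` in a pool built from an explicit context."""
    # get_context returns an object with the same API as the multiprocessing
    # module, bound to one start method, without changing the global default.
    ctx = mp.get_context(method or pick_start_method())
    with ctx.Pool(processes=workers) as pool:
        return pool.map(count_bytes, paths)
```

With spawn or forkserver the worker function and its arguments must be picklable, and run_pool should be called from under an if __name__ == "__main__": guard, because child processes re-import the main module.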

keiranmraine commented 3 years ago

See v4.0.0.