Closed radwaraed closed 3 years ago
I'm getting a similar error. Many lines of variations on this error:
[E::idx_find_and_load] Could not retrieve index file for '/tmp/telomerecat_bam2length-dljprax2/37240_16_1_MMb8vb7_.bam'
And then eventually this one:
[E::bgzf_read] Read block operation failed with error 2 after 0 of 4 bytes
And finally this one:
Process Task-10:2:
Traceback (most recent call last):
  File "/home/vhanlon/miniconda2/envs/py3/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "parabam/core.pyx", line 291, in parabam.core.Task.run
  File "parabam/core.pyx", line 311, in parabam.core.Task.__generate_results__
  File "parabam/command/core.pyx", line 50, in parabam.command.core.Task.__process_task_set__
  File "pysam/libcalignmentfile.pyx", line 2187, in pysam.libcalignmentfile.IteratorRowAll.__next__
OSError: truncated file
I've tried running with BAM files from a variety of relative and absolute paths, also setting a different tmp directory using --temp_dir. I've also tried running this on two different servers, but with the same result.
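For anyone who wants to rule out a damaged input before rerunning, here is a minimal sketch using only the Python standard library. It checks two things the messages above hint at: whether the BAM ends with the 28-byte BGZF end-of-file block (a missing block usually means the file was cut short), and whether an index file sits next to it. The function names and the list of index file names tried are my own assumptions for illustration and are not part of telomerecat, parabam or pysam.

```python
import os

# The empty BGZF block that a complete BAM file ends with (28 bytes,
# as given in the SAM/BAM specification).
BGZF_EOF = bytes.fromhex(
    "1f8b0804" "00000000" "00ff0600" "42430200"
    "1b000300" "00000000" "00000000"
)


def has_bgzf_eof(path):
    """Return True if the file ends with the BGZF end-of-file marker."""
    if os.path.getsize(path) < len(BGZF_EOF):
        return False
    with open(path, "rb") as handle:
        # Jump to the last 28 bytes and compare them with the marker.
        handle.seek(-len(BGZF_EOF), os.SEEK_END)
        return handle.read() == BGZF_EOF


def find_index(path):
    """Return the first index file found next to the BAM, or None.

    The candidate names below are an assumption about common conventions.
    """
    candidates = [
        path + ".bai",
        os.path.splitext(path)[0] + ".bai",
        path + ".csi",
    ]
    for candidate in candidates:
        if os.path.isfile(candidate):
            return candidate
    return None


def report(path):
    """Print a short summary of both checks for one BAM file."""
    print(path)
    print("  ends with BGZF EOF marker:", has_bgzf_eof(path))
    print("  index file:", find_index(path) or "none found")
```

Calling report() on the input BAM and on the file named in the error message would show whether either one lacks the EOF marker. A file that passes both checks can still be unreadable for other reasons, so this only narrows things down.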
Hi, I am also seeing the same issue.
My files are lossless BAM files from a NovaSeq 6000.
My error file for the bam2telbam step shows about 25,000 lines like these:
[E::idx_find_and_load] Could not retrieve index file for '/scratch/wtccc/psb7/tmp_telbams_p3/telomerecat_bam2telbam-b306o1wu/chaser_31801_0_115302.bam'
[E::idx_find_and_load] Could not retrieve index file for '/scratch/wtccc/psb7/tmp_telbams_p3/telomerecat_bam2telbam-b306o1wu/31792_1_0XX0v0-0.bam'
[E::idx_find_and_load] Could not retrieve index file for '/scratch/wtccc/psb7/tmp_telbams_p3/telomerecat_bam2telbam-b306o1wu/31792_2_0MMb14vb7-0.bam'
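With tens of thousands of near-identical lines, it can help to count the messages by type so that a rarer error such as bgzf_read does not get lost among them. A small sketch, assuming the messages keep the [E::function_name] prefix seen in this thread; the function name summarise_htslib_messages is made up for illustration.

```python
import re
from collections import Counter

# Matches prefixes such as "[E::idx_find_and_load]" or "[W::some_function]".
HTSLIB_TAG = re.compile(r"\[([EW])::(\w+)\]")


def summarise_htslib_messages(lines):
    """Count log messages by (level, function name).

    `lines` is any iterable of strings, for example an open error file.
    A line that contains several fused messages contributes one count each.
    """
    counts = Counter()
    for line in lines:
        for level, function in HTSLIB_TAG.findall(line):
            counts[(level, function)] += 1
    return counts
```

Passing an open file handle for the job's error file returns a Counter, and its most_common() method lists the message types in order of frequency.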
I still created a telbam file of size ~6MB
My error file for the telbam2length step shows these lines:
[E::idx_find_and_load] Could not retrieve index file for '/scratch/wtccc/psb7/telbams_output_p3/115425_telbam.bam'
[E::bgzf_read] Read block operation failed with error 2 after 0 of 4 bytes
Process Task-3:1:
Traceback (most recent call last):
  File "/home/p/psb7/miniconda3/envs/Telomere/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "parabam/core.pyx", line 291, in parabam.core.Task.run
  File "parabam/core.pyx", line 311, in parabam.core.Task.__generate_results__
  File "parabam/command/core.pyx", line 50, in parabam.command.core.Task.__process_task_set__
  File "pysam/libcalignmentfile.pyx", line 2187, in pysam.libcalignmentfile.IteratorRowAll.__next__
OSError: truncated file
This is a warning from pysam. There is a workaround but it has no impact on the processing:
Hi Keiran, I agree with you. I googled the error and saw that pysam reports this as a warning too. However, although I am able to generate a telbam file using bam2telbam, I'm unable to generate telbam2length output, and I'm not sure what I'm doing wrong. I'm running this command:
/home/miniconda3/envs/Telomere/bin/telomerecat telbam2length -v2 /scratch/telbams_output_p3/115_telbam.bam --temp_dir /scratch/wtccc/psb7/tmp_length_p3 --output /scratch/telbam_to_length_output_p3/p3_length_est_115.csv
Many thanks
Hi Keiran,
I re-sorted my BAM file using samtools sort -o 115234_sorted.bam 115234.bam and re-indexed the resulting *_sorted.bam, in case there was something wrong with my lossless BAM file.
I get a 115234_sorted_telbam.bam of size 11.2GB
However, the telbam to length step seems to hang and does not generate a result, even after 12 hours. With the python2 version this took about 30 minutes:
/etc/profile.d/hpc-login.sh: line 37: export: `234': not a valid identifier
[E::idx_find_and_load] Could not retrieve index file for '/scratch/wtccc/psb7/telbams_output_p3/115234_sorted_telbam.bam'
[E::idx_find_and_load] Could not retrieve index file for '/scratch/wtccc/psb7/telbams_output_p3/115234_sorted_telbam.bam'
[E::idx_find_and_load] Could not retrieve index file for '/scratch/wtccc/psb7/telbams_output_p3/115234_sorted_telbam.bam'
[E::idx_find_and_load] Could not retrieve index file for '/scratch/wtccc/psb7/telbams_output_p3/115234_sorted_telbam.bam'
[E::idx_find_and_load] Could not retrieve index file for '/scratch/wtccc/psb7/telbams_output_p3/115234_sorted_telbam.bam'
[E::idx_find_and_load] Could not retrieve index file for '/scratch/wtccc/psb7/telbams_output_p3/115234_sorted_telbam.bam'
[E::idx_find_and_load] Could not retrieve index file for '/scratch/wtccc/psb7/telbams_output_p3/115234_sorted_telbam.bam'
[E::bgzf_read] Read block operation failed with error 2 after 0 of 4 bytes
[E::idx_find_and_load] Could not retrieve index file for '/scratch/wtccc/psb7/telbams_output_p3/115234_sorted_telbam.bam'
[E::bgzf_read] Read block operation failed with error 2 after 0 of 4 bytes
[E::idx_find_and_load] Could not retrieve index file for '/scratch/wtccc/psb7/telbams_output_p3/115234_sorted_telbam.bam'
[E::bgzf_read] Read block operation failed with error 2 after 0 of 4 bytes
Process Task-3:2:
Traceback (most recent call last):
  File "/home/p/psb7/miniconda3/envs/Telomere/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "parabam/core.pyx", line 291, in parabam.core.Task.run
  File "parabam/core.pyx", line 311, in parabam.core.Task.__generate_results__
  File "parabam/command/core.pyx", line 50, in parabam.command.core.Task.__process_task_set__
  File "pysam/libcalignmentfile.pyx", line 2187, in pysam.libcalignmentfile.IteratorRowAll.__next__
OSError: truncated file
=>> PBS: job killed: walltime 43206 exceeded limit 43200
@duran72 I believe your problem is unrelated to the original intent of this issue. However, do you have sufficient available disk space? The error specifically states that the file being read has been truncated. It's not clear which file, but I suspect it is the intermediate file, which would point to the --temp_dir area being exhausted.
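A quick way to check this from Python is shutil.disk_usage from the standard library. The sketch below is only illustrative: the helper names are hypothetical, and on shared HPC scratch areas a per-user quota can be exhausted even when the filesystem itself reports free space, so the number is a hint rather than proof.

```python
import shutil


def free_gib(path):
    """Return the free space, in GiB, on the filesystem that holds `path`."""
    usage = shutil.disk_usage(path)
    return usage.free / 1024 ** 3


def has_room(path, required_gib):
    """Return True if at least `required_gib` GiB are free at `path`."""
    return free_gib(path) >= required_gib
```

Running free_gib() on the directory given to --temp_dir before and during a run would show whether the space is shrinking towards zero while the intermediate files are written.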
This may not be the cause; it's difficult to determine when the process is killed. Please note that the original developer of this tool is no longer providing support. We are attempting to improve the code and a small hotfix will be released very soon, but I don't think it will affect the issue above.
Hi Keiran,
Thanks for attempting to address my problem. Disc space shouldn't be a problem, as everything is being run on our University Scratch area, which has plenty of space. I managed to get results when I used python2, but when I re-ran the telbam to length script I got a different length result each time, and many of my samples gave telomere lengths that were too short, which didn't compare favourably with what I found using telseq and computel. So I thought I'd use the newer python3 version to see what results that gave.
As I understand it, to get the same answer each time you need to set --seed_randomness. Additionally, specifying -t 75 is known to give more consistent results that are comparable to other tools such as telseq... this is knowledge I've picked up from our scientific staff this week.
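To keep reruns consistent, the command can be assembled in one place. The sketch below builds the argument list from the options mentioned in this thread (-v2, --temp_dir, --output, --seed_randomness, -t 75). The function name and the input path are hypothetical, and treating --seed_randomness as a switch that takes no value is an assumption, so please check it against the tool's own help output before relying on it.

```python
import shlex


def build_telbam2length_command(telbam, output_csv, temp_dir=None,
                                seed=True, trim=75):
    """Return the telbam2length command as a list of arguments.

    Only options quoted in this thread are used. Assumption: --seed_randomness
    is a switch without a value, and -t takes an integer.
    """
    command = ["telomerecat", "telbam2length", "-v2", telbam,
               "--output", output_csv]
    if temp_dir:
        command += ["--temp_dir", temp_dir]
    if seed:
        command.append("--seed_randomness")
    if trim is not None:
        command += ["-t", str(trim)]
    return command


if __name__ == "__main__":
    # Hypothetical file names, shown only to illustrate the output format.
    example = build_telbam2length_command("sample_telbam.bam", "lengths.csv")
    print(shlex.join(example))
```

The returned list can be passed to subprocess.run, and printing it with shlex.join (Python 3.8 or later) gives a line that can be pasted into a job script.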
The typos and docs updates have been committed. The remaining part of this conversation has been moved to a new issue.
Hi, I am trying to use telomerecat on a bamfile (whose index is in the same path), yet I am getting lines & lines of "Could not retrieve index file", followed by "telomerecat stopped unexpecedtly". ValueError: file does not contain alignment data
I would also recommend fixing the typo: unexpecedtly > unexpectedly :) but that is minor.