Closed byb121 closed 3 years ago
From one user:
I ran one of my BAM files (50 GB), but the job ran for over 100 hours without producing any output. I assigned 4 CPUs in my job script, and also ran another script with 8 CPUs assigned together with the argument "-p8" in the telomerecat command (see below), but that also ran over the time limit. Could you please advise me how to resolve this? Thanks very much.
#!/bin/bash
#PBS -l select=1:ncpus=8:mem=32G,walltime=100:00:00
telomerecat bam2length -p8 sample.sort.bam --output sample.csv
After 100 hours of running I see the following message in the error file:
OSError: truncated file
=>> PBS: job killed: walltime 360028 exceeded limit 360000
@eb32142, Could you subset your BAM file with a command like this:
samtools view -b -s 2.1 sample.sort.bam > ten_percent_of_sample.sort.bam
And run this command without using your PBS system:
telomerecat bam2length -p8 ten_percent_of_sample.sort.bam --output sample.csv -v 2
-v will output more detailed messages in your terminal. If it fails, could you post your terminal output here, please?
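A note on the `-s 2.1` argument, in case it looks like a typo: as far as the samtools documentation describes it, the integer part of the value is the random seed and the fractional part is the fraction of reads to keep, so `2.1` means seed 2 and roughly 10% of the reads. The sketch below only builds and prints the command; `subsample_arg` is a hypothetical helper for illustration, not part of samtools.

```shell
# Build the value for `samtools view -s` from a seed and a fraction.
# Hypothetical helper: the seed becomes the integer part and the digits
# after the decimal point of the fraction become the fractional part.
subsample_arg() {
    local seed="$1" fraction="$2"      # fraction written like 0.1 or 0.25
    printf '%s.%s\n' "$seed" "${fraction#*.}"
}

arg="$(subsample_arg 2 0.1)"           # seed 2, keep about 10% of reads

# Print the command rather than running it, so it can be inspected first.
echo "samtools view -b -s $arg sample.sort.bam > ten_percent_of_sample.sort.bam"
```

Changing the seed gives a different random subset of the same size, which can help rule out one unlucky sample.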
Thanks,
@byb121,
I am running this and my job is still running after 22 hours. The subset BAM file is 6.4 GB.
That does not sound right. Would you be able to kill the job and post the log here, or send it to my email if it's too long for this thread? Thanks.
Thanks @byb121,
I think the issue might be the CPU usage for the job. I am running this on an HPC cluster. Below is my job status, which shows very low CPU usage:
Job_Name = test2.sh
resources_used.cpupercent = 2
resources_used.cput = 01:02:22
resources_used.mem = 33554432kb
resources_used.ncpus = 8
resources_used.vmem = 33554432kb
resources_used.walltime = 35:45:19
job_state = R
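One rough way to quantify this is to compare the CPU time used (cput) with the CPU time that was reserved (walltime multiplied by ncpus). The following is a minimal bash sketch using the numbers from the status above; `to_seconds` is a hypothetical helper, and the ratio is only an approximation of how busy the reserved cores were.

```shell
# Convert an HH:MM:SS string (as reported by PBS) to seconds.
to_seconds() {
    local h m s
    IFS=: read -r h m s <<< "$1"
    # 10# forces base 10 so values like "08" are not parsed as octal
    echo $(( 10#$h * 3600 + 10#$m * 60 + 10#$s ))
}

cput=$(to_seconds "01:02:22")        # CPU time from the status above
wall=$(to_seconds "35:45:19")        # wall-clock time from the status above
ncpus=8

# Share of the reserved CPU time actually used, in hundredths of a percent.
eff=$(( cput * 10000 / (wall * ncpus) ))
printf 'CPU efficiency: %d.%02d%%\n' $(( eff / 100 )) $(( eff % 100 ))
```

With these values the job used well under 1% of the CPU time it reserved, which fits a process that is waiting or stuck rather than computing.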
Could you use the script below to run your job?
#!/bin/bash
#PBS -l select=1:ncpus=8:mem=32G,walltime=100:00:00
# standard output file
#PBS -o job_parallel.log
# error output file
#PBS -e job_parallel.log
telomerecat bam2length -p8 ten_percent_of_sample.sort.bam --output sample.csv -v 2
Let it run for an hour and kill the job. Please then send me job_parallel.log via email or post it here if it's not hugely long.
I suspect telomerecat failed to manage all its threads in your computing environment. To test this, could you also run a job using this script?
#!/bin/bash
#PBS -l mem=16G,walltime=100:00:00
# standard output file
#PBS -o job.log
# error output file
#PBS -e job.log
telomerecat bam2length ten_percent_of_sample.sort.bam --output sample.csv -v 2
Again, let it run for an hour and kill the job. Please then send me job.log via email or post it here.
Thanks,
Hi Yaobo (@byb121),
Thanks for your response. I ran the first command and below is the log file. I get lots of "Could not retrieve index..." messages, followed by these last lines, which include the errors:
[E::idx_find_and_load] Could not retrieve index file for '/tmp/telomerecat_bam2length-cb7jynhu/ten_percent_of_sample.sort_telbam.bam'
[E::idx_find_and_load] Could not retrieve index file for '/tmp/telomerecat_bam2length-cb7jynhu/ten_percent_of_sample.sort_telbam.bam'
[E::idx_find_and_load] Could not retrieve index file for '/tmp/telomerecat_bam2length-cb7jynhu/ten_percent_of_sample.sort_telbam.bam'
[E::idx_find_and_load] Could not retrieve index file for '/tmp/telomerecat_bam2length-cb7jynhu/ten_percent_of_sample.sort_telbam.bam'
[E::bgzf_read] Read block operation failed with error 2 after 0 of 4 bytes
[E::bgzf_read] Read block operation failed with error 2 after 0 of 4 bytes
[E::bgzf_read] Read block operation failed with error 2 after 0 of 4 bytes
Process Task-10:2:
Traceback (most recent call last):
  File "/home/em924/anaconda3/envs/telomerecat/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "parabam/core.pyx", line 291, in parabam.core.Task.run
  File "parabam/core.pyx", line 311, in parabam.core.Task.__generate_results
  File "parabam/command/core.pyx", line 50, in parabam.command.core.Task.process_task_set
  File "pysam/libcalignmentfile.pyx", line 2187, in pysam.libcalignmentfile.IteratorRowAll.__next__
OSError: truncated file
Thanks.
Sent by email in reply to Yaobo Xu's comment of Fri, Jun 26, 2020, 8:39 PM.
--
Ebrahim Mahmoudi, PhD Candidate, Medical Genetics, University of Newcastle, Australia
Hi Yaobo (@byb121),
Could it be an issue with samtools?
It's complaining it can't find the index file. Could you run this first before you submit your PBS scripts:
samtools index ten_percent_of_sample.sort.bam
Then again, let it run for about an hour and please post any error here. Thanks.
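Before resubmitting, it may also be worth checking that the input BAM itself is intact as well as indexed. The sketch below rests on a few assumptions: `has_index` is a hypothetical helper, and the integrity check uses `samtools quickcheck`, which only looks at the header and the end-of-file block rather than every record. Note that the messages earlier in this thread name a `_telbam.bam` file under /tmp, which appears to be an intermediate file written by telomerecat, so indexing the input may not make them go away.

```shell
# Report whether a BAM file has an index next to it
# (either file.bam.bai or file.bai). Hypothetical helper.
has_index() {
    local bam="$1"
    [ -f "${bam}.bai" ] || [ -f "${bam%.bam}.bai" ]
}

bam="ten_percent_of_sample.sort.bam"

# Only run the checks when the file and samtools are actually available.
if [ -f "$bam" ] && command -v samtools >/dev/null 2>&1; then
    # Create the index only if it is missing.
    has_index "$bam" || samtools index "$bam"
    # quickcheck exits non-zero (and -v names the file) if it looks truncated.
    samtools quickcheck -v "$bam" && echo "$bam looks intact"
fi
```

If quickcheck reports a problem with the subset, the subsampling step itself was probably interrupted and should be repeated.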
Thanks @byb121,
I made the index and then ran telomerecat again, but got exactly the same error as before:
[E::idx_find_and_load] Could not retrieve index file for '/tmp/telomerecat_bam2length-3yrilcal/ten_percent_of_sample.sort_telbam.bam'
[E::idx_find_and_load] Could not retrieve index file for '/tmp/telomerecat_bam2length-
[E::bgzf_read] Read block operation failed with error 2 after 0 of 4 bytes
[E::bgzf_read] Read block operation failed with error 2 after 0 of 4 bytes
Process Task-12:2:
Traceback (most recent call last):
  File "/home/em924/anaconda3/envs/telomerecat/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "parabam/core.pyx", line 291, in parabam.core.Task.run
  File "parabam/core.pyx", line 311, in parabam.core.Task.__generate_results
  File "parabam/command/core.pyx", line 50, in parabam.command.core.Task.process_task_set
  File "pysam/libcalignmentfile.pyx", line 2187, in pysam.libcalignmentfile.IteratorRowAll.__next__
OSError: truncated file
Hi @eb32142, if you revert pysam to version 0.15.3, it may just run without a problem. Please let us know if that solves the problem.
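For anyone trying this, here is a small sketch of how the installed pysam version could be checked first. It assumes the telomerecat conda environment (the one visible in the traceback paths) is already activated and that `python` and `pip` refer to that environment; `installed_pysam` and `needs_pin` are hypothetical helper names. The script only prints the pip command and does not change anything.

```shell
target="0.15.3"   # pysam version suggested above

# Print the installed pysam version, or "none" if pysam cannot be imported.
installed_pysam() {
    python -c 'import pysam; print(pysam.__version__)' 2>/dev/null || echo "none"
}

# Succeed when the given version string differs from the target.
needs_pin() {
    [ "$1" != "$target" ]
}

current="$(installed_pysam)"
if needs_pin "$current"; then
    echo "pysam is $current; to pin it, run: pip install \"pysam==$target\""
else
    echo "pysam is already $target"
fi
```

Printing the command rather than running it leaves the choice between pip and conda to whoever maintains the environment.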
Closing as no response