jgurtowski / ectools

tools for error correction and working with long read data
BSD 3-Clause "New" or "Revised" License

Correct.sh -- Illegal instruction error during pb_correct.py #5

Closed vsuryaw closed 7 years ago

vsuryaw commented 7 years ago

I'm trying to run ectools to correct PacBio reads (about 1.1 Gb) for a small microbial genome. Our cluster uses PBS/Torque and has no SGE scheduler, so I partitioned the data into 21 directories with ~500 files each. When I run the correct.sh script, it successfully creates cor.fa files in about half of the directories. In the other directories, however, the correct.sh script fails at the final step --

`+ /home/linuxbrew/bin/python /home/software/ectools/pb_correct.py p0015 p0015.delta.snps p0015.delta.r.sc 0.96 3000 p0015`

`../correct.sh: line 97: 17545 Illegal instruction $python ${CORRECT_SCRIPT} ${FILE} ${FILTERED_DELTA}.snps ${FILTERED_DELTA}.r.sc ${CLR_PCT_ID} ${MIN_READ_LEN} ${FILE}`

I tried to re-run the correct.sh script in these folders after removing all the intermediate delta files etc. but it fails at the same step again and again. I'm unable to wrap my head around this issue, given that the very same correct.sh script works for other folders/data partitions.

Can you kindly help me in resolving this issue? Here is the command I'm giving to run the correct.sh script --

for j in {1..500}; do echo "SGE_TASK_ID=$j TMPDIR=/tmp ../correct.sh"; done | parallel -j 16
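
With 500 tasks per directory, it can help to record the exit status of every task so the failing ones are easy to list. The following is a minimal sketch, not part of ectools: `run_logged` and the log file name `status.tsv` are hypothetical, and the thread does not say whether correct.sh passes on the status of its failing step, so the logged value may only be a hint.

```shell
# Hypothetical helper (not part of ectools): run a command and append
# "<label><TAB><exit status>" to a log file, then return that status.
run_logged() {
    local label=$1 log=$2
    shift 2
    "$@"                      # run the remaining arguments as the command
    local status=$?
    printf '%s\t%d\n' "$label" "$status" >> "$log"
    return "$status"
}
```

A sequential run over one directory could then look like `for j in {1..500}; do run_logged "$j" status.tsv env SGE_TASK_ID="$j" TMPDIR=/tmp ../correct.sh; done`, after which `awk '$2 != 0' status.tsv` lists the tasks that did not exit cleanly.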

jgurtowski commented 7 years ago

It's unclear which program is at fault from the limited debugging output. It sounds like you divided the reads up into pretty small partitions. If only a few partitions are not completing you could just leave those out of the analysis. You will probably only lose a few reads?

Thanks, James

vsuryaw commented 7 years ago

Thanks, James, for your response. The trouble is that I'm losing 9 out of 21 directories this way. I can send you more debugging output if that would help resolve the issue. What other information should I post here?

jgurtowski commented 7 years ago

As much as you can; it's just unclear which program is throwing the error. You are not using PBS/Torque, correct? Just running the parallel command in each directory?

vsuryaw commented 7 years ago

I'm actually submitting short bash scripts to our cluster, which does use PBS/Torque; each script runs the parallel command in one directory. Here's the script for directory partition 0010 --

#!/bin/bash

#PBS -l nodes=1:ppn=16
#PBS -l walltime=6:00:00
#PBS -l mem=8000mb,vmem=8000mb
#PBS -q main

#change to dir with pacbio data partitions to correct
cd ~/workspace/ectools_correction/43P1S1/0010 

#run the correction.sh script
for j in {1..500}; do echo "SGE_TASK_ID=$j TMPDIR=/tmp ../correct.sh"; done | parallel -j 16

So out of the 21 directories (0001 - 0021) containing partitions of PB reads, about 9 don't get a single cor.fa file. I'm attaching the error file temp_0010.PBS.e.txt https://github.com/jgurtowski/ectools/files/717831/temp_0010.PBS.e.txt

Please let me know, if you need anything else.

jgurtowski commented 7 years ago

Can you run:

for j in {1..500}; do echo "SGE_TASK_ID=$j TMPDIR=/tmp ../correct.sh"; done | parallel -j 16

in one of the failing directories, but not through PBS? Just on a server by itself.

Thanks, James

vsuryaw commented 7 years ago

Yes, I have run that command within the directory manually and it gives me the same error. Here's a snippet --

4: FINISHING DATA
+ cp p0003.delta /staging/sn1/vasantis/Alex_Assembly/workspace/ectools_correction/43P1S1/0010
+ FILTERED_DELTA=p0003.delta
+ [[ false == true ]]
+ delta-filter -l 225 -i 70.0 -r p0003.delta
+ cp p0003.delta.r /staging/sn1/vasantis/Alex_Assembly/workspace/ectools_correction/43P1S1/0010
+ show-coords -l -H -r p0003.delta.r
+ cp p0003.delta.r.sc /staging/sn1/vasantis/Alex_Assembly/workspace/ectools_correction/43P1S1/0010
+ show-snps -H -l -r p0003.delta.r
+ cp p0003.delta.snps /staging/sn1/vasantis/Alex_Assembly/workspace/ectools_correction/43P1S1/0010
+ /home/cmb-07/sn1/vasantis/software/brew/linuxbrew/bin/python /home/cmb-07/sn1/vasantis/software/ectools/pb_correct.py p0003 p0003.delta.snps p0003.delta.r.sc 0.96 3000 p0003
../correct.sh: line 97: 16328 Illegal instruction     $python ${CORRECT_SCRIPT} ${FILE} ${FILTERED_DELTA}.snps ${FILTERED_DELTA}.r.sc ${CLR_PCT_ID} ${MIN_READ_LEN} ${FILE}

Best regards, Vasantika
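
For reference, "Illegal instruction" is the message the shell prints when a process is terminated by SIGILL (signal 4), which it reports as exit status 128 + 4 = 132. A small sketch of decoding such a status is shown here; the function name `describe_status` is hypothetical, and it relies only on the standard `kill -l <exit_status>` form.

```shell
# Hypothetical helper: describe a shell exit status.
# Statuses above 128 conventionally mean "terminated by signal (status - 128)".
describe_status() {
    local status=$1
    if [ "$status" -gt 128 ]; then
        # kill -l maps an exit status (or signal number) to a signal name
        echo "killed by SIG$(kill -l "$status")"
    else
        echo "exited with status $status"
    fi
}
```

Calling `describe_status $?` right after the failing `pb_correct.py` command would confirm that the interpreter was killed by a signal rather than exiting with an ordinary Python error.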

jgurtowski commented 7 years ago

What version of Python are you running?

vsuryaw commented 7 years ago

Hi, I'm running Python version 2.7.13

vsuryaw commented 7 years ago

Hi James,

I just wanted to let you know that this seems to have been a technical issue with our HPC cluster configuration. When I run the commands on the head node, I do get the outputs in the directories that were failing earlier. I have no explanation yet for this bizarre behaviour of our cluster.

I thank you for your time in trying to troubleshoot this with me.

best regards!
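
The thread does not identify the root cause. One common cause of "Illegal instruction" that matches "works on the head node, fails elsewhere" is a binary built for CPU instructions that the machine running it does not support. Under that assumption (a guess, not something confirmed here), comparing CPU feature flags between nodes is a quick check. The sketch below assumes Linux x86 nodes with a `flags` line in `/proc/cpuinfo`; the function and file names are hypothetical.

```shell
# Hypothetical helpers for comparing CPU feature flags between two machines.

# Print a sorted, one-per-line list of CPU flags from a cpuinfo-style file
# (defaults to /proc/cpuinfo on the current machine).
cpu_flags() {
    local cpuinfo=${1:-/proc/cpuinfo}
    grep -m1 '^flags' "$cpuinfo" | cut -d: -f2 | tr ' ' '\n' | sed '/^$/d' | sort -u
}

# List flags present in the first flag file but missing from the second.
# Both inputs must be sorted, which cpu_flags guarantees.
missing_flags() {
    comm -23 "$1" "$2"
}
```

Running `cpu_flags > head.flags` on the head node and `cpu_flags > compute.flags` inside a job on a compute node, then `missing_flags head.flags compute.flags`, would list any instruction-set extensions (for example `avx2`) available only where Python runs successfully.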