Closed RSherman15 closed 5 years ago
BLASR often calls abort() instead of printing a useful error message before dying. That's why you see "114301 Aborted" in the output. So it could be crashing for any number of reasons.
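For anyone trying to tell this kind of crash apart from an ordinary non-zero exit, here is a minimal sketch (POSIX only). It does not run BLASR; a child Python process calling `os.abort()` stands in for any program that aborts, and `describe_exit` is a hypothetical helper name.

```python
import signal
import subprocess
import sys


def describe_exit(returncode: int) -> str:
    """Translate a subprocess return code into a short human-readable note."""
    if returncode < 0:
        # Python's subprocess module reports death-by-signal as a negative number.
        return f"killed by signal {signal.Signals(-returncode).name}"
    return f"exited with status {returncode}"


# Stand-in for a program that aborts (NOT blasr itself): the child calls os.abort().
child = subprocess.run(
    [sys.executable, "-c", "import os; os.abort()"],
    stderr=subprocess.DEVNULL,
)
print(describe_exit(child.returncode))
```

A shell prints "Aborted" for the same condition, which says only that the process received SIGABRT, not why.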
What kind of data are you aligning? Is this PacBio Sequel or RS II? Are they in PacBio subread BAM files, or something else?
Ah, I just re-read the documentation; I didn't realize I needed the raw subread bams or bax.h5 files. I'm downloading the data in the proper format and hopefully this will solve the problem. Thanks.
I am still unable to align, with the following error:
rule aln_align_batch:
input: align/batches/15.fofn, reference/ref.fasta, reference/ref.fasta.fai, reference/ref.fasta.sa
output: align/bam/15.bam, align/bam/15.bam.bai, align/bam/unaligned/15.fa.gz
log: align/bam/log/15.log
jobid: 7
wildcards: batch_id=15
Job counts:
count jobs
1 aln_align_batch
1
[Sun Feb 10 15:48:23 2019]
Error in rule aln_align_batch:
jobid: 0
output: align/bam/15.bam, align/bam/15.bam.bai, align/bam/unaligned/15.fa.gz
log: align/bam/log/15.log
RuleException:
CalledProcessError in line 80 of /home-4/rsherma8@jhu.edu/bin/packages/smrtsv2/rules/align.snakefile:
Command ' set -euo pipefail; ALIGN_BATCH_TEMP=/tmp/aln_align_batch_15; mkdir -p ${ALIGN_BATCH_TEMP}; echo "Aligning batch 15..." >align/bam/log/15.log; blasr align/batches/15.fofn reference/ref.fasta --unaligned align/bam/unaligned/15.fa --out ${ALIGN_BATCH_TEMP}/batch_out.bam --sam --sa reference/ref.fasta.sa --nproc 8 --clipping subread --bestn 2 --maxAnchorsPerPosition 100 --advanceExactMatches 10 --affineAlign --affineOpen 100 --affineExtend 0 --insertion 5 --deletion 5 --extend --maxExtendDropoff 50 >>align/bam/log/15.log 2>&1; echo "Sorting..." >>align/bam/log/15.log; samtools sort -@ 1 -m 4G -O bam -T ${ALIGN_BATCH_TEMP}/15 -o align/bam/15.bam ${ALIGN_BATCH_TEMP}/batch_out.bam >>align/bam/log/15.log 2>&1; echo "Indexing..." >>align/bam/log/15.log 2>&1; samtools index align/bam/15.bam >>align/bam/log/15.log 2>&1; echo "Compressing unaligned reads..." >>align/bam/log/15.log 2>&1; gzip align/bam/unaligned/15.fa >>align/bam/log/15.log 2>&1; echo "Cleaning temp \"${ALIGN_BATCH_TEMP}\"..." >>align/bam/log/15.log; rm -rf ${ALIGN_BATCH_TEMP}echo "Done aligning batch 15" >>align/bam/log/15.log; ' returned non-zero exit status 127.
File "/home-4/rsherma8@jhu.edu/bin/packages/smrtsv2/rules/align.snakefile", line 80, in __rule_aln_align_batch
File "/home-net/home-4/rsherma8@jhu.edu/bin/packages/smrtsv2/dep/conda/build/envs/python3/lib/python3.6/concurrent/futures/thread.py", line 55, in run
Exiting because a job execution failed. Look above for error message
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: /work-zfs/mschatz1/rsherman/smrtSVRuns/.snakemake/log/2019-02-10T154607.854314.snakemake.log
Failed to align reads
I am using bax.h5 reads for HG002 Genome in a Bottle data (ftp://ftp-trace.ncbi.nlm.nih.gov/giab/ftp/data/AshkenazimTrio/HG002_NA24385_son/PacBio_MtSinai_NIST/hdf5/sequence.index.HG002_PacBio_MtSinai_NIST_hdf5_10102018). Command options used were:
smrtsv.py run --threads 8 --species human --sample HG002 [ref] [bax.h5 reads fofn]
The one file in align/bam/log contains the following error message:
Aligning batch 15...
[INFO] 2019-02-10T15:46:22 [blasr] started.
blasr: symbol lookup error: /home-net/home-4/rsherma8@jhu.edu/bin/packages/smrtsv2/dep/conda/build/envs/pacbio/bin/../lib/libblasr.so.5.3.1: undefined symbol: _ZNK2H510H5Location15getObjnameByIdxEy
As per this issue, https://github.com/PacificBiosciences/pbbioconda/issues/11, downgrading hdf5 in the pacbio conda environment from 1.10.3 to 1.10.2 appears to have solved the issue (though it hasn't run to completion yet, so we'll see). Leaving this open, though, since it seems to be an issue with the dependency build.
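The undefined symbol in that log demangles to `H5::H5Location::getObjnameByIdx(unsigned long long) const`, which is part of the HDF5 C++ API, so it is consistent with an HDF5 version mismatch. A rough sketch for pulling the library and symbol out of a batch log follows; `find_symbol_error` is a hypothetical helper, and the regular expression assumes the message format shown above.

```python
import re
from typing import Optional, Tuple

# Matches the dynamic loader message format seen in align/bam/log/15.log.
SYMBOL_ERROR = re.compile(
    r"symbol lookup error: (?P<library>\S+): undefined symbol: (?P<symbol>\S+)"
)


def find_symbol_error(log_text: str) -> Optional[Tuple[str, str]]:
    """Return (library, symbol) for the first symbol lookup error, or None."""
    for line in log_text.splitlines():
        match = SYMBOL_ERROR.search(line)
        if match:
            return match.group("library"), match.group("symbol")
    return None


# Log text copied from the report above.
log = (
    "Aligning batch 15...\n"
    "[INFO] 2019-02-10T15:46:22 [blasr] started.\n"
    "blasr: symbol lookup error: /home-net/home-4/rsherma8@jhu.edu/bin/packages/"
    "smrtsv2/dep/conda/build/envs/pacbio/bin/../lib/libblasr.so.5.3.1: "
    "undefined symbol: _ZNK2H510H5Location15getObjnameByIdxEy\n"
)

result = find_symbol_error(log)
if result is not None:
    library, symbol = result
    print(f"library: {library}")
    print(f"symbol:  {symbol}")
```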
Thanks. I made a change to the build, and I'll be testing this soon.
Sounds good. It did not run to completion, though it ran for quite a bit longer than with the previous error. I now get the same CalledProcessError, with the following in align/bam/log/15.log:
Aligning batch 15...
[INFO] 2019-02-14T16:42:40 [blasr] started.
terminate called after throwing an instance of 'std::runtime_error'
what(): could not write record
I think I have seen that error if the file system fills up. Is there space left on the destination device?
Yes, this is a very large server with more than ample space and RAM.
It may be using a temporary directory on a partition that does not have enough space. At the top of the log file (above "Aligning batch 15..."), it will say ALIGN_BATCH_TEMP=... Check that location and make sure it's on a partition that has space.
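A quick way to check the space available where temporary files land, as a sketch only: it assumes the pipeline's temporary location lives under the system default temp directory (`/tmp` in the command above), and `free_gib` is a hypothetical helper name.

```python
import shutil
import tempfile


def free_gib(path: str) -> float:
    """Return the free space, in GiB, on the partition that holds `path`."""
    usage = shutil.disk_usage(path)
    return usage.free / 2**30


# gettempdir() honors TMPDIR, TEMP and TMP, then falls back to a platform default.
temp_root = tempfile.gettempdir()
print(f"{temp_root}: {free_gib(temp_root):.1f} GiB free")
```

To check a different location, pass the directory shown in ALIGN_BATCH_TEMP to `free_gib` instead.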
Thanks. Just FYI, the log file doesn't actually contain this info -- its first line is "Aligning batch 15". It does appear in the error message, though (in the CalledProcessError). This may be the problem, so thank you. I've reset my TEMP_DIR environment variable and am re-running. It might be nice to add a user option to set the temporary directory, or at least to mention in the documentation that it will write to /tmp. I never considered that this might be writing massive files to /tmp, since I specify an output directory.
Ah, I see now from issue #9 this is an argument of smrtsv, just not of "smrtsv run" which is how I missed it before (having just done smrtsv run -h to see options).
--tempdir is useful for pushing I/O-intensive processes to fast temporary storage and for keeping unnecessary I/O off of distributed filesystems (these temp files are never shared among rules). I updated the documentation to better describe global and per-step command-line options, and I added some information on temporary directories.
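As a sketch of where the option goes, assuming (as this thread indicates) that --tempdir is a global option given before the `run` subcommand. The paths and `build_smrtsv_command` are hypothetical placeholders; the other options are the ones used earlier in this issue.

```python
import shlex
from typing import List


def build_smrtsv_command(
    tempdir: str, reference: str, reads_fofn: str, threads: int = 8
) -> List[str]:
    """Assemble an smrtsv.py command line with the global option first."""
    return [
        "smrtsv.py",
        "--tempdir", tempdir,  # global option: placed before the subcommand (assumption)
        "run",
        "--threads", str(threads),
        "--species", "human",
        "--sample", "HG002",
        reference,
        reads_fofn,
    ]


# Hypothetical paths, for illustration only.
cmd = build_smrtsv_command("/scratch/tmp", "ref.fasta", "reads.fofn")
print(shlex.join(cmd))
```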
This perhaps follows from issue #5, if I messed something up by commenting out ref_ctab='reference/ref.fasta.ctab'. After I commented out that line, align seemed to begin to run, but then yielded an error shortly after: