alimanfoo / pysamstats

A fast Python and command-line utility for extracting simple statistics against genome positions based on sequence alignments from a SAM or BAM file.

Running pysamstats with input from /tmp #135

Closed by stewalsh04 2 years ago

stewalsh04 commented 2 years ago

Hi Pysamstats team,

I am attempting to integrate Pysamstats into my Common Workflow Language (CWL) pipeline. The pipeline copies files to /tmp, or makes symbolic links in /tmp, before running. The command the pipeline used was:

pysamstats --type variation --max-depth=1000000 --chromosome R9ref -f /tmp/tmpb21ic0sb/stgc4860349-f5d6-450b-89e6-1ae70aa374cc/R9ref.fasta /tmp/8pwpdhof/sorted.bam

It failed with the following error:

ValueError: no index available for pileup

I can confirm all files including the index .fai file were in the temporary folders.

After a lot of playing around, copying files into /tmp myself to troubleshoot, I realized the problem is not with the .fai file but with the bam file: whenever the bam file is in /tmp, the program crashes with the above error.

The command below worked ok with the bam file in /home:

pysamstats --type variation --max-depth=1000000 --chromosome R9ref -f /tmp/tmpb21ic0sb/stgc4860349-f5d6-450b-89e6-1ae70aa374cc/R9ref.fasta /home/steve/cwl/samtools/sorted.bam

But as soon as I run it again with the bam file back in /tmp, it crashes again.

Here is the full error:

chrom pos ref reads_all reads_pp matches matches_pp mismatches mismatches_pp deletions deletions_pp insertions insertions_pp A A_pp C C_pp T T_pp G G_pp N N_pp
Traceback (most recent call last):
  File "/home/steve/anaconda3/envs/pipeline/bin/pysamstats", line 253, in <module>
    **kwargs
  File "/home/steve/anaconda3/envs/pipeline/lib/python3.7/site-packages/pysamstats/io.py", line 63, in write_csv
    for row in rows:
  File "/home/steve/anaconda3/envs/pipeline/lib/python3.7/site-packages/pysamstats/util.py", line 27, in <genexpr>
    it = (getter(rec) for rec in recs)
  File "pysamstats/opt.pyx", line 1853, in iter_pileup_default
  File "pysam/libcalignmentfile.pyx", line 1335, in pysam.libcalignmentfile.AlignmentFile.pileup
ValueError: no index available for pileup

I'm really rather stuck with this problem now, so I really hope you can help! :-) Thanks, Steve.

stewalsh04 commented 2 years ago

Solved: the pipeline was not staging the .bai index along with the .fasta, .fai and .bam files, so there was no BAM index next to the bam file in /tmp.
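
For anyone who hits the same "no index available for pileup" error, a quick pre-flight check in the pipeline can catch a missing index before pysamstats runs. The following is only a minimal sketch: `find_bam_index` is a hypothetical helper (not part of pysamstats or pysam), and it assumes the index sits in the same directory as the bam file under one of the usual names (sorted.bam.bai, sorted.bai, or sorted.bam.csi).

```python
from pathlib import Path
from typing import Optional


def find_bam_index(bam_path: str) -> Optional[Path]:
    """Return the index file found next to the BAM, or None if there is none.

    Hypothetical helper for illustration; the naming conventions checked
    here are an assumption about how the index was created and staged.
    """
    bam = Path(bam_path)
    candidates = [
        bam.with_name(bam.name + ".bai"),  # sorted.bam.bai
        bam.with_suffix(".bai"),           # sorted.bai
        bam.with_name(bam.name + ".csi"),  # sorted.bam.csi
    ]
    for candidate in candidates:
        # is_file() follows symlinks, so a dangling link counts as missing.
        if candidate.is_file():
            return candidate
    return None


def require_bam_index(bam_path: str) -> Path:
    """Raise a clear error early instead of failing inside the pileup."""
    index = find_bam_index(bam_path)
    if index is None:
        raise FileNotFoundError(
            f"No .bai/.csi index found next to {bam_path}; "
            "stage the index together with the BAM file."
        )
    return index
```

Calling `require_bam_index("/tmp/8pwpdhof/sorted.bam")` before launching pysamstats would have pointed at the missing .bai straight away. In a CWL tool description, the usual way to get the index staged with the bam file is to declare it under `secondaryFiles` on the bam input.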