benedictpaten / marginAlign

UCSC Nanopore
MIT License

MarginStats gets index out of range for alignedSegment #23

Closed Jeltje closed 8 years ago

Jeltje commented 8 years ago

I ran marginAlign on nanopore data. Full disclosure: my reference was a set of Gencode transcripts, not a genome, but alignment seemed to work just fine (the SAM header was just really, really long). When I ran marginStats on the output, though, I kept getting this index out of range error. I experimented with the code, but I couldn't find the problem.

Minimal example input here: marginStats test.sam test.fastq tx.fa --identity

Full error:

Traceback (most recent call last):
  File "/home/ubuntu/bin/marginAlign/src/margin/marginStats.py", line 97, in <module>
    main()
  File "/home/ubuntu/bin/marginAlign/src/margin/marginStats.py", line 70, in main
    referenceFastaFile, globalAlignment=not options.localAlignment)
  File "/home/ubuntu/bin/marginAlign/src/margin/utils.py", line 383, in getReadAlignmentStats
    refSequences[sam.getrname(aR.rname)], aR, globalAlignment), samIterator(sam))
  File "/home/ubuntu/bin/marginAlign/src/margin/utils.py", line 383, in <lambda>
    refSequences[sam.getrname(aR.rname)], aR, globalAlignment), samIterator(sam))
  File "/home/ubuntu/bin/marginAlign/src/margin/utils.py", line 312, in __init__
    for aP in AlignedPair.iterator(alignedRead, self.refSeq, self.readSeq):
  File "/home/ubuntu/bin/marginAlign/src/margin/utils.py", line 278, in iterator
    if aP.getReadBase().upper() != alignedSegment.query_alignment_sequence[readPos].upper():
IndexError: string index out of range

mitenjain commented 8 years ago

This looks like an issue with pysam. Are you using virtualenv with marginAlign (pysam==0.8.2.1)? If not, could you check your pysam version? Pysam's behavior can vary significantly between versions.

We are working on speeding up marginStats and on containerizing it to reduce its dependence on version-specific libraries.

Jeltje commented 8 years ago

Yes, that did it. I have 0.9.1.3 on my system. Thanks!
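
For anyone hitting the same error, the version mismatch diagnosed above can be checked before running marginStats. The following is a minimal sketch, not part of marginAlign: the helper names (`parse_version`, `pysam_matches`, `installed_pysam_version`) are hypothetical, and it assumes pysam exposes its version as `pysam.__version__` and that the version string is purely numeric and dot-separated, as both versions mentioned in this thread are.

```python
# Hypothetical pre-flight check; none of these helpers exist in marginAlign.
EXPECTED_PYSAM = "0.8.2.1"  # version pinned in the thread above


def parse_version(text):
    """Turn a dotted numeric version such as '0.9.1.3' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def pysam_matches(installed, expected=EXPECTED_PYSAM):
    """Return True when the installed version equals the expected one."""
    return parse_version(installed) == parse_version(expected)


def installed_pysam_version():
    """Return the installed pysam version string, or None if unavailable."""
    try:
        import pysam  # imported lazily so the check works without pysam
    except ImportError:
        return None
    # Assumption: pysam reports its version via the __version__ attribute.
    return getattr(pysam, "__version__", None)


if __name__ == "__main__":
    found = installed_pysam_version()
    if found is None:
        print("pysam is not installed in this environment")
    elif pysam_matches(found):
        print("pysam %s matches the expected version" % found)
    else:
        print("pysam %s found, but %s is expected" % (found, EXPECTED_PYSAM))
```

Run with the same interpreter that launches marginStats; a mismatch such as 0.9.1.3 versus 0.8.2.1 suggests switching to the virtualenv that ships with marginAlign.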