Sequences were intentionally left out of this data to preserve tissue donor anonymity (you can see the white paper for this study here)
MISO throws the following exception:
File "/home/rs97a/.pyenv/versions/2.7.13/bin/miso", line 11, in <module>
load_entry_point('misopy==0.5.4', 'console_scripts', 'miso')()
File "/home/rs97a/.pyenv/versions/2.7.13/lib/python2.7/site-packages/misopy/miso.py", line 591, in main
wait_on_jobs=wait_on_jobs)
File "/home/rs97a/.pyenv/versions/2.7.13/lib/python2.7/site-packages/misopy/miso.py", line 372, in compute_all_genes_psi
given_read_len=read_len)
File "/home/rs97a/.pyenv/versions/2.7.13/lib/python2.7/site-packages/misopy/run_events_analysis.py", line 102, in check_gff_and_bam
seq_lens[len(bam_read.seq)] = True
TypeError: object of type 'NoneType' has no len()
The problem is in using the seq attribute of pysam.Samfile (deprecated in favor of pysam.AlignmentFile, btw) to obtain the read's length, since this attribute appears to be set to None when the sequence is not available. From misopy/run_events_analysis.py:
95 bam_file = pysam.Samfile(bam_filename, "rb")
96 n = 0
97 seq_lens = {}
98 for bam_read in bam_file:
99 # Check more of the reads for mixed read lengths
100 if n >= (num_reads * 10):
101 break
102 seq_lens[len(bam_read.seq)] = True
103 n += 1.
But there's no need to rely on the sequence, especially since I don't believe it's otherwise used by MISO, when alignment length is available as both pysam.AlignmentFile.alen and pysam.AlignmentFile.query_alignment_length. If it's ever necessary, the length can probably be calculated from the CIGAR string as a fallback.
I'm trying to run MISO on aligned reads that look like this (in SAM format):
Sequences were intentionally left out of this data to preserve tissue donor anonymity (you can see the white paper for this study here)
MISO throws the following exception:
The problem is in using the
seq
attribute ofpysam.Samfile
(deprecated in favor ofpysam.AlignmentFile
, btw) to obtain the read's length, since this attribute appears to be set toNone
when the sequence is not available. From misopy/run_events_analysis.py:But there's no need to rely on the sequence, especially since I don't believe it's otherwise used by MISO, when alignment length is available as both
pysam.AlignmentFile.alen
andpysam.AlignmentFile.query_alignment_length
. If it's ever necessary, the length can probably be calculated from the CIGAR string as a fallback.