Open Shians opened 5 years ago
Thanks for the detailed report and debugging - your explanation makes sense. I will update the htslib version.
Out of curiosity, how did you get an alignment with 38,000 matches and no insertion/deletions in-between?
Jared
I believe it was aligned back to assembled contigs, so the read itself would have informed the contig sequence.
Ah, that makes sense. Thanks.
Now updated to htslib-1.9
I had to revert this change as there were some dependency issues with htslib-1.9. Reopening until I can fix it
@jts hello, I have the htslib-1.9 in the system. When I compiled the nanopolish, I got the following errors: error: unknown type name 'hts_itr_multi_t' error: unknown type name 'hts_reglist_t'
I guess these errors related to the version of htslib. How can I fixed it, or should I wait you to update to htslib-1.9?
@NickyPan what does your make
command look like, and can you give the full output from it?
I suggest letting nanopolish clone and install the built-in htslib.
@jts I just use the make
without any parameter, and I got the following error:
make[1]:'Entering directory '/opt/seqtools/source/nanopolish/htslib'
gcc -g -Wall -O2 -I. -c -o hts.o hts.c
In file included from ./cram/cram_samtools.h:61:0,
from cram/cram.h:45,
from hts.c:39:
/opt/seqtools/include/htslib/sam.h:359:5: error: unknown type name 'hts_itr_multi_t'
hts_itr_multi_t *sam_itr_regions(const hts_idx_t *idx, bam_hdr_t *hdr, hts_reglist_t *reglist, unsigned int regcount);
^
/opt/seqtools/include/htslib/sam.h:359:76: error: unknown type name 'hts_reglist_t'
hts_itr_multi_t *sam_itr_regions(const hts_idx_t *idx, bam_hdr_t *hdr, hts_reglist_t *reglist, unsigned int regcount);
^
make[1]: *** [hts.o] Error 1
make[1]: Leaving directory '/opt/seqtools/source/nanopolish/htslib'
make: *** [htslib/libhts.a] Error 255
and then I tried the make HTS=noinstall
. It worked. But I haven't run any text yet.
I helped a colleague debug an issue with nanopolish call-methylation, whereby a BAM file with no spliced alignments would be detected as having a spliced alignments, leading to the program stopping. The error is triggered here and the root cause is here where a CIGAR
N
op is detected.However, when inspecting the read in Samtools, there is no
N
present, instead it looks something like38000M3I...
. Now when stickingprintf
statements into the nanopolish source, I find that nanopolish misreads the first two operations, both in terms of length and operation, with the second op being recognised asN
. I BELIEVE this is due to the packaged htslib being over 2 years old, and that it has issues handlingop_len
longer than 2^15. Updating the htslib appeared to resolve the issue. Note that this is specifically the length of the CIGAR operation, not the the well-known and fixed issue CIGAR string.Sorry I cannot provide better detail, this was debugged on someone else's machine. I also cannot provide a reproducible example because of my lack of experience with fast5 files, but I believe if you can create a situation where a CIGAR operation is > 2^15 in
op_len
then you'll find that nanopolish misreads it and corrupts the next operation. If more information is needed to debug this I'm happy to go hijack my colleague's machine for a few hours.Latest versions of nanopolish and Samtools were used. Issue only occurs if using
htslib @ 3dc96c5
included as subdirectory, pulling in the latest master and recompiling fixes the issue.