Closed bistace closed 1 year ago
Hi, @bistace.
Can you provide the read name of the record that causes this error? A small example input would also be helpful.
Your code seems fine, particularly since the header is being read and parsed correctly. You can try running the util_alignment_view
example to confirm the error.
The code (mine and the one in the util_alignment_view
example) fails to read even the first record, which looks like that with samtools view | head -n 1
:
18460451 163 5 10 0 47M = 10 47 ATTATTCATTAAATAATACTTACCTGTATCTCAAAAACTTCTTGTAT @CCFFFFFHHHHHJJIJJJJJJJJJJHJJJJJJJJJJJJJJJJJHGI NM:i:0 ms:i:94 AS:i:94nn:i:0 tp:A:P cm:i:2 s1:i:43 s2:i:43 de:f:0 rl:i:0
Read and reference names were renamed to numbers prior to the mapping step. I forgot to mention that the file is read without any problem when given a SAM as input instead of a BAM.
I have tried to make a smaller bam file so you can test it out but for some reason, it works if the bam file is smaller so I guess it may have something to do with the header but I can't find what. I have uploaded the full bam file (around 16GB) here if you want to test with my file.
This is a great example input, thanks! The issue actually stems from how the alignment reader detects the format.
The first block of this file exceeds the buffer size of the inspector, which causes the BGZF decoder to fail. Unfortunately, this error was discarded, and a fallback format was chosen (bgzipped SAM rather than BAM), which in turn opened the wrong reader.
This is fixed in bf4f47d63b8d28d42d2e47113c4a640ad3161741, which swaps the block-based decoder to a streaming one for format detection. It is also now available in noodles-util 0.15.0.
Hello and thank you for developing noodles!
This is probably a very silly question and I must have missed something obvious but I am met with this error when trying to read a bam file:
Here is a MVP of what I did to encounter this error:
Could you please help me understand what is happening? Samtools view does not give any error when reading this BAM file.