Closed: kosmasgal closed this issue 1 year ago
Alternative error I've had with this file:
[W::sam_read1_sam] Parse error at line 117064
samtools view: error reading file "-"
[M::mm_idx_gen::55.316*2.13] collected minimizers
[M::mm_idx_gen::79.227*2.30] sorted minimizers
[M::main::79.227*2.30] loaded/built the index for 61 target sequence(s)
[M::mm_mapopt_update::82.605*2.24] mid_occ = 650
[M::mm_idx_stat] kmer size: 15; skip: 5; is_hpc: 0; #seq: 61
[M::mm_idx_stat::84.320*2.22] distinct minimizers: 163936851 (36.46% are singletons); average occurrences: 5.538; average spacing: 3.005; total length: 2728222451
[E::sam_parse1] query name too long
[W::sam_read1_sam] Parse error at line 63
samtools view: error reading file "-"
The FASTQ reads I was using had QNAMEs that were too long, which caused samtools to fail when converting the SAM alignment of the reads against the reference transcriptome into a BAM.
I truncated all the query names and now the code runs fine.
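For anyone hitting the same "query name too long" error, here is a minimal Python sketch of the truncation step. It is not necessarily how the names were shortened above. It assumes plain four-line FASTQ records and uses the 254-character QNAME limit from the SAM specification; the function names are made up for illustration. Note that truncating can produce duplicate read names if the distinguishing part of a name sits at the end, so check for collisions on your own data.

```python
MAX_QNAME = 254  # maximum QNAME length allowed by the SAM specification


def truncate_header(header, max_len=MAX_QNAME):
    """Shorten the read name in a FASTQ header line, keeping any comment.

    The read name is taken to be everything between '@' and the first
    whitespace character; the rest of the line is treated as a comment.
    """
    parts = header[1:].split(None, 1)
    name = parts[0] if parts else ""
    comment = " " + parts[1] if len(parts) > 1 else ""
    return "@" + name[:max_len] + comment


def truncate_fastq(lines, max_len=MAX_QNAME):
    """Yield FASTQ lines with every header name cut to max_len characters.

    Assumes strict four-line records (header, sequence, '+', quality), so
    the header is identified by position and not by a leading '@', which
    can also appear at the start of a quality line.
    """
    for i, line in enumerate(lines):
        line = line.rstrip("\n")
        yield truncate_header(line, max_len) if i % 4 == 0 else line
```

To use it, pass an open FASTQ file handle to `truncate_fastq` and write each yielded line, followed by a newline, to a new file, then run the analysis on that new file.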
Hello! I've been getting this error on every run of 'read_analysis.py' with one specific set of FASTQ reads. From the error, I thought the cause might be a bad reference transcriptome file, but I managed to produce a model with a different set of FASTQ reads. Regardless, I'm curious whether you could help me understand what was at fault in this case. My input is a set of human lrRNA ONT reads that I subsampled to 100,000 reads with seqtk in order to test the software. Here is the output I get when I encounter the issue: