Hi all,
I've been running SNAP on streaming input: I build an interleaved FASTQ stream and fix problems in the FASTQ files on the fly with awk (adding /1 and /2 to read names, removing spaces, things like that), then pipe the result straight into SNAP. We regularly see error messages suggesting that the input is being truncated somewhere other than a line ending. Some examples:
FASTQ file - has invalid starting character at offset 962920112, line type 0, char 9
Line in question: '9929.102677888_HWI-ST211R_330:5:2106:18153:132057/2'
SNAP exited with exit code 1 from line 267 of file SNAPLib/FASTQ.cpp
FASTQ file - has invalid starting character at offset 2080744180, line type 0, char C
Line in question: 'CTTGATAAGGATTGGGGCTGGGGGGTTTCCTTAGGGACGACCTGGCCCAGCTGCCCTTCCTGACCATGTGCATTAAGGAGAGCCTG'
SNAP exited with exit code 1 from line 267 of file SNAPLib/FASTQ.cpp
FASTQ file - has invalid starting character at offset 598685188, line type 0, char G
Line in question: 'GATTTGAAGGTCTGATGATGCCACATTAGGAGGCGGGCGG'
SNAP exited with exit code 1 from line 267 of file SNAPLib/FASTQ.cpp
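To help tell whether the stream is already malformed before it reaches SNAP, here is a minimal Python sketch of an independent check. It assumes that "line type 0" in the messages above refers to the header line of a 4-line FASTQ record, which should start with `@`; that reading is my guess, not something confirmed from the SNAP source. The function name `first_malformed_line` is hypothetical.

```python
import sys


def first_malformed_line(stream):
    """Scan a binary FASTQ stream made of strict 4-line records.

    Returns (line_number, byte_offset, line_type, text) for the first
    header line (type 0) not starting with '@' or separator line (type 2)
    not starting with '+'. Returns None if no such line is found.
    """
    offset = 0  # byte offset of the start of the current line
    for line_number, raw in enumerate(stream, start=1):
        line_type = (line_number - 1) % 4
        expected = {0: b"@", 2: b"+"}.get(line_type)
        if expected is not None and not raw.startswith(expected):
            text = raw.rstrip(b"\r\n").decode("ascii", "replace")
            return line_number, offset, line_type, text
        offset += len(raw)
    return None


if __name__ == "__main__":
    # Reads the stream from stdin and reports the first structural problem.
    problem = first_malformed_line(sys.stdin.buffer)
    if problem is None:
        print("no malformed header or separator lines found")
    else:
        print("line %d, offset %d, line type %d: %r" % problem)
        sys.exit(1)
```

Running this on a saved copy of the same stream would show whether the awk output itself ever breaks the 4-line structure. If it reports nothing while SNAP still fails, that would point toward how the stream is being read rather than how it is being written.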
I tried to put together a minimal reproducible example, but I can't trigger the error without larger inputs. Roughly, I'm running the analysis like:
I've tried various buffering workarounds on the command line with limited success (sometimes they fix it, sometimes not), but I'm a bit stuck on how best to proceed. Is there anything that could be fixed on the SNAP side to handle streaming inputs?
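For anyone who wants to reproduce the kind of preprocessing described above without the awk script, here is a rough Python sketch of the same idea: interleave two FASTQ inputs and normalize the read names. It is not the actual pipeline. The function names are hypothetical, and replacing spaces with underscores (rather than deleting them) is an assumption.

```python
from itertools import islice


def read_records(lines):
    """Yield FASTQ records as lists of four newline-stripped strings."""
    it = iter(lines)
    while True:
        record = [line.rstrip("\n") for line in islice(it, 4)]
        if not record:
            return
        if len(record) != 4:
            raise ValueError("truncated FASTQ record: %r" % record)
        yield record


def clean_name(header, mate):
    """Replace spaces with underscores (assumed) and ensure a /1 or /2 suffix."""
    name = header.replace(" ", "_")
    suffix = "/%d" % mate
    if not name.endswith(suffix):
        name += suffix
    return name


def interleave(reads1, reads2):
    """Yield cleaned records alternately from the two mate inputs.

    Note: zip stops at the shorter input, so unpaired trailing reads
    are silently dropped in this sketch.
    """
    for rec1, rec2 in zip(read_records(reads1), read_records(reads2)):
        for mate, rec in ((1, rec1), (2, rec2)):
            header = clean_name(rec[0], mate)
            yield "\n".join([header, rec[1], "+", rec[3]]) + "\n"
```

Each yielded string is a complete 4-line record, so a consumer writing them one at a time only ever emits whole records. Whether the reader on the other end of a pipe sees them in whole-record chunks is a separate matter.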
Thanks for taking a look. I'm happy to provide any other information that would help.