Open MueFab opened 4 years ago
This also applies to fastq decompression on the feature/paired-end branch when --combine-pairs is not specified: some records have only one segment and others have two. It might be worthwhile to write the single-segment records into separate files (potentially four files in total: matched_1.fastq, matched_2.fastq, unmatched_1.fastq, unmatched_2.fastq).
Describe the bug If a single mgb file contains both paired and unpaired unaligned data units and is decompressed into fastq, the unpaired records are written into the same output files as the paired records. This shifts the record order between the two output files and destroys the pairing information of the paired records.
To Reproduce
Expected behavior A better approach would be to generate three fastq files, two for paired records and one for all single-end reads, thereby preserving the pairing information. Unused files could be deleted when the application shuts down.
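A minimal sketch of the three-file split described above, written in Python purely for illustration (it is not the project's decoder code). The record model (a list of one or two `(name, sequence, quality)` segments) and the names `format_fastq` and `write_split` are assumptions made for this example.

```python
import io
from typing import Iterable, List, TextIO, Tuple

# Hypothetical record model: a segment is (name, sequence, quality);
# a record holds one segment (unpaired) or two segments (paired).
Segment = Tuple[str, str, str]
Record = List[Segment]


def format_fastq(seg: Segment) -> str:
    """Render one segment as a four-line fastq entry."""
    name, seq, qual = seg
    return f"@{name}\n{seq}\n+\n{qual}\n"


def write_split(records: Iterable[Record],
                paired_1: TextIO,
                paired_2: TextIO,
                unpaired: TextIO) -> None:
    """Route paired records to two files and single-segment records to a third.

    Because only two-segment records reach paired_1 and paired_2, entry i in
    one file always corresponds to entry i in the other.
    """
    for rec in records:
        if len(rec) == 2:
            paired_1.write(format_fastq(rec[0]))
            paired_2.write(format_fastq(rec[1]))
        elif len(rec) == 1:
            unpaired.write(format_fastq(rec[0]))
        else:
            raise ValueError(f"unexpected segment count: {len(rec)}")


def demo() -> Tuple[str, str, str]:
    """Run the split on a small mixed input, using in-memory files."""
    records: List[Record] = [
        [("r1/1", "ACGT", "IIII"), ("r1/2", "TGCA", "IIII")],
        [("r2", "GGCC", "IIII")],  # unpaired record between two pairs
        [("r3/1", "AAAA", "IIII"), ("r3/2", "TTTT", "IIII")],
    ]
    p1, p2, up = io.StringIO(), io.StringIO(), io.StringIO()
    write_split(records, p1, p2, up)
    return p1.getvalue(), p2.getvalue(), up.getvalue()
```

With real output, the three streams would be opened as files, and any file that received no records could be removed on shutdown, as suggested above.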