This is rather questionable, but htslib can read empty FASTQ records, generating "*" for SEQ and QUAL. The reverse now holds too: a SAM record with "*" for SEQ and QUAL is written as an empty FASTQ record. E.g.:
name 4 * 0 0 * * 0 0 * *
becomes (the sequence and quality lines are empty)
@name
+
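The mapping above can be sketched in a few lines of Python. This only illustrates the convention and is not htslib's code: the function names sam_to_fastq and fastq_to_seq_qual are hypothetical, the SAM record is assumed to be tab-separated, and the case of a real SEQ with "*" for QUAL is deliberately not handled.

```python
def sam_to_fastq(sam_line):
    """Render one tab-separated SAM record as a four-line FASTQ record."""
    fields = sam_line.rstrip("\n").split("\t")
    name, seq, qual = fields[0], fields[9], fields[10]
    if seq == "*":
        # "*" means "not stored" in SAM; write empty SEQ and QUAL lines.
        seq, qual = "", ""
    return "@{}\n{}\n+\n{}\n".format(name, seq, qual)


def fastq_to_seq_qual(record):
    """Return (name, SEQ, QUAL) for one FASTQ record, using "*" when empty."""
    header, seq, _plus, qual = record.split("\n")[:4]
    return header[1:], seq or "*", qual or "*"


if __name__ == "__main__":
    sam = "name\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*"
    fastq = sam_to_fastq(sam)
    print(repr(fastq))
    print(fastq_to_seq_qual(fastq))
```

The round trip is lossless for this case: the empty FASTQ lines map back to "*" for both SEQ and QUAL.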
bwa mem and minimap2 can both read these FASTQ entries too, although their SAM output is buggy: it contains an empty field instead of "*" for SEQ. (Note this is still readable by htslib and interpreted the same way.)
Potential reasons for accepting this:
When dealing with paired data, we don't want to output a differing number of records from samtools fastq if read1 has seq and read2 has "*".
Note that as this is filtered at the htslib layer, it's not considered a singleton, so fastq -s won't rescue this.
At least some aligners apparently support this format, although inevitably they just produce unmapped data.
Arguably this is a case of silly input => silly output!
Users can manually elect to run "samtools view -e 'length(seq) > 0'" before using samtools fastq, which then fixes the not-a-singleton problem.
It restores samtools fastq output to how it was before 1.13, when we rewrote it to use htslib's interfaces.
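To make the pairing argument concrete, here is a small Python sketch. It is not samtools code: split_pairs and its drop_empty flag are hypothetical names used only for illustration, and the records are made-up toy data. It splits name-collated records into read1 and read2 outputs, either dropping or keeping records whose SEQ is "*".

```python
def split_pairs(records, drop_empty):
    """Split (name, read_number, seq) tuples into read1 and read2 name lists.

    drop_empty=True models filtering "*" records before pairing;
    drop_empty=False models writing them out as empty FASTQ records.
    """
    read1, read2 = [], []
    for name, read_number, seq in records:
        if drop_empty and seq == "*":
            continue  # the record silently vanishes from the output
        (read1 if read_number == 1 else read2).append(name)
    return read1, read2


# Toy input: read2 of pair "a" has no stored sequence.
records = [
    ("a", 1, "ACGT"), ("a", 2, "*"),
    ("b", 1, "GGCC"), ("b", 2, "TTAA"),
]

if __name__ == "__main__":
    # Dropping: read1 gets 2 records but read2 only 1, so the files fall out of step.
    print(split_pairs(records, drop_empty=True))
    # Keeping: both outputs list the same names in the same order.
    print(split_pairs(records, drop_empty=False))
```

With the empty record dropped, the second entry of the read1 output would line up with nothing in the read2 output, which is the mis-pairing this change avoids.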
Potential reason to reject:
It may yield output which trips up some poorly written tools.
On discussion, we decided that avoiding mis-paired reads was preferable to avoiding empty sequences. I suspect this rarely happens, but I guess we may find out if anyone notices...
Fixes samtools/samtools#1799