Open marcelm opened 1 year ago
For my intended use of this as a replacement for bwa mem in NGLess, we do need the mixed format.
It is very common for real datasets to be in this format (often because they start out as paired-end and some sequences lose their mate during QC). Technically, the main advantage is that we can then stream the reads through a pipe; otherwise, we would need to create the files on disk.
I would strongly prefer to keep compatibility with bwa mem -p as much as possible. I can check how it handles the /1 and /2 suffixes.
I would strongly prefer to keep compatibility with bwa mem -p as much as possible.
Absolutely. If this is something that you rely on in practice, then of course this behavior should be kept. We do need to document it a bit better, though.
Hi @luispedro. I was just looking into #273, which is about the --interleaved option. It currently allows mixing single-end reads with paired-end reads. One issue with the current implementation is that the reads are reordered in the output (within each chunk, pairs come first, then singles), which is a bit unexpected from a user’s point of view. It’s also a bit unexpected that --interleaved allows this mixing at all. While I was planning how to fix the reordering and how to better explain how --interleaved works, I started wondering whether this type of mixed input is actually something that should be supported.
So my question: Are you actually using this, or do you know anyone relying on this behavior? Because the easiest fix would be to have --interleaved mean only what it says and just disallow mixed inputs.
cf. #213
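To make the reordering question concrete, here is a minimal Python sketch of one way mixed input could be grouped while keeping the input order. It is not the current implementation and is not confirmed to match bwa mem -p. The pairing rule is an assumption: two adjacent records are treated as a pair when their names are equal after stripping a trailing /1 or /2. The names `Record`, `base_name` and `group_mixed` are hypothetical.

```python
from typing import Iterable, Iterator, Optional, Tuple

# Hypothetical record type: (name, sequence, quality)
Record = Tuple[str, str, str]


def base_name(name: str) -> str:
    """Strip a trailing /1 or /2 so that both mates share one name."""
    return name[:-2] if name.endswith(("/1", "/2")) else name


def group_mixed(
    records: Iterable[Record],
) -> Iterator[Tuple[Record, Optional[Record]]]:
    """Yield (r1, r2) for pairs and (r, None) for singles, in input order.

    Assumed rule: adjacent records with the same base name form a pair.
    Only one record is buffered at a time, so this works on a stream.
    """
    pending: Optional[Record] = None
    for rec in records:
        if pending is None:
            # Nothing buffered yet: hold this record until we see the next one.
            pending = rec
        elif base_name(pending[0]) == base_name(rec[0]):
            # Same name as the buffered record: emit them together as a pair.
            yield pending, rec
            pending = None
        else:
            # Different name: the buffered record has no mate.
            yield pending, None
            pending = rec
    if pending is not None:
        # The last record in the stream had no mate.
        yield pending, None


if __name__ == "__main__":
    reads = [
        ("a/1", "ACGT", "IIII"),
        ("a/2", "TTGA", "IIII"),
        ("b", "GGCC", "IIII"),
        ("c/1", "AAAA", "IIII"),
        ("c/2", "CCCC", "IIII"),
    ]
    for first, second in group_mixed(reads):
        print(first[0], second[0] if second else None)
```

Because pairs and singles are emitted in the order they arrive, nothing needs to be sorted into "pairs first, then singles" within a chunk. One consequence of the assumed rule is that two adjacent single reads that happen to share a name would be treated as a pair.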