fulcrumgenomics / fgbio

Tools for working with genomic and high throughput sequencing data.
http://fulcrumgenomics.github.io/fgbio/
MIT License

DemuxFastqs --omit-failing-reads does the opposite of the expected behavior. #922

Closed (gouinK closed 1 year ago)

gouinK commented 1 year ago

Hi all,

I attempted to use the DemuxFastqs tool with the --omit-failing-reads flag set and noticed that it outputs reads that have the chastity flag set to Y. However, I believe the expected behavior is to output reads that have the chastity flag set to N.

It seems that the original PR https://github.com/fulcrumgenomics/fgbio/pull/713, which added this feature, did intend to output reads with the chastity flag set to Y, so perhaps there was just a misunderstanding about what Y vs. N means. Based on Illumina's documentation, a flag of Y means that the read failed filtering.
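To make the Y vs. N point concrete, here is a minimal Python sketch of the behavior I would expect. It is not fgbio's code: it assumes the CASAVA 1.8+ style header, where the comment after the read name looks like `<read>:<is filtered>:<control number>:<index>`, and the function names `fails_filter` and `omit_failing` as well as the example headers are made up for illustration.

```python
def fails_filter(header: str) -> bool:
    """Return True if the header marks the read as filtered (flag == "Y").

    Assumes a CASAVA 1.8+ style header: "@<read name> <read>:<Y|N>:<control>:<index>".
    """
    parts = header.strip().split(None, 1)  # split read name from the comment
    if len(parts) != 2:
        raise ValueError(f"no comment field in header: {header!r}")
    fields = parts[1].split(":")
    if len(fields) < 2 or fields[1] not in ("Y", "N"):
        raise ValueError(f"unexpected filter flag in header: {header!r}")
    # "Y" means the read was filtered, i.e. it FAILED the chastity filter.
    return fields[1] == "Y"


def omit_failing(headers):
    """Keep only headers of reads that passed filtering (flag == "N")."""
    return [h for h in headers if not fails_filter(h)]


# Hypothetical example headers, not taken from real data.
headers = [
    "@SIM:1:FCX:1:15:6329:1045 1:N:0:ATCACG",  # passed filter
    "@SIM:1:FCX:1:15:6330:1046 1:Y:0:ATCACG",  # failed filter
]
kept = omit_failing(headers)
```

In this sketch, omitting failing reads keeps the N read and drops the Y read, which is the opposite of what I am seeing from the tool.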

Maybe I am misinterpreting the meaning of the --omit-failing-reads flag?

Thanks for your time!

nh13 commented 1 year ago

@gouinK I think you're right, and here's the offending line: https://github.com/fulcrumgenomics/fgbio/blob/e197ce44233c4a6dcbbb8d9f5da6102cd4209e75/src/main/scala/com/fulcrumgenomics/fastq/DemuxFastqs.scala#L916

As a heads up, most of the work on FASTQ demultiplexing has moved over to fqtk, though we do not intend to port over all options like this one (or BAM output support; use samtools import). It should be MUCH faster for you.

gouinK commented 1 year ago

Thanks for the clarification and the tip on the new tool!

nh13 commented 1 year ago

Fixed in #923