lskatz / fasten

:construction_worker: Fasten toolkit, for streaming operations on fastq files
MIT License

Feature Request: handle seqlen != quallen #15

Closed by chrisgulvik 1 year ago

chrisgulvik commented 1 year ago

Can you explicitly state (perhaps in readme or help menu) which tests are used for FastQ validation in fasten_validate? As far as I can tell in the src the seqlen and quallen aren't compared, or am I wrong?

I've only ever seen 2 options to handle this issue when Field 2 and Field 4 lengths aren't the same:

  1. repair: trim both fields to the shorter length
  2. clean/remove: discard the read entirely

Some might find both options useful in something like --seq-and-qual-len-diff [repair,remove], but if only 1 can be implemented, I think remove is the safer option for quality concerns. Removing a read leaves its sister read without a mate (a broken pair), so it might be tougher to implement.
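To make the two behaviors concrete, here is a minimal sketch in Rust of how a repair/remove choice could work on a single record. The `Record` struct, the `LenDiffPolicy` enum, and `handle_len_diff` are hypothetical names for illustration only; they are not taken from the fasten source, and the sketch ignores paired-end bookkeeping.

```rust
/// Hypothetical policy for reads whose sequence (Field 2) and
/// quality (Field 4) lengths differ. Not part of fasten's API.
#[derive(Clone, Copy, Debug, PartialEq)]
enum LenDiffPolicy {
    /// Trim both fields to the shorter of the two lengths.
    Repair,
    /// Discard the read entirely.
    Remove,
}

/// One FASTQ record, reduced to the fields needed here
/// (hypothetical struct, assumed for this sketch).
#[derive(Clone, Debug, PartialEq)]
struct Record {
    id: String,
    seq: String,
    qual: String,
}

/// Apply the policy to one record.
/// Returns `Some(record)` if the read is kept and `None` if it is discarded.
/// Assumes ASCII sequence and quality strings, as in standard FASTQ,
/// so byte lengths equal character counts.
fn handle_len_diff(mut rec: Record, policy: LenDiffPolicy) -> Option<Record> {
    // Records that are already consistent pass through untouched.
    if rec.seq.len() == rec.qual.len() {
        return Some(rec);
    }
    match policy {
        LenDiffPolicy::Repair => {
            let n = rec.seq.len().min(rec.qual.len());
            rec.seq.truncate(n);
            rec.qual.truncate(n);
            Some(rec)
        }
        LenDiffPolicy::Remove => None,
    }
}
```

With remove on paired-end data, the caller would also need to drop or flag the mate of any read that comes back as `None`, which is the harder part mentioned above.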

lskatz commented 1 year ago

We're getting closer on the fasten-repair branch. It will be something like

./target/debug/fasten_inspect  < testdata/four_reads.fastq | ./target/debug/fasten_repair > repaired.fastq

Right now it seems to work, but it will panic instead of repairing when it comes across errors.

lskatz commented 1 year ago

Fixed in v0.6

lskatz commented 1 year ago

with fasten_inspect and fasten_repair