smithlabcode / falco

A C++ drop-in replacement of FastQC to assess the quality of sequence read data
https://falco.readthedocs.io
GNU General Public License v3.0

changes in FASTQC 0.12: Adjust default adapters and add total base count #64

Closed: wm75 closed this issue 3 weeks ago

wm75 commented 1 month ago

FastQC v0.12, released last year, changed the default adapters it searches for: it no longer reports SOLID adapters and reports polyA and polyG instead.

see: https://github.com/s-andrews/FastQC/releases/tag/v0.12.0

The new release also now reports the total base count in basic stats, which would be good to have reflected in falco I guess.
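To make the total base count concrete: as I understand it, this is the sum of all read lengths in the file. Here is a minimal sketch in Python, assuming plain 4-line FASTQ records (no sequences wrapped over multiple lines). The function name `total_base_count` is hypothetical and this is not how FastQC or falco compute it internally.

```python
from typing import Iterable


def total_base_count(fastq_lines: Iterable[str]) -> int:
    """Sum the lengths of all sequence lines in a FASTQ stream.

    Assumes strict 4-line records: header, sequence, '+', qualities.
    """
    total = 0
    for i, line in enumerate(fastq_lines):
        # The sequence is the second line of every 4-line record.
        if i % 4 == 1:
            total += len(line.strip())
    return total


# Two toy reads of length 8 and 5.
example = [
    "@read1", "ACGTACGT", "+", "IIIIIIII",
    "@read2", "ACGTN", "+", "IIIII",
]
print(total_base_count(example))  # prints 13
```

The function accepts any iterable of lines, so an open file handle can be passed directly for an uncompressed FASTQ file.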

Other changes in the fastqc release look less important at first glance (though maybe the deduplicated sequences line still present in falco will confuse people even more now that it got removed from fastqc).

andrewdavidsmith commented 1 month ago

Thanks for this. I hope to get to it soon.

bgruening commented 4 weeks ago

@andrewdavidsmith could we also get a new release? We would really like to push for falco instead of fastqc, but this issue and the lack of a new release are blocking us atm.

Thanks a lot.

andrewdavidsmith commented 4 weeks ago

@wm75 @bgruening I've taken care of the adapter and polyA/polyG stuff. Let me know if your use of falco depends on the total base count and/or the duplicated sequence thing. If it does, I'll try to prioritize it. Otherwise I'll make a new release of falco right away with the adapter/contaminant stuff updated.

wm75 commented 4 weeks ago

Thanks @andrewdavidsmith! I think the adapter report is the most important change and a release containing this (and the bam rv-reads fix) would be great. I don't mind the extra line in the duplicates plot at all, and the base count is a nice-to-have if you have the time to include it but not crucial.

andrewdavidsmith commented 3 weeks ago

Ok, I made a release (v1.2.3) and it takes care of the adapter and contaminant lists issue, along with any changes since the previous release. If either of you @wm75 or @bgruening want to open a separate issue and just paste in those other suggestions, please do so and I'll get to it when I have time. The more bite-sized the work seems at a glance, the more likely I am to squeeze it into my schedule. However, if v1.2.3 has introduced any problems (I only tested it on 2 systems plus GH runners), then I can address it right away.

I'll also keep an eye on conda and if they don't auto-update the version in a couple days, I'll do it manually.