smithlabcode / falco

A C++ drop-in replacement of FastQC to assess the quality of sequence read data
https://falco.readthedocs.io
GNU General Public License v3.0

per base sequence quality fix: https://github.com/smithlabcode/falco/… #27

Closed Shelestova-Anastasia closed 2 years ago

Shelestova-Anastasia commented 2 years ago

https://github.com/smithlabcode/falco/issues/25

When per base sequence quality is calculated for a group of positions (for example, 10-14), we need to sum the percentiles of each base position in the group and then divide that sum by the number of base positions in the group.
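The averaging described above can be sketched as follows. This is a minimal illustration, not the code in this PR or in FastQC: `QualityCounts`, `percentile_quality`, and `group_percentile` are hypothetical names, and the per-position histogram layout (`counts[q]` is the number of reads with Phred score `q`) is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Assumed layout: counts[q] = number of reads with Phred score q
// at a single base position.
using QualityCounts = std::vector<std::size_t>;

// Hypothetical helper: lowest quality score at which the cumulative
// read count reaches `thresh` for one base position.
inline std::size_t percentile_quality(const QualityCounts& counts,
                                      double thresh) {
  double cumulative = 0.0;
  for (std::size_t q = 0; q < counts.size(); ++q) {
    cumulative += static_cast<double>(counts[q]);
    if (cumulative >= thresh) return q;
  }
  return counts.empty() ? 0 : counts.size() - 1;
}

// Percentile for a group of positions [first, last] (inclusive):
// sum the per-position percentiles, then divide by the group size.
inline double group_percentile(const std::vector<QualityCounts>& hist,
                               std::size_t first, std::size_t last,
                               double fraction) {
  assert(first <= last && last < hist.size());
  double sum = 0.0;
  for (std::size_t pos = first; pos <= last; ++pos) {
    const std::size_t total = std::accumulate(
        hist[pos].begin(), hist[pos].end(), std::size_t{0});
    const double thresh = fraction * static_cast<double>(total);
    sum += static_cast<double>(percentile_quality(hist[pos], thresh));
  }
  return sum / static_cast<double>(last - first + 1);
}
```

For a two-position group whose per-position medians are 2 and 0, `group_percentile(hist, 0, 1, 0.5)` returns 1.0, the mean of the two medians, rather than a median taken over the pooled counts.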

With this fix, the results are almost identical to FastQC's. The only remaining difference is in the percentile_thresh calculation; your thresholds seem to be more accurate.

FastQC calculates the threshold as a long: long percentile_thresh = totalCounts * percentile / 100; falco calculates it as a double, for example: ldecile_thresh = 0.1 * bases_in_group;
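To show why the two thresholds can differ, here is a small sketch of both calculations. The function names `threshold_as_long` and `threshold_as_double` are hypothetical and only mirror the two expressions quoted above; they are not taken from either code base.

```cpp
#include <cassert>
#include <cstddef>

// Integer version, mirroring the FastQC expression quoted above:
// the division truncates, so any fractional part is dropped.
inline long threshold_as_long(long total_counts, long percentile) {
  return total_counts * percentile / 100;
}

// Floating-point version, mirroring the falco expression quoted above:
// the fractional part of the threshold is kept.
inline double threshold_as_double(double fraction,
                                  std::size_t bases_in_group) {
  return fraction * static_cast<double>(bases_in_group);
}
```

With 15 counts and the lower decile, the integer version gives 1 while the floating-point version gives about 1.5, so a cumulative count of exactly 1 reaches the first threshold but not the second. When the total is a multiple of 10, the two agree.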