torognes / vsearch

Versatile open-source tool for microbiome analysis

--fastq_stats returns erroneous cumulated percentage for empty reads #571

Closed by frederic-mahe 2 months ago

frederic-mahe commented 2 months ago

This is a minor bug. It can be reproduced like this:

printf "@s\n\n+\n\n" | \
    vsearch \
        --fastq_stats - \
        --log -

returns:

Read length distribution
      L           N      Pct   AccPct
-------  ----------  -------  -------
>=    0           1   100.0%   1844674407370955161600.0%

In the example above, the cumulated percentage (AccPct), that is, the fraction of reads of length zero or longer, should be 100.0%. This seems to be caused by an out-of-bounds read: https://github.com/torognes/vsearch/blob/a267c0f6c24683d0d3e201348ee12de444b3f49e/src/fastqops.cc#L285

When i == 0, the expression length_dist[i - 1] tries to read entry -1 of the array length_dist, which lies outside the array.
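For illustration, here is a minimal sketch of one way to compute AccPct without ever indexing entry -1. It is not the code in fastqops.cc and not necessarily how the bug should be fixed there: the function name `accumulated_percentages` and the `counts` vector are hypothetical. The idea is to carry a running count of strictly shorter reads, so no lookup at `i - 1` is needed.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not vsearch's actual code.
// counts[len] is the number of reads that are exactly `len` nucleotides long.
// Returns acc_pct[len], the percentage of reads with length >= len.
std::vector<double> accumulated_percentages(const std::vector<std::uint64_t>& counts) {
    std::vector<double> acc_pct(counts.size(), 0.0);

    std::uint64_t total = 0;
    for (const auto n : counts) {
        total += n;
    }
    if (total == 0) {
        return acc_pct;  // no reads: avoid dividing by zero
    }

    // `shorter` holds the number of reads strictly shorter than `len`.
    // It starts at 0 for len == 0, so there is no read at index len - 1.
    std::uint64_t shorter = 0;
    for (std::size_t len = 0; len < counts.size(); ++len) {
        acc_pct[len] = 100.0 * static_cast<double>(total - shorter)
                             / static_cast<double>(total);
        shorter += counts[len];
    }
    return acc_pct;
}
```

With a single empty read (`counts = {1}`), the only entry is 100.0, which is the value expected for the `>= 0` row above.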

torognes commented 2 months ago

Should be fixed in commit 2801e61.

frederic-mahe commented 2 months ago

I confirm that the tests now pass.