torognes / swarm

A robust and fast clustering method for amplicon-based studies
GNU Affero General Public License v3.0

Writing seeds > 100% #98

Closed by lczech 7 years ago

lczech commented 7 years ago

Minor issue. Using Swarm 2.1.12 [Feb 14 2017 19:19:01] with the following invocation:

${SWARM} \
    -d 1 -f -t ${THREADS} -z \
    -i ${FINAL_FASTA/.fasta/_1f.struct} \
    -s ${FINAL_FASTA/.fasta/_1f.stats} \
    -w ${TMP_REPRESENTATIVES} \
    -o ${FINAL_FASTA/.fasta/_1f.swarms} < ${FINAL_FASTA}

(almost verbatim Fred's metabarcoding pipeline)

gives the following output:

...
Writing seeds:     0%  
Writing seeds:     0%  
Writing seeds:     1%  
Writing seeds:     1%  
Writing seeds:     1%  
Writing seeds:     2%  
...

Writing seeds:     99%  
Writing seeds:     99%  
Writing seeds:     100%  
Writing seeds:     100%  
Writing seeds:     101%  
Writing seeds:     101%  
Writing seeds:     102%  
...

Writing seeds:     158%  
Writing seeds:     158%  
Writing seeds:     100%
Writing stats:     0%  
Writing stats:     0%  
Writing stats:     0%  
Writing stats:     1%  
...

So there seems to be an issue in how those percentages are calculated. Maybe the duplicate values (e.g. "1%" printed three times) can also be avoided ;-)
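For illustration only, here is a minimal shell sketch of the two behaviours asked for above: a percentage that never exceeds 100, and a progress line that is printed only when the integer value changes. The function names `progress_percent` and `report_progress` and the variable `last_pct` are hypothetical and are not part of Swarm. The thread does not say what caused the wrong values, so this is not a description of Swarm's code or of the eventual fix.

```shell
# Hypothetical helpers for illustration; these are not Swarm functions.

# Print an integer percentage in the range 0..100.
# $1 = items processed so far, $2 = expected total.
progress_percent() {
    pp_current=$1
    pp_total=$2
    # Guard against a zero or negative total to avoid dividing by zero.
    if [ "$pp_total" -le 0 ]; then
        echo 100
        return 0
    fi
    pp_pct=$(( 100 * pp_current / pp_total ))
    # Cap the value so an underestimated total cannot yield more than 100.
    if [ "$pp_pct" -gt 100 ]; then
        pp_pct=100
    fi
    echo "$pp_pct"
}

# Write a progress line to stderr only when the integer percentage changes.
# $1 = label, $2 = items processed so far, $3 = expected total.
last_pct=-1
report_progress() {
    rp_pct=$(progress_percent "$2" "$3")
    if [ "$rp_pct" -ne "$last_pct" ]; then
        printf '\r%s: %3d%%' "$1" "$rp_pct" >&2
        last_pct=$rp_pct
    fi
}
```

Calling `report_progress "Writing seeds" "$i" "$total"` inside a loop would then emit each percentage at most once. Note that capping only hides a wrong total; if the total itself is miscounted, that is the part that needs fixing.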

torognes commented 7 years ago

Thanks, I'll look into this soon.

frederic-mahe commented 7 years ago

Hi @lczech,

thanks for catching that.

I was able to replicate the bug by writing stderr to a file (2> log). Progress values for the -z option ("Writing seeds") exceed 100%. The wrong values are visible with less log, which displays the carriage returns between updates as ^M, but not with cat log: the terminal lets each update overwrite the previous one, so only the last value, 100%, remains on screen:

... Writing seeds: 149% ^MWriting seeds: 149% ^MWriting seeds: 100%
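As a hedged sketch of how such a log could be inspected without a pager, the snippet below converts carriage returns to newlines with `tr` and counts the updates above 100% with `awk`. The sample log is rebuilt from the excerpt in this comment and written to a temporary file; the variable names `log`, `updates` and `overflow` are illustrative assumptions, and a real Swarm log may contain other lines that this simple parsing does not handle.

```shell
# Build a small sample log with carriage-return-separated progress updates.
# The content mirrors the excerpt quoted in the thread; the file is temporary.
log=$(mktemp)
printf 'Writing seeds: 149%%\rWriting seeds: 149%%\rWriting seeds: 100%%\n' > "$log"

# Turn each carriage return into a newline so every update gets its own line.
updates=$(tr '\r' '\n' < "$log")
printf '%s\n' "$updates"

# Count the progress values above 100%.
overflow=$(printf '%s\n' "$updates" |
    awk -F': *' '{ v = $2; sub(/%.*/, "", v); if (v + 0 > 100) n++ } END { print n + 0 }')
echo "updates above 100%: $overflow"

rm -f "$log"
```

With a real run, the same `tr '\r' '\n' < log` step can be applied to the file created by `2> log`.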

torognes commented 7 years ago

Fixed in version 2.1.13, just released.