torognes / swarm

A robust and fast clustering method for amplicon-based studies
GNU Affero General Public License v3.0

fastidious boundary applies to the consolidated cluster mass? #178

Closed frederic-mahe closed 10 months ago

frederic-mahe commented 10 months ago

(question from a user)

short answer: yes

When using the fastidious option, the boundary threshold separates clusters into two groups: large and small. Small clusters can be grafted onto large clusters.

The default boundary value is 3. So, a cluster is small if it has a total abundance of 2 or less, meaning that it is composed of one amplicon of abundance 1, one amplicon of abundance 2, or two amplicons of abundance 1.

The boundary threshold applies to the consolidated cluster mass (sum of all abundances), not to the mass of the seed alone.
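To make the distinction concrete, here is a minimal sketch of the rule described above. It is not swarm's actual code: the names `cluster_mass` and `is_small` are hypothetical, and a cluster is represented simply as a list of amplicon abundances with the seed first.

```python
DEFAULT_BOUNDARY = 3  # default boundary value, as stated above


def cluster_mass(abundances):
    """Consolidated cluster mass: the sum of all amplicon abundances."""
    return sum(abundances)


def is_small(abundances, boundary=DEFAULT_BOUNDARY):
    """A cluster is small when its total mass is below the boundary.

    Illustrative helper only (hypothetical name, not part of swarm).
    """
    return cluster_mass(abundances) < boundary


# The three possible small clusters with the default boundary of 3.
small_examples = [[1], [2], [1, 1]]

# Seed abundance is 2 (below the boundary), but the cluster mass is 3,
# so the cluster counts as large.
seed_below_boundary = [2, 1]

for cluster in small_examples + [seed_below_boundary]:
    label = "small" if is_small(cluster) else "large"
    print(cluster, "mass =", cluster_mass(cluster), "->", label)
```

The last example is the case the question is about: judged by its seed alone the cluster would look small, but judged by its consolidated mass it is large.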

This question is now explicitly covered by 6 additional tests (see commit https://github.com/frederic-mahe/swarm-tests/commit/d4519509bd6cb40347383103ca3c4e6e321b0e80).

frederic-mahe commented 10 months ago

Documentation has been modified to make that distinction clearer (dev branch).