torognes / swarm

A robust and fast clustering method for amplicon-based studies
GNU Affero General Public License v3.0

Control the range of `d` values #76

Closed by frederic-mahe 7 years ago

frederic-mahe commented 8 years ago

The -d option controls the local clustering threshold used by swarm (the default is -d 1). The documentation says:

Any integer between 0 and 256 can be used.

As of now, zero or any positive integer value is accepted (8,000,000 for instance). Large d values don't really make sense though, as pairwise alignment results become less and less reliable as the number of differences increases. Therefore, we should only accept d values in the range 0-255 (inclusive).

An error message should be emitted if d is set too high. The present message says:

Error: Illegal resolution specified.

It could be replaced with a more informative message. For instance:

Error: number of differences -d must be in the range 0 to 255.

Internally, the d value can be stored in a uint, but it should be constrained to the byte range (0-255).
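
For illustration only, here is a minimal sketch of such a range check. It is not swarm's actual code: the names `parse_differences` and `max_differences` are hypothetical, and the sketch assumes the -d argument arrives as a string and is parsed with `std::strtol`. The parsed value is kept in an unsigned int, but anything outside 0-255 is rejected.

```cpp
#include <cerrno>
#include <cstdint>
#include <cstdlib>
#include <limits>
#include <string>

// Hypothetical upper bound: the largest value that fits in one byte (255).
constexpr long max_differences = std::numeric_limits<std::uint8_t>::max();

// Hypothetical helper (not part of swarm): parse the argument of -d.
// Returns true and stores the value in d only if arg is a plain decimal
// integer in the range 0-255; otherwise returns false and leaves d untouched.
bool parse_differences(const std::string& arg, unsigned int& d) {
    if (arg.empty()) {
        return false;  // nothing to parse
    }
    errno = 0;
    char* end = nullptr;
    const long value = std::strtol(arg.c_str(), &end, 10);
    if (errno != 0 || *end != '\0') {
        return false;  // overflow, or trailing non-digit characters
    }
    if (value < 0 || value > max_differences) {
        return false;  // outside the byte range
    }
    d = static_cast<unsigned int>(value);
    return true;
}
```

A caller would print the proposed message ("Error: number of differences -d must be in the range 0 to 255.") and exit whenever the function returns false, so that a value such as 8,000,000 is refused instead of being silently accepted.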

torognes commented 7 years ago

Check for correct range (0-255) and improved error message added.

frederic-mahe commented 7 years ago

Implemented in swarm 2.1.13.