cxcsds / ciao-contrib

Extra scripts and code to enhance the capabilities of CIAO.
GNU General Public License v3.0

merge_too_small add join parameter #859

Closed kglotfelty closed 6 months ago

kglotfelty commented 6 months ago

This closes #857

Users can now select whether to combine deficient groups with the smallest (min) or largest (max) neighbor. The default is min, which matches the existing code; the existing docs were wrong. Some testing makes it easy to see why max is not a good default. Consider a case where lots of deficient groups are clustered together. With max, they will all tend to coalesce into a single mega group, whereas with min they will tend to cluster into several smaller groups that just exceed the threshold. Of course, the single mega group may be what some folks want, which is why it was added as an option.
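
To make the min/max difference concrete, here is a toy 1D sketch. It is not the merge_too_small code: the function name `merge_deficient`, its arguments, and the left-to-right processing order are all assumptions made for illustration. Each group is reduced to a single count, and a deficient group (below `threshold`) is merged into whichever adjacent group has the smallest or largest count.

```python
def merge_deficient(counts, threshold, join="min"):
    """Toy model: merge groups below `threshold` into an adjacent group.

    Hypothetical helper for illustration only; the real tool works on
    grouped images, not on a flat list of counts.
    """
    if join not in ("min", "max"):
        raise ValueError("join must be 'min' or 'max'")
    pick = min if join == "min" else max

    groups = list(counts)
    while len(groups) > 1:
        # Assumption: handle the left-most deficient group first.
        idx = next((i for i, c in enumerate(groups) if c < threshold), None)
        if idx is None:
            break  # every group meets the threshold

        # Candidate neighbors are the groups directly to the left and right.
        neighbors = [j for j in (idx - 1, idx + 1) if 0 <= j < len(groups)]
        j = pick(neighbors, key=lambda k: groups[k])

        # Replace the two adjacent groups with their combined count.
        lo, hi = sorted((idx, j))
        groups[lo:hi + 1] = [groups[lo] + groups[hi]]
    return groups


# A cluster of six deficient groups between two large ones.
counts = [20, 2, 2, 2, 2, 2, 2, 20]
print(merge_deficient(counts, threshold=4, join="min"))  # [20, 4, 4, 4, 20]
print(merge_deficient(counts, threshold=4, join="max"))  # [32, 20]
```

In this toy case, min leaves three small groups that just reach the threshold, while max folds the whole cluster into one neighboring group, which matches the behavior described above.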