theochem / Selector

Methods for selecting a diverse subset from a (molecular) database.
https://selector.qcdevs.org
GNU General Public License v3.0

Add fix to hypersphere_overlap_of_subset #187

Closed: marco-2023 closed this 6 months ago

marco-2023 commented 7 months ago

The hypersphere_overlap_of_subset method failed whenever the data contained a redundant feature. This is a fairly common case, e.g. when finding a diverse set of alcohols, where all binary chains share a common feature. It was causing the grid_partitioning methods to fail.

The fix removes redundant coordinates before normalizing and computing the diversity. The method now fails only if all binary chains in the data are identical.
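
For illustration, here is a minimal sketch of the idea, not the actual code in this PR. It assumes that "redundant" means a feature with the same value in every sample, and that the normalization is a per-feature min-max scaling. The helper names `drop_constant_features` and `min_max_normalize` are hypothetical, and plain Python lists are used only to keep the example dependency-free.

```python
def drop_constant_features(rows):
    """Return the rows without columns that hold one value for every sample.

    Hypothetical helper: assumes a "redundant" feature is a constant column.
    """
    if not rows:
        raise ValueError("No samples given.")
    columns = list(zip(*rows))
    # Keep only the column indices where at least two samples differ.
    keep = [j for j, col in enumerate(columns) if min(col) != max(col)]
    if not keep:
        # Every sample is identical, so there is nothing left to compare.
        raise ValueError("All samples are identical; no feature varies.")
    return [[row[j] for j in keep] for row in rows]


def min_max_normalize(rows):
    """Scale every column to [0, 1]; assumes no column is constant."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    spans = [max(col) - min(col) for col in columns]
    return [
        [(value - low) / span for value, low, span in zip(row, lows, spans)]
        for row in rows
    ]


# Toy binary fingerprints: the first and last bits are set in every sample,
# like a functional group shared by all molecules in the set.
fingerprints = [
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [1, 0, 0, 1],
]

reduced = drop_constant_features(fingerprints)
normalized = min_max_normalize(reduced)
```

In this sketch, calling `min_max_normalize(fingerprints)` directly would divide by a zero range for the shared bits. Dropping those columns first leaves only the features that actually distinguish the samples.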