theochem / Selector

Methods for selecting diverse (molecular) database.
https://selector.qcdevs.org
GNU General Public License v3.0

[Similarity module] Scaled similarity support #199

Closed. Dhrumil07 closed this pull request 3 months ago.

Dhrumil07 commented 3 months ago

This PR adds support for a scaled similarity matrix to the similarity module by implementing the corresponding method.

Let $S(i,j)$ denote the similarity between the $i^{th}$ and $j^{th}$ samples. The scaled similarity matrix is then computed as $S_{\text{scaled}}(i,j)={S(i,j)\over\sqrt{S(i,i)S(j,j)}}$
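As a small worked example of the formula (the numbers are made up for illustration and do not come from this PR), take a $2\times 2$ similarity matrix with diagonal entries 4 and 9 and off-diagonal entries 2:

$$
S=\begin{pmatrix}4 & 2\\ 2 & 9\end{pmatrix},\qquad
{S(1,2)\over\sqrt{S(1,1)S(2,2)}}={2\over\sqrt{4\cdot 9}}={1\over 3},\qquad
{S(1,1)\over\sqrt{S(1,1)S(1,1)}}={4\over 4}=1
$$

After scaling, every diagonal entry becomes 1 and the off-diagonal entries become $1/3$, provided the diagonal entries are strictly positive.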

Fixes #122

codecov[bot] commented 3 months ago

Codecov Report

Merging #199 (b9c1ee6) into main (85b0aba) will decrease coverage by 0.26%. Report is 1 commits behind head on main. The diff coverage is 82.35%.

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main     #199      +/-   ##
==========================================
- Coverage   96.47%   96.21%   -0.26%
==========================================
  Files           9        9
  Lines         935      951      +16
==========================================
+ Hits          902      915      +13
- Misses         33       36       +3
```

| [Files](https://app.codecov.io/gh/theochem/Selector/pull/199?dropdown=coverage&src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=theochem) | Coverage Δ | |
|---|---|---|
| [selector/similarity.py](https://app.codecov.io/gh/theochem/Selector/pull/199?src=pr&el=tree&filepath=selector%2Fsimilarity.py&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=theochem#diff-c2VsZWN0b3Ivc2ltaWxhcml0eS5weQ==) | `94.73% <82.35%> (-5.27%)` | :arrow_down: |

Dhrumil07 commented 3 months ago

For binary similarity matrices whose diagonal elements are all 1, the `scaled_similarity_matrix()` function prints the message "No scaling is taking effect" and returns the original matrix unchanged. For all other similarity matrices, it returns the scaled similarity matrix.
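
The behaviour described above could look roughly like the following sketch. This is not the code merged in this PR: the function name `scaled_similarity_sketch`, the input validation, and the tolerance-based check on the diagonal are all assumptions made for illustration.

```python
import numpy as np


def scaled_similarity_sketch(sim):
    """Hypothetical sketch: scale S(i, j) by sqrt(S(i, i) * S(j, j))."""
    sim = np.asarray(sim, dtype=float)
    if sim.ndim != 2 or sim.shape[0] != sim.shape[1]:
        raise ValueError("Expected a square similarity matrix.")

    diag = np.diag(sim)

    # Assumption: a unit diagonal means scaling would change nothing,
    # so report it and hand back the matrix as-is.
    if np.allclose(diag, 1.0):
        print("No scaling is taking effect")
        return sim

    # Assumption: self-similarities must be positive for the square root
    # and the division to be well defined.
    if np.any(diag <= 0):
        raise ValueError("Diagonal entries must be strictly positive.")

    # np.outer(diag, diag)[i, j] == S(i, i) * S(j, j)
    return sim / np.sqrt(np.outer(diag, diag))


unit_diag = np.array([[1.0, 0.0], [0.0, 1.0]])
unchanged = scaled_similarity_sketch(unit_diag)  # prints the message

general = np.array([[4.0, 2.0], [2.0, 9.0]])
scaled = scaled_similarity_sketch(general)
```

Using `np.outer` keeps the computation vectorized and avoids an explicit double loop over samples. Whether the unit-diagonal case should print, warn, or stay silent is a design choice for the module.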

Dhrumil07 commented 3 months ago

@FanwangM, could you please review this PR?

Dhrumil07 commented 3 months ago

@FanwangM thank you for the suggestions. I have made the required changes.