Closed xychem closed 1 year ago
Merging #158 (32a8733) into main (d0b9c79) will not change coverage. The diff coverage is
100.00%
.
@@ Coverage Diff @@
## main #158 +/- ##
=======================================
Coverage 40.64% 40.64%
=======================================
Files 8 8
Lines 556 556
=======================================
Hits 226 226
Misses 330 330
Files Changed | Coverage Δ | |
---|---|---|
DiverseSelector/methods/dissimilarity.py | 19.04% <100.00%> (ø) |
This is reason for why we get 8 and 6 data points instead of the queried 12 for "Adapted Optimizable K-Dissimilarity Selection (OptiSim)" #147
Can you also add a testing for this change? @xychem Thanks.
@marco-2023 can you please review this change to make sure it is resolving the issue you were in reproducing the old notebook results? Thanks.
I checked the earlier versions of the code and noticed that the tolerance was 5.0. However the predict_radius
implementation was changed, so it may not be used in the same way anymore.
@FarnazH & @xychem: I went through the old and new code. In the old code the tol
value of the object was interpreted by the predict_radius
as the percent
(of the number of samples to select) tolerance and in the new code is the fraction
(of the number of samples to select) tolerance. The proposed tol=0.05
corresponds exactly to the old code's tol=5.0
. The proposed change solves the issue with the method.
@marco-2023, you should have a This pull request is waiting on your review.
comment (in a yellow box) at the top of this PR. Based on your comment, please approve this PR so that it can be merged.
Thanks for your comments and review. @marco-2023 I am going to merge this PR for now.
The number of upper size means the max number we selested. upper_size = tol*size, if tol=5.0, the upper_size will be larger than we want.