My testing showed that MolScore's version of some of the MolOpt (GuacaMol) functions differs from that of the original package. This PR applies some (partial) fixes.
Fingerprint size: the original GuacaMol functions use a $2^{32}$-bit (sparse) fingerprint, whereas the implementations in MolScore use a 1024-bit (dense) fingerprint. Folding into so few bits produces bit collisions, which causes all similarity values to be slightly overestimated. I increased the fingerprint length to 16384 bits to reduce the overestimation.
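To illustrate why a short fingerprint can inflate similarity, here is a minimal pure-Python sketch. It is a toy model, not RDKit or MolScore code: it assumes folding maps a sparse feature ID to `ID % n_bits`, and the feature IDs are hypothetical values picked so that two distinct features collide at 1024 bits but not at 16384.

```python
def fold(feature_ids, n_bits):
    """Fold sparse feature IDs into a fixed-length bit set (toy model: ID modulo n_bits)."""
    return {fid % n_bits for fid in feature_ids}


def tanimoto(a, b):
    """Tanimoto similarity between two sets of 'on' bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


# Hypothetical feature IDs: 1027 and 2051 both fold onto bit 3 when n_bits = 1024.
mol_a = {3, 1027, 5000, 70000}
mol_b = {3, 2051, 5000, 90001}

sim_sparse = tanimoto(mol_a, mol_b)                           # no folding
sim_1024 = tanimoto(fold(mol_a, 1024), fold(mol_b, 1024))     # collisions merge distinct features
sim_16384 = tanimoto(fold(mol_a, 16384), fold(mol_b, 16384))  # no collisions for these IDs

print(sim_sparse, sim_1024, sim_16384)
```

In this example the 1024-bit similarity is 0.5, while the sparse and 16384-bit values are both 1/3. Collisions can in principle also lower a similarity, so this only shows the typical direction of the bias.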
Fix Deco hop threshold: the implementation here used a threshold of 0.75, but the paper and code use 0.85.
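A rough sketch of what the threshold change does to scores, under the assumption that the threshold is the saturation point of a linearly clipped modifier (similarity scaled by `1 / upper_x` and capped at 1). The function name `clipped_score` and the raw similarity of 0.80 are hypothetical and only for illustration.

```python
def clipped_score(similarity, upper_x):
    """Scale similarity linearly so the score saturates at 1.0 once similarity reaches upper_x."""
    return max(0.0, min(1.0, similarity / upper_x))


similarity = 0.80  # hypothetical raw similarity value

score_old = clipped_score(similarity, upper_x=0.75)  # already saturated
score_new = clipped_score(similarity, upper_x=0.85)  # still below the cap

print(score_old, score_new)
```

Under this assumption, a molecule that already received the maximum score with the 0.75 threshold is scored slightly below 1 with 0.85.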
Align some MPO modifiers with GuacaMol's code in instances where paper and code differ. I didn't realize that GuacaMol's official implementation of their functions differs from the paper in several places. In these instances, I think it is better to side with their official code, because this is what prior works have used.
Fexofenadine_MPO: the paper specifies an STD of 2 for the TPSA and logP modifiers, but the code uses 10 and 1 respectively. This causes scores to generally be lower in GuacaMol's implementation compared to MolScore.
Osimertinib_MPO: sigmas are also different (original code)
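To show how sensitive a score is to sigma, here is a minimal sketch of a plain Gaussian modifier. It assumes the standard form $\exp(-\tfrac{1}{2}((x-\mu)/\sigma)^2)$; the one-sided variants used in the actual benchmarks are not reproduced, and the target of 4.0 and property value of 5.0 are hypothetical.

```python
import math


def gaussian_modifier(x, mu, sigma):
    """Score in (0, 1] that peaks at x == mu and decays with width sigma."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2)


value, target = 5.0, 4.0  # hypothetical property value and target

score_sigma_2 = gaussian_modifier(value, target, sigma=2.0)  # wider tolerance
score_sigma_1 = gaussian_modifier(value, target, sigma=1.0)  # narrower tolerance

print(round(score_sigma_2, 4), round(score_sigma_1, 4))
```

Halving sigma from 2 to 1 drops the score for the same one-unit deviation from about 0.88 to about 0.61, so mismatched sigmas shift scores noticeably.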
Fixed SMARTS string in Scaffold hop (in response to #45)
Fixed bug in isomer similarity calculation: the GuacaMol paper and code specify that the element-wise difference is taken with respect to elements in the target molecule only, while MolScore's implementation uses both the target and query molecules. A simple 1-line change fixes this.
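The difference between the two behaviours can be sketched as follows. This is not the MolScore or GuacaMol code: the per-element Gaussian score with sigma 1 and the arithmetic mean are assumptions made for illustration, and `isomer_score` is a hypothetical helper. Only the choice of which elements to iterate over reflects the fix described above.

```python
import math


def isomer_score(target_counts, query_counts, elements):
    """Average a per-element score over the given elements (toy scoring, sigma = 1)."""
    scores = []
    for element in elements:
        diff = query_counts.get(element, 0) - target_counts.get(element, 0)
        scores.append(math.exp(-0.5 * diff ** 2))
    return sum(scores) / len(scores)


target = {"C": 2, "H": 6, "O": 1}          # element counts of the target formula
query = {"C": 2, "H": 6, "O": 1, "N": 1}   # query carries one extra nitrogen

# Elements of the target only (the behaviour this PR switches to).
score_target_only = isomer_score(target, query, elements=target.keys())

# Elements of target and query combined (the previous behaviour).
score_union = isomer_score(target, query, elements=target.keys() | query.keys())

print(score_target_only, round(score_union, 4))
```

With target-only elements the extra nitrogen does not enter the element-wise term, so the two variants give different scores for the same query.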
Fixed fingerprint radius in legacy QSAR: JNK3 and GSK3B both use radius 2 fingerprints, not radius 3.
Also, I removed a spurious resource called molscore.configs.MolOpt-DF, which does not exist in the repo (maybe it exists for you locally?).
Currently I have this PR marked as a draft. This is because:
The functions still don't seem to be 100% aligned (I still found small differences in Sitagliptin MPO, Valsartan SMARTS, and scaffold hop).
If these fixes are accepted, the same changes should also be made in the other places in the repo where these functions are defined.