MolecularAI / reinvent-scoring

Apache License 2.0

score_transformations #4

Closed marcostenta closed 1 year ago

marcostenta commented 1 year ago

Hi guys, great work. I would recommend a minor change in the definition of the transformation equations in the TransformationFactory class (https://github.com/MolecularAI/reinvent-scoring/blob/reinvent.3.2/reinvent_scoring/scoring/score_transformations.py). This is mostly aesthetic, but it could help the readability of the code.

The sigmoid and inverse sigmoid are defined with two equations that have a slightly different functional form, whereas the only difference should be a minus sign on the slope.

    def _exp(pred_val, low, high, k) -> float:
        return math.pow(10, (10 * k * (pred_val - (low + high) * 0.5) / (low - high)))

    transformed = [1 / (1 + _exp(pred_val, _low, _high, _k)) for pred_val in predictions]

    def _reverse_sigmoid_formula(value, low, high, k) -> float:
        try:
            return 1 / (1 + 10 ** (k * (value - (high + low) / 2) * 10 / (high - low)))
        except:
            return 0

    transformed = [_reverse_sigmoid_formula(pred_val, _low, _high, _k) for pred_val in predictions]

Could something like this work (swapping high and low in the denominator)?

    def _exp(pred_val, low, high, k) -> float:
        return math.pow(10, (10 * k * (pred_val - (low + high) * 0.5) / (high - low)))

    transformed = [1 / (1 + _exp(pred_val, _low, _high, _k)) for pred_val in predictions]
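A quick numerical check of this proposal, as a minimal sketch. The three formulas follow the snippets in this comment; the function names `sigmoid`, `reverse_sigmoid`, and `proposed` and the parameter values are made up for illustration and are not part of reinvent-scoring. With `(high - low)` in the denominator, the `_exp` form appears to reproduce the existing reverse sigmoid, and the two existing transforms sum to 1, so they differ only in the sign of the slope.

```python
import math

def sigmoid(x, low, high, k):
    # Current sigmoid form: (low - high) in the denominator.
    return 1 / (1 + math.pow(10, 10 * k * (x - (low + high) * 0.5) / (low - high)))

def reverse_sigmoid(x, low, high, k):
    # Current reverse sigmoid form (without the try/except guard).
    return 1 / (1 + 10 ** (k * (x - (high + low) / 2) * 10 / (high - low)))

def proposed(x, low, high, k):
    # Proposed form: same expression as sigmoid, but with (high - low).
    return 1 / (1 + math.pow(10, 10 * k * (x - (low + high) * 0.5) / (high - low)))

low, high, k = 2.0, 8.0, 0.25  # hypothetical parameters
for x in (0.0, 2.0, 5.0, 8.0, 10.0):
    # The proposed form matches the existing reverse sigmoid.
    assert math.isclose(proposed(x, low, high, k), reverse_sigmoid(x, low, high, k))
    # Sigmoid and reverse sigmoid are mirror images of each other.
    assert math.isclose(sigmoid(x, low, high, k) + reverse_sigmoid(x, low, high, k), 1.0)
```

Note that the sketch drops the bare `except` of the original, so it would raise on overflow or when `high == low` instead of returning 0.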

Another question: Cummins uses a similar function, but with the natural exponential e^x (if I understand correctly what exp is in their R implementation), whereas you use a power of 10. Is there a deliberate reason for this? Their utility function takes the form:

    1 / (1 + exp((A1 - (ul + ll) / 2) * slope / (ul - ll))) * (1 - shift) + shift

https://pubs.acs.org/doi/full/10.1021/acs.jmedchem.5b01338
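On the base question: since 10^a = e^(a ln 10), the choice of base only rescales the slope by ln 10 (about 2.303). The sketch below assumes the utility function has the form quoted in this comment, with the multiplication signs restored; it has not been checked against the R implementation. `reinvent_reverse_sigmoid` and `cummins_utility` are hypothetical names, not functions from either code base.

```python
import math

def reinvent_reverse_sigmoid(x, low, high, k):
    # Base-10 form of the reverse sigmoid transformation.
    return 1 / (1 + 10 ** (k * (x - (high + low) / 2) * 10 / (high - low)))

def cummins_utility(x, ll, ul, slope, shift=0.0):
    # Natural-exponential form as quoted in this thread (an assumption).
    return 1 / (1 + math.exp((x - (ul + ll) / 2) * slope / (ul - ll))) * (1 - shift) + shift

low, high, k = 2.0, 8.0, 0.25  # hypothetical parameters
slope = 10 * k * math.log(10)  # slope at which the two forms coincide
for x in (0.0, 2.0, 5.0, 8.0, 10.0):
    assert math.isclose(reinvent_reverse_sigmoid(x, low, high, k),
                        cummins_utility(x, low, high, slope))
```

Under that assumption, the two forms differ only in how the slope is parameterized, plus the `shift` term, which raises the lower plateau from 0 to `shift`.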

halx commented 1 year ago

Many thanks for this. We will be reworking how scoring works so this may come in handy. We probably will also try to see if we can make adding transforms easier. I am not sure if there is any particular reason for said functional form but it is practical and does its job.

marcostenta commented 1 year ago

Hi Hannes, thanks for the answer. The reason for asking is that we are working on (and hopefully will soon release) a Python package dedicated to multiparameter scoring. Among other things, it will include a set of desirability functions, some uncertainty handling, sensitivity analysis, and error propagation. The idea behind it is to unify the math part of the scoring across a number of applications. We have a set of tools for direct design (virtual screening, progression decisions, etc.) and inverse design (e.g. REINVENT). We find it increasingly cumbersome that each of these applications has its own multiparameter scoring handling. The scores are impossible to compare, and we need to construct the input (and parse the output) differently for each.

I will drop you a mail when it is out. If possible, I would like to discuss with you making it available alongside the multiparameter scoring system of your new REINVENT, because this would greatly simplify the integration of REINVENT into our scientific workflow.

Cheers, m
