gablg1 / ORGAN

Objective-Reinforced Generative Adversarial Networks (ORGAN) for Sequence Generation Models
GNU General Public License v2.0

How do you compute Druglikeness in the paper? #5

Closed benstaf closed 7 years ago

benstaf commented 7 years ago

How do you compute Druglikeness in the paper?

On page 5, you say that Druglikeness is a linear combination of Novelty, Diversity, Solubility and Synthesizability, but the coefficients of this linear combination are not specified.

In the code (line 313 of mol_metrics.py), druglikeness seems to be the arithmetic mean of those parameters. However, this formula does not reproduce the figures reported in the paper.

To figure out those coefficients, I tried to solve the resulting linear system (just 8 equations), but again no solution exists. Could you please clarify?
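The consistency check described above can be sketched as a least-squares fit: with 8 equations and 4 unknown coefficients, the system is overdetermined, so an exact solution exists only if the fitted residual is essentially zero. This is a minimal pure-Python sketch; the component scores below are synthetic placeholders, not figures from the paper, and the helper names `solve_square` and `least_squares` are made up for illustration.

```python
def solve_square(m, rhs):
    """Solve m @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [r] for row, r in zip(m, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def least_squares(rows, targets):
    """Fit rows @ w ~= targets via the normal equations.

    Returns (weights, largest absolute residual over all equations).
    """
    k = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    atb = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    w = solve_square(ata, atb)
    residual = max(
        abs(sum(wi * xi for wi, xi in zip(w, r)) - t)
        for r, t in zip(rows, targets)
    )
    return w, residual


# Synthetic placeholder scores: one row per experiment, four component columns.
rows = [
    [0.9, 0.2, 0.5, 0.7],
    [0.1, 0.8, 0.4, 0.6],
    [0.3, 0.3, 0.9, 0.2],
    [0.7, 0.6, 0.1, 0.8],
    [0.5, 0.9, 0.7, 0.1],
    [0.2, 0.4, 0.6, 0.9],
    [0.8, 0.1, 0.3, 0.4],
    [0.6, 0.7, 0.8, 0.3],
]

# Case 1: targets really are the arithmetic mean, so the system is consistent.
targets_ok = [sum(r) / 4.0 for r in rows]
weights_ok, residual_ok = least_squares(rows, targets_ok)

# Case 2: one target is shifted, so no exact linear combination exists.
targets_bad = list(targets_ok)
targets_bad[0] += 0.2
weights_bad, residual_bad = least_squares(rows, targets_bad)

print(weights_ok, residual_ok)
print(weights_bad, residual_bad)
```

A residual that stays clearly above rounding error, as in the second case, indicates that the reported values are not any fixed linear combination of the four components.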

couteiral commented 7 years ago

Hi.

You're right: there is an error in the paper. The drug_candidate metric is a combination of solubility, novelty, synthesizability and conciseness. Quoting the corresponding function from the mol_metrics.py file:

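def drug_candidate(smile, train_smiles):
    # Solubility term: logP passed through constant_bump with bounds 0.210 and 0.945
    good_logp = constant_bump(logP(smile), 0.210, 0.945)
    # Synthesizability term
    sa = SA_score(smile)
    # Novelty term, measured against the training set
    novel = soft_novelty(smile, train_smiles)
    # Conciseness term
    compact = conciseness(smile)
    # Unweighted arithmetic mean of the four terms
    val = (compact + good_logp + sa + novel) / 4.0
    return val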
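To make the averaging concrete, here is a small self-contained sketch that does not depend on mol_metrics.py. The function `plateau_reward` is a hypothetical stand-in for `constant_bump`: it assumes a reward of 1.0 inside the interval and a Gaussian fall-off outside, which may differ from the actual implementation. The three other component scores are arbitrary placeholder numbers.

```python
import math


def plateau_reward(x, low, high, decay=0.025):
    """Hypothetical stand-in for constant_bump (assumed behaviour).

    Returns 1.0 when low <= x <= high and decays as a Gaussian outside.
    """
    if x < low:
        return math.exp(-((x - low) ** 2) / decay)
    if x > high:
        return math.exp(-((x - high) ** 2) / decay)
    return 1.0


def mean_score(components):
    """Unweighted arithmetic mean, mirroring the division by 4.0 above."""
    return sum(components) / len(components)


# Placeholder values: a logP of 0.5 lies inside [0.210, 0.945].
good_logp = plateau_reward(0.5, 0.210, 0.945)
sa, novel, compact = 0.8, 0.6, 1.0  # arbitrary illustrative scores

score = mean_score([compact, good_logp, sa, novel])
print(score)
```

Because every term carries the same weight of 1/4, a single low component can lower the score by at most 0.25, whatever the other three are.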

However, it is possible that at an earlier stage of development this function matched the description in the paper. You could try replacing the previous snippet with this one and optimizing against it:

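def drug_candidate(smile, train_smiles):
    # Solubility term: logP passed through constant_bump with bounds 0.210 and 0.945
    good_logp = constant_bump(logP(smile), 0.210, 0.945)
    # Synthesizability term
    sa = SA_score(smile)
    # Novelty term, measured against the training set
    novel = soft_novelty(smile, train_smiles)
    # Diversity term, replacing conciseness as described in the paper
    diverseness = diversity(smile)
    # Unweighted arithmetic mean of the four terms
    val = (diverseness + good_logp + sa + novel) / 4.0
    return val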

Anyway, if you are interested in generating drug-like molecules, check our recent work in chemically-oriented ORGANs. Among other things it implements Lipinski's rule-of-five and chemical beauty scores.
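For readers unfamiliar with Lipinski's rule-of-five mentioned above, the textbook thresholds can be sketched as follows. This is not the implementation from the referenced work. The function name `lipinski_violations` is hypothetical, and the descriptor values are assumed to have been computed elsewhere (for example with a cheminformatics toolkit); the numbers in the usage lines are illustrative only.

```python
def lipinski_violations(mol_weight, logp, h_donors, h_acceptors):
    """Count violations of Lipinski's rule-of-five.

    Thresholds: molecular weight <= 500 Da, logP <= 5,
    hydrogen-bond donors <= 5, hydrogen-bond acceptors <= 10.
    """
    checks = [
        mol_weight > 500,
        logp > 5,
        h_donors > 5,
        h_acceptors > 10,
    ]
    return sum(checks)


# Illustrative descriptor values for a small molecule: no violations.
small = lipinski_violations(mol_weight=180.2, logp=1.2, h_donors=1, h_acceptors=4)

# Illustrative values for a large, lipophilic molecule: two violations.
large = lipinski_violations(mol_weight=700.0, logp=6.0, h_donors=2, h_acceptors=8)

print(small, large)
```

The rule is usually read as a filter: a compound with more than one violation is considered less likely to be orally active.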

Feel free to ask any questions, Carlos