connorcoley / scscore

MIT License

Errors when generating SCScore for a given molecule #3

Open chengthefang opened 6 years ago

chengthefang commented 6 years ago

Dear Connor,

This is Cheng. I have been using the scscore Python code to evaluate the synthetic complexity of some molecules I am interested in. However, when I used standalone_model_numpy.py, I got an error. I suspect I made a mistake somewhere and hope you can help me figure it out.

First, I loaded the modules as follows:

import standalone_model_numpy
from standalone_model_numpy import SCScorer
model.restore()
Restored variables from /Users/cheng/Downloads/scscore-master/models/full_reaxys_model_1024bool/model.ckpt-10654.as_numpy.pickle
<standalone_model_numpy.SCScorer object at 0x1037512b0>

Then I tried to predict the score for a given molecule, e.g. 'mol':

mol = 'CCCNc1ccccc1'
(smi,score)=model.get_score_from_smi(mol)

But I got the error message:

Traceback (most recent call last):
  File "", line 1, in
  File "/Users/cheng/Downloads/scscore-master/scscore/standalone_model_numpy.py", line 81, in get_score_from_smi
    fp = np.array((self.smi_to_fp(smi)), dtype=np.float32)
  File "/Users/cheng/Downloads/scscore-master/scscore/standalone_model_numpy.py", line 62, in smi_to_fp
    return self.mol_to_fp(self, Chem.MolFromSmiles(smi))
  File "/Users/cheng/Downloads/scscore-master/scscore/standalone_model_numpy.py", line 53, in mol_to_fp
    useChirality=True), dtype=np.bool)
Boost.Python.ArgumentError: Python argument types in
    rdkit.Chem.rdMolDescriptors.GetMorganFingerprintAsBitVect(Mol, int)
did not match C++ signature:
    GetMorganFingerprintAsBitVect(RDKit::ROMol mol, int radius, unsigned int nBits=2048, boost::python::api::object invariants=[], boost::python::api::object fromAtoms=[], bool useChirality=False, bool useBondTypes=True, bool useFeatures=False, boost::python::api::object bitInfo=None)

I have no idea what happened here. I would much appreciate your suggestions, especially on how to generate SCScores for a library of molecules.

Thanks, Cheng

connorcoley commented 6 years ago

Hi Cheng,

It's not immediately obvious to me what is causing your error. What version of RDKit are you using? I am currently using 2017.09.1 without any issues; I don't believe that the arguments to the fingerprinting function have changed. Can you try running the following and let me know if it raises any errors?

from rdkit.Chem.rdMolDescriptors import GetMorganFingerprintAsBitVect
import rdkit.Chem as Chem 

smi = 'CCCNc1ccccc1'
mol = Chem.MolFromSmiles(smi)
if mol is None:
    raise ValueError('Invalid smiles')
fp = GetMorganFingerprintAsBitVect(mol, 2, nBits=1024, useChirality=True)
print(fp)

chengthefang commented 6 years ago

Hi Connor,

I think you are probably correct. I am using RDKit v2015.09.01, installed on my Mac with the "conda install -c omnia rdkit" command. I ran your code and got the same error, so it is probably a version issue.

However, I had a hard time installing the latest RDKit following the "http://www.rdkit.org/docs/Install.html" method: Conda kept installing Python 3.6.5, which does not seem to be compatible with RDKit on Mac. Is there a better way to install the latest RDKit on Mac?

Thanks, Cheng

connorcoley commented 6 years ago

Cheng,

Ah okay, I think that upgrading RDKit should solve your issue. I recall seeing some posts about the new builds of RDKit (toward the end of 2017) not working right with the most recent conda. I can't remember if downgrading conda or downgrading python was the solution.

For what it's worth, I recently installed RDKit on a new mac using conda version 4.3.34, python 3.6.1, and RDKit version 2017.09.3.
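
A quick way to see which side of this version gap an environment falls on is to compare the installed RDKit version string against the versions mentioned in this thread. The following is only a sketch: the helper names are hypothetical, and the 2017.09.1 floor is an assumption taken from the version reported as working above, not a documented minimum for scscore.

```python
import re

# Assumption: 2017.09.1 is used as the floor only because it is the version
# reported as working in this thread; the true minimum may be lower.
KNOWN_GOOD = (2017, 9, 1)


def parse_rdkit_version(version):
    """Turn a string such as '2017.09.1' or '2015.09.01' into a tuple of ints."""
    parts = [int(p) for p in re.findall(r"\d+", version)[:3]]
    if len(parts) < 2:
        raise ValueError("unrecognised RDKit version string: %r" % version)
    # Pad to three fields so that '2017.09' compares like '2017.09.0'.
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def is_known_good(version, floor=KNOWN_GOOD):
    """Return True if `version` is at least the floor used in this sketch."""
    return parse_rdkit_version(version) >= floor


def installed_rdkit_version():
    """Return the installed RDKit version string, or None if it is unavailable."""
    try:
        import rdkit
    except ImportError:
        return None
    return getattr(rdkit, "__version__", None)


# The three versions mentioned in this thread.
for v in ("2015.09.01", "2017.09.1", "2017.09.3"):
    print(v, is_known_good(v))

current = installed_rdkit_version()
if current is None:
    print("RDKit version could not be determined in this environment")
else:
    print("installed:", current, "known good:", is_known_good(current))
```

Comparing parsed tuples rather than raw strings avoids surprises from zero-padded fields such as "09.01" versus "09.1".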

Connor

chengthefang commented 6 years ago

@connorcoley Hi Connor, thank you very much for your reply. I found a solution that lets me update to an RDKit 2017 version with a downgraded Conda version (i.e. Python 3.5). With that, your test code works well. I will now try to generate the SCScore for my molecules. Hopefully everything goes well.

Thanks, Cheng
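
For the original question about scoring a library of molecules, one possible approach is to loop over the SMILES strings and call the scoring method once per molecule, collecting failures rather than stopping at the first one. This is a minimal sketch, not the repository's own batch interface: `score_library` and `fake_score` are hypothetical names, and the only thing assumed about the real model is what this thread shows, namely that `get_score_from_smi` takes a SMILES string and returns a `(smi, score)` pair.

```python
def score_library(smiles_list, score_fn):
    """Score each SMILES string with `score_fn`, which must return (smi, score).

    Returns two lists: (input_smiles, score) pairs, and (input_smiles, error)
    pairs for molecules whose scoring raised an exception.
    """
    scored, failed = [], []
    for smi in smiles_list:
        smi = smi.strip()
        if not smi:
            continue  # skip blank lines, e.g. at the end of a file
        try:
            _, score = score_fn(smi)
        except Exception as exc:
            failed.append((smi, repr(exc)))
            continue
        scored.append((smi, float(score)))
    return scored, failed


def fake_score(smi):
    """Stand-in scorer so the sketch runs without RDKit; NOT a real SCScore."""
    if "X" in smi:
        raise ValueError("cannot parse %r" % smi)
    return smi, float(len(smi))


scored, failed = score_library(["CCCNc1ccccc1", "", "CCO", "X"], fake_score)
print(scored)
print(failed)

# With the model from this thread (requires scscore and a working RDKit);
# 'library.smi' is a hypothetical file with one SMILES string per line, and
# calling SCScorer() without arguments is an assumption:
#   from standalone_model_numpy import SCScorer
#   model = SCScorer()
#   model.restore()
#   with open('library.smi') as fh:
#       scored, failed = score_library(fh, model.get_score_from_smi)
```

Passing the scorer in as a function keeps the loop independent of RDKit, so it can be tried with a stand-in before the environment issue above is resolved.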