wengong-jin / icml18-jtnn

Junction Tree Variational Autoencoder for Molecular Graph Generation (ICML 2018)
MIT License

Cannot load vocab -- mol is None #48

Open tawe141 opened 5 years ago

tawe141 commented 5 years ago

I get the following error with this code after generating the vocabulary:

from fast_jtnn import *
vocab = Vocab('data/moses/vocab.txt')
[16:21:52] SMILES Parse Error: syntax error for input: 'd'
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-2-b4fa4df471ab> in <module>()
----> 1 vocab = Vocab('data/moses/vocab.txt')

/home/erictaw/icml18-jtnn/fast_jtnn/vocab.pyc in __init__(self, smiles_list)
     14         self.vocab = smiles_list
     15         self.vmap = {x:i for i,x in enumerate(self.vocab)}
---> 16         self.slots = [get_slots(smiles) for smiles in self.vocab]
     17         Vocab.benzynes = [s for s in smiles_list if s.count('=') >= 2 and Chem.MolFromSmiles(s).GetNumAtoms() == 6] + ['C1=CCNCC1']
     18         Vocab.penzynes = [s for s in smiles_list if s.count('=') >= 2 and Chem.MolFromSmiles(s).GetNumAtoms() == 5] + ['C1=NCCN1','C1=NNCC1']

/home/erictaw/icml18-jtnn/fast_jtnn/vocab.pyc in get_slots(smiles)
      5 def get_slots(smiles):
      6     mol = Chem.MolFromSmiles(smiles)
----> 7     return [(atom.GetSymbol(), atom.GetFormalCharge(), atom.GetTotalNumHs()) for atom in mol.GetAtoms()]
      8 
      9 class Vocab(object):

AttributeError: 'NoneType' object has no attribute 'GetAtoms'

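As far as I can tell, this error arises because Chem.MolFromSmiles returns None when RDKit is given an invalid SMILES string, and get_slots then calls GetAtoms() on that None. Why is an invalid string being parsed here?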
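One possible reading of the traceback: the signature shown is `__init__(self, smiles_list)`, and the string RDKit failed to parse is 'd', which is also the first character of 'data/moses/vocab.txt'. That suggests the path string itself is being iterated character by character. The sketch below assumes Vocab expects a list of SMILES strings and that vocab.txt holds one SMILES per line. The helper names `read_vocab_file` and `find_unparseable` are hypothetical and not part of fast_jtnn.

```python
def read_vocab_file(path):
    """Return the non-empty lines of a vocabulary file as a list of strings.

    Assumes one SMILES string per line (an assumption about vocab.txt).
    """
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]


def find_unparseable(smiles_list):
    """Return the entries for which RDKit's MolFromSmiles gives None."""
    from rdkit import Chem  # imported lazily so read_vocab_file works without RDKit
    return [s for s in smiles_list if Chem.MolFromSmiles(s) is None]


# Iterating over a path string yields single characters; the first one
# matches the input reported in the SMILES parse error.
first_item = next(iter('data/moses/vocab.txt'))

# Hypothetical usage, assuming Vocab takes a list rather than a path:
# from fast_jtnn import Vocab
# vocab = Vocab(read_vocab_file('data/moses/vocab.txt'))
```

If the error persists after passing a list, running `find_unparseable` on that list should show which vocabulary entries, if any, RDKit rejects.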