akensert / molgraph

Graph neural networks for molecular machine learning. Implemented and compatible with TensorFlow and Keras.
https://molgraph.readthedocs.io/en/latest/
MIT License
47 stars 5 forks source link

nan in loss with those molecules #18

Closed thegodone closed 1 year ago

thegodone commented 1 year ago

basically due to the features.GasteigerCharge() that can be null few molecules with like whit "P" atoms will produce nans during training. Can you replace the nan features for those molecules by 0.0 in such case ?

akensert commented 1 year ago

This is the current implementation:

@atom_features.register(name='gasteiger_charge')
class GasteigerCharge(Feature):
    def __call__(self, atom: Chem.Atom) -> float:
        mol = atom.GetOwningMol()
        rdPartialCharges.ComputeGasteigerCharges(mol)
        val = atom.GetDoubleProp('_GasteigerCharge')
        return val if val is not None else 0.0

Isn't it already taking care of it? Can you give me an example molecule that causes the issue and I'll look into it.

thegodone commented 1 year ago

this is a molecule that failed : ['OCPH(CO)CO'] I add GasteigerCharges and it goes to nan I remove it and it works. I think the vales are not None but Nan.

EDIT string was interpreted by github here the correct smiles:

'OC[PH](CO)(CO)CO'
akensert commented 1 year ago
smiles = 'OCPH(CO)CO'
mol = molgraph.chemistry.molecule_from_string(smiles)
ValueError: Inputted `molecule` (None) is invalid

Cannot convert SMILES to an rdkit mol object. Also tried with rdkit.Chem.MolFromSmiles(smiles, sanitize=False)

thegodone commented 1 year ago

I fix the smiles in github issue page see upper