openforcefield / cmiles

Generate canonical molecule identifiers for quantum chemistry database
https://cmiles.readthedocs.io
MIT License

Assigning formal charge can lead to changes in hydrogen counts #21

Closed: ChayaSt closed this issue 5 years ago

ChayaSt commented 5 years ago

When loading a molecule from a QCArchive molecule in which an atom should carry a formal charge, oechem.OEAssignFormalCharges(molecule) can add implicit hydrogens where we don't want them, because it doesn't know the charge of that atom.

Example molecule: [image]

When loading this molecule from an SDF file where the atoms have charges assigned to them, the canonical SMILES is 'c1ccc(cc1)c2[n-]nnn2'. But when loading it from a QCArchive molecule in cmiles, the canonical SMILES is 'c1ccc(cc1)c2[nH]nnn2'.
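
To illustrate why a missing charge turns into an extra hydrogen, here is a minimal sketch of a toy valence model. It is not OpenEye's algorithm and does not call oechem; the function name implicit_hydrogens and the NEUTRAL_VALENCE table are hypothetical and exist only for this illustration. The assumption is that a toolkit with no charge information fills each atom up to its neutral valence.

```python
# Toy valence model for illustration only (an assumption, not OpenEye's method).
# Neutral valences for the two elements where "valence = neutral + charge" holds.
NEUTRAL_VALENCE = {"N": 3, "O": 2}


def implicit_hydrogens(element, bond_order_sum, formal_charge=0):
    """Return the hydrogens needed to fill an atom's valence in this toy model.

    For N and O, a negative formal charge lowers the target valence by one
    (N- has valence 2) and a positive charge raises it (N+ has valence 4).
    """
    target = NEUTRAL_VALENCE[element] + formal_charge
    return max(target - bond_order_sum, 0)


# The tetrazole nitrogen in question has two single ring bonds in the
# Kekule form, so its bond order sum is 2.
without_charge = implicit_hydrogens("N", 2)                 # charge unknown, treated as 0
with_charge = implicit_hydrogens("N", 2, formal_charge=-1)  # charge known, as in the SDF

print("charge unknown:", without_charge, "implicit H")
print("charge known:  ", with_charge, "implicit H")
```

Under these assumptions the nitrogen with no known charge gets one hydrogen, matching the [nH] form, while the same nitrogen with a formal charge of -1 gets none, matching the [n-] form.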

ChayaSt commented 5 years ago

Addressed by #20