aspuru-guzik-group / selfies

Robust representation of semantically constrained graphs, in particular for molecules in chemistry
Apache License 2.0

Kekulization Failure #120

Open Olabisi-Aishat-Bello opened 1 month ago

Olabisi-Aishat-Bello commented 1 month ago

When I run the following code:

import selfies as sf
sf.encoder('Cc1ccc(NC(=O)c2ccc(-c3[c]n(Br)ccs[nH]3)c(C(F)(F)F)c2)cc1Nc1nccc(-c2cccnc2)n1')

I get the error below, even though RDKit's SMILES parser is able to produce a Mol object from the same SMILES string:

EncoderError: kekulization failed SMILES: Cc1ccc(NC(=O)c2ccc(-c3[c]n(Br)ccs[nH]3)c(C(F)(F)F)c2)cc1Nc1nccc(-c2cccnc2)n1

MarioKrenn6240 commented 1 month ago

Thank you, I can reproduce the behaviour, and we are looking into it right now. We will get back to you soon!

MarioKrenn6240 commented 1 month ago

Indeed, Robert Pollice has identified the problem: the encoder makes a mistake when accounting for the valences of radicals. It should be easy to fix. We hope to do so soon and will publish the fix as SELFIES 2.1.3. Thanks for reporting this!
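Until a fixed release is available, one possible stopgap is to catch the error so that a single failing SMILES string does not abort a whole batch. The sketch below is only an illustration, not part of the library: `split_encodable` is a hypothetical helper name, and the encoder and exception type are passed in as arguments so the snippet runs without selfies installed. With the real library you would presumably pass `sf.encoder` and `sf.EncoderError`.

```python
def split_encodable(smiles_list, encoder, error_type=Exception):
    """Encode each SMILES string, collecting failures instead of raising.

    `split_encodable` is a hypothetical helper, not a selfies function.
    Returns two dicts: SMILES -> encoded string, and SMILES -> error message.
    """
    encoded, failed = {}, {}
    for smiles in smiles_list:
        try:
            encoded[smiles] = encoder(smiles)
        except error_type as exc:
            # Keep the message so failing inputs can be reported or retried later.
            failed[smiles] = str(exc)
    return encoded, failed


# Assumed usage with the real library (requires selfies to be installed):
#   import selfies as sf
#   ok, bad = split_encodable(my_smiles, sf.encoder, sf.EncoderError)
```

The strings that end up in the second dict can be set aside and encoded again once the fix has been published.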

Olabisi-Aishat-Bello commented 1 month ago

Awesome! Thank you!