pckroon / pysmiles

A lightweight python-only library for reading and writing SMILES strings
Apache License 2.0
147 stars, 21 forks

Inconsistent writing and reading of mono-atomic smiles for Se and As #27

Closed. TimoSommer closed this issue 1 year ago

TimoSommer commented 1 year ago

If we have a graph featuring a single atom outside the organic subset, such as Se, the SMILES writer outputs it as 'Se'. According to the SMILES rules, I would have expected '[Se]'. If we then read the string 'Se' back in with the SMILES reader, it fails to recover the original atom and instead outputs a graph featuring 'S', because the reader relies on brackets to recognise elements outside the organic subset. I don't know whether this is an issue with the SMILES reader or the writer, but the two are inconsistent.
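To illustrate why the unbracketed string is misread, here is a minimal sketch of a reader that relies on brackets. It is not pysmiles code: `first_atom` and `ORGANIC_SUBSET` are hypothetical names, and the sketch ignores isotopes, charges, aromatic lowercase symbols and everything after the first atom. It only assumes the standard SMILES rule that B, C, N, O, P, S, F, Cl, Br and I may appear without brackets.

```python
import re

# Elements that SMILES allows outside brackets (the "organic subset").
# Two-letter symbols come first so that "Cl" is not read as "C".
ORGANIC_SUBSET = ("Cl", "Br", "B", "C", "N", "O", "P", "S", "F", "I")


def first_atom(smiles: str) -> str:
    """Return the element symbol of the first atom in a SMILES string.

    Hypothetical helper for illustration only, not part of pysmiles.
    """
    bracket = re.match(r"\[([A-Z][a-z]?)", smiles)
    if bracket:
        # Inside brackets any element symbol is accepted.
        return bracket.group(1)
    for symbol in ORGANIC_SUBSET:
        # Outside brackets only organic-subset symbols are recognised.
        if smiles.startswith(symbol):
            return symbol
    raise ValueError(f"no atom found at start of {smiles!r}")


print(first_atom("[Se]"))  # Se
print(first_atom("Se"))    # S
print(first_atom("Cl"))    # Cl
```

With this kind of reader, 'Se' can only ever be seen as sulfur followed by a stray character, so a writer that emits 'Se' cannot round-trip.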

pckroon commented 1 year ago

Oh, that's not supposed to happen. Thanks for finding and reporting this! The culprit is in smiles_helper.py (https://github.com/pckroon/pysmiles/blob/master/pysmiles/smiles_helper.py#L144). I need to dig a bit further to find out why Se and As are special cases. If they should be special cases, the issue lies in the reader.

I'll try to plan a day to do some maintenance...
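For reference, the writer-side behaviour the reporter expected can be sketched as follows. This is an assumption-laden illustration, not the code at the linked line: `atom_token` is a hypothetical name, and the sketch only decides on brackets from the element symbol, ignoring charge, isotope, hydrogen count, stereochemistry and aromaticity, all of which can also force brackets.

```python
# Elements that may be written without brackets in SMILES.
ORGANIC_SUBSET = frozenset({"B", "C", "N", "O", "P", "S", "F", "Cl", "Br", "I"})


def atom_token(element: str) -> str:
    """Format a neutral, default-valence atom as a SMILES token.

    Hypothetical helper for illustration only, not part of pysmiles.
    """
    if element in ORGANIC_SUBSET:
        return element
    # Anything outside the organic subset, including Se and As,
    # is wrapped in brackets so a reader can tell it apart from S or A*.
    return f"[{element}]"


for element in ("C", "Cl", "Se", "As"):
    print(element, "->", atom_token(element))
```

Under this rule Se and As are not special cases: they are written as '[Se]' and '[As]', which a bracket-reliant reader parses back to the same elements.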