gruenewald-lab / CGsmiles

Coarse-Grained Smiles (CGsmiles) for representing arbitrarily complex molecules using a compact line notation

Chiral #34

Closed fgrunewald closed 5 days ago

fgrunewald commented 6 days ago

Chirality in SMILES sucks. That is an unequivocal truth.

The main reason is that SMILES ties the 3D configuration to the order of the nodes in the string. For CGsmiles, this is a problem because it is not at all obvious in which order the fragments or bonding descriptors of a complex molecule are added. As a user, one can forget about figuring out the right order to get the correct chirality. That is a problem even for many simple molecules (e.g. lipids) that have only one chiral center, which is well defined by R/S.
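To make the order dependence concrete, here is a minimal sketch in plain Python (not CGsmiles code). It relies on the standard SMILES rule that swapping two neighbors of a stereocenter in the written order flips `@` and `@@`. The helper names `permutation_parity` and `retag` are hypothetical and exist only for this illustration.

```python
def permutation_parity(original, reordered):
    """Return 0 if `reordered` is an even permutation of `original`, 1 if odd."""
    index = {name: i for i, name in enumerate(original)}
    perm = [index[name] for name in reordered]
    # The parity of a permutation equals the parity of its inversion count.
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return inversions % 2


def retag(tag, original, reordered):
    """Return the SMILES tag ('@' or '@@') that keeps the same 3D configuration
    once the neighbors are written in a different order."""
    if permutation_parity(original, reordered):
        return "@" if tag == "@@" else "@@"
    return tag


# L-alanine as N[C@@H](C)C(=O)O: neighbors of the stereocenter in string order.
written = ["N", "H", "CH3", "COOH"]
# Same molecule written as N[C@H](C(=O)O)C: the last two neighbors are swapped.
swapped = ["N", "H", "COOH", "CH3"]

print(retag("@@", written, swapped))  # @
```

When fragments are stitched together by bonding descriptors, the user never sees the final neighbor order, so this parity cannot be tracked by hand.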

For molecules like sugars, it gets even messier. Here, the ordering of the ring bonds also matters in SMILES, but in CGsmiles we cannot guarantee that this order is the same as in the SMILES string. So simply copying and splitting a SMILES string from another source would almost certainly give the wrong answer.

Thus, CGSmiles will simply support annotation with @R, @S, and @RS. A chiral node will be for example [C@R]. That means the user needs to decide what a chiral center should be. Any interpretation of the order of the neighbors or what that means is left to the user.

Is it possible to mess up the notation and generate impossible molecules? For sure, but in my opinion that risk is far outweighed by the many simple molecules that now get the correct chirality annotation.

fgrunewald commented 5 days ago

Will be replaced by an annotation-based definition.