Open jthorton opened 3 years ago
Linking to the uber-issue on this at #146.
I think we'll ultimately want to have our own "OFF stereochemistry" model, where we have SMARTS patterns for the atoms/bonds that WE consider stereogenic, and then add some complexity to the OE/RDKit interfaces to enforce this definition of stereochemistry. Hopefully our stereogenic groups are a subset of their stereogenic groups, so it'll be an exercise in selectively ignoring some stereo in those cheminformatics toolkits.
Describe the bug
Undefined stereo errors seem to depend on the backend toolkit used to create the molecule (see the example below). I think this is mainly due to nitrogen stereochemistry being picked up or not.
This also causes an error in QCSubmit when enumerating stereochemistry. The workflow has a component which enumerates all undefined stereochemistry and checks that each molecule is valid before adding it to a dataset, by doing a round trip and looking for undefined stereo errors. In the case of this molecule, the nitrogen stereochemistry is enumerated and 2 isomers are made, but no new molecules are returned from the method: the toolkit makes sure not to return the same molecule as the input, and because nitrogen stereochemistry is no longer checked in the isomorphism check, it thinks the two new molecules are the same as the input and removes them. Then the round trip test on the input molecule fails, as OpenEye identifies missing stereochemistry. The result is that any input molecule with only undefined nitrogen stereochemistry is accidentally removed when building datasets.
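To make the chain of events above easier to follow, here is a toy model of it in plain Python. It does not use the OpenFF Toolkit, QCSubmit, OpenEye or RDKit. `ToyMol`, `is_isomorphic`, `enumerate_stereoisomers`, `strict_round_trip_ok` and `filter_for_dataset` are all hypothetical names made up for this sketch, and the fallback to the input molecule when no isomers come back is my reading of the workflow rather than its actual code.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class ToyMol:
    """Toy molecule: a graph label plus (element, atom index, label) stereo entries.

    The label is "R", "S", or None when the centre is undefined.
    """
    graph: str
    stereo: tuple


def is_isomorphic(a: ToyMol, b: ToyMol) -> bool:
    """Compare graph and stereo labels, skipping nitrogen centres (assumed behaviour)."""
    def key(mol: ToyMol):
        return mol.graph, tuple(s for s in mol.stereo if s[0] != "N")
    return key(a) == key(b)


def enumerate_stereoisomers(mol: ToyMol) -> list:
    """Assign R/S to every undefined centre, dropping results judged equal to the input."""
    undefined = [i for i, s in enumerate(mol.stereo) if s[2] is None]
    isomers = []
    for labels in product("RS", repeat=len(undefined)):
        stereo = list(mol.stereo)
        for i, label in zip(undefined, labels):
            element, index, _ = stereo[i]
            stereo[i] = (element, index, label)
        candidate = ToyMol(mol.graph, tuple(stereo))
        # The de-duplication step: nitrogen-only differences are invisible here.
        if is_isomorphic(candidate, mol):
            continue
        if any(is_isomorphic(candidate, other) for other in isomers):
            continue
        isomers.append(candidate)
    return isomers


def strict_round_trip_ok(mol: ToyMol) -> bool:
    """Stand-in for a backend that requires every centre, nitrogen included, to be defined."""
    return all(label is not None for _, _, label in mol.stereo)


def filter_for_dataset(mol: ToyMol) -> list:
    """Enumerate, fall back to the input if nothing new came back, then validate."""
    isomers = enumerate_stereoisomers(mol)
    candidates = isomers if isomers else [mol]
    return [m for m in candidates if strict_round_trip_ok(m)]


# Only an undefined nitrogen centre: both isomers are discarded, then the input fails.
amine = ToyMol("amine", (("N", 0, None),))
print(enumerate_stereoisomers(amine))
print(filter_for_dataset(amine))

# Control case with an undefined carbon centre: both isomers survive.
alcohol = ToyMol("alcohol", (("C", 1, None),))
print(len(filter_for_dataset(alcohol)))
```

In this toy version the amine vanishes from the dataset because the two steps disagree about whether nitrogen is stereogenic, while the carbon-only control passes through as two isomers.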
To Reproduce