kienerj / pycdxml

Tools to automatically convert and process cdx and cdxml files in python
GNU General Public License v3.0

Coordinates regenerated despite being provided #26

Closed baoilleach closed 1 year ago

baoilleach commented 1 year ago

For the structure CHEMBL488297, instead of just depicting the structure as provided, conversion from MOL to CDXML causes the coordinates to regenerate. It also inverts the stereo. CHEMBL4889297.mol.txt CHEMBL4889297.cdxml.txt

The only weird thing about this file (and another I looked at) is that the Z coordinates are all -0.0000 instead of just 0.0000. Does RDKit fail to recognise this as a 2D conformer? Not sure what's with the stereo inversion - something to do with the difference in coordinates between MOL file and CDXML?

MOL file: image CDXML: image

kienerj commented 1 year ago

I will look into it. Likely it's caused by pycdxml code, as I'm checking whether there are 3D coordinates and, if so, generating 2D ones. Possibly the -0.0 somehow triggers that 2D coordinate generation. (Possibly a floating-point accuracy issue, so the value is not exactly 0?)
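As a quick sanity check on the negative-zero question, here is a minimal sketch in plain Python (not pycdxml or RDKit code) of how a `-0.0000` field behaves once it is parsed as a float. The literal string is an assumption standing in for the z column of the MOL atom block.

```python
import math

# Assumed z field, written the way the reported MOL file writes it.
z = float("-0.0000")

# Negative zero compares equal to positive zero under IEEE 754 rules.
print(z == 0.0)

# Only the sign bit differs; copysign makes that visible.
print(math.copysign(1.0, z))

# A tolerance-based comparison treats both zeros identically.
print(abs(z) < 1e-6)
```

So a plain equality or tolerance test on the parsed value would not distinguish `-0.0000` from `0.0000`; a check that goes wrong here is more likely to depend on how the coordinates are aggregated than on floating-point precision.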

kienerj commented 1 year ago

Fixed with eade22ec0355f055a597e85d8cb4e5a2464a22c8

Indeed, as expected, the check that determines whether a 3D structure (recompute 2D) or a 2D structure (generate 2d) is present was wrong: it fails when the maximum point (atom) is exactly at 0 0 0 and all other atoms have negative coordinates.
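For illustration only, one way to make such a 2D-versus-3D decision independent of where the atoms sit relative to the origin is to look at the z values alone, with a tolerance. This is a sketch under stated assumptions, not the code from the commit above: `is_planar`, its `tol` default, and the sample coordinates are all hypothetical.

```python
from typing import Iterable, Tuple

Coord = Tuple[float, float, float]


def is_planar(coords: Iterable[Coord], tol: float = 1e-4) -> bool:
    """Return True when every z coordinate is zero within tol.

    Hypothetical helper: only z is inspected, so atoms lying at the
    origin or entirely in the negative quadrant do not affect the result.
    """
    return all(abs(z) <= tol for _, _, z in coords)


# Made-up positions mimicking the reported case: one atom at the origin,
# the others at negative x/y, and z written as -0.0.
flat = [(0.0, 0.0, -0.0), (-1.2, -0.7, -0.0), (-2.4, -1.4, -0.0)]

# Made-up positions with a genuine out-of-plane atom.
bent = [(0.0, 0.0, 0.0), (-1.2, -0.7, 0.35)]

print(is_planar(flat))  # True
print(is_planar(bent))  # False
```

With a check of this shape, the provided coordinates would be kept whenever the input is flat, and 2D coordinates would only be generated for input that has real z extent.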

However, it doesn't explain why 2D coordinate regeneration using coordgen:

 rdCoordGen.AddCoords(mol)
 mol.UpdatePropertyCache()

changes the stereochemistry. So there might be a bug in that code as well for this specific molecule?

Additionally, which bond is wedged still changes compared to the molfile, as I'm using RDKit's wedging, so this molecule looks like the following in CDXML:

image