kienerj / pycdxml

Tools to automatically convert and proccess cdx and cdxml files in python
GNU General Public License v3.0
35 stars 5 forks source link

Erroneous dashed bond shown #16

Closed baoilleach closed 1 year ago

baoilleach commented 1 year ago

Here's part of a structure where the original MOL file has a single dashed line, but the resulting CDXML has an additional dashed line to the acid OH. image

Looking for the source of the problem, I've had a quick look at the code for _set_end_wedge_display_style but I wasn't sure about the "if b.GetIdx() in bonds" as it seems it might give different results depending on the order in which the bonds are present in the molecule. Wouldn't it be simpler to defer the handling of this until after all of the bonds have been processed with a first pass, and then have a second pass looping over the bonds doing whatever is needed to tidy up?

kienerj commented 1 year ago

Can you share the original molecule or one that is causing the issue? Just wondering the error appears in all similar cases ("wedge to acid") or if the rest of the molecule matters.

In general yes I'm sure there are better ways to do it. As far as I remember the core "reasoning" was to be able to build the cdxml sequentially, create each element fully and then move to the next element vs. partial element creation and then finding the element again inside the xml and complete it. This would also mean some kind of mapping that maps from rdkit bond idx to a specific cdxml element.

The reasoning for the whole "crutch" is in the method doc string. Test files would fail to show dashed/wedge bonds without having a "WedgedHashEnd" or "WedgeEnd" set. However I can't reproduce this right now with a basic example that displays properly without a "WedgedHashEnd" .

Anyway will look into this.

baoilleach commented 1 year ago

Unfortunately, I can't share this molecule. but I'll try and generate an example. Thanks.

kienerj commented 1 year ago

Simply removing the _set_end_wedge_display_style method should fix the issue (see latest commit). I'm not sure why it was needed at some point to get the wedge displayed in ChemDraw. It's possible that removing it will break different molecules albeit on my side it works fine.

(note: also fixed that molecules were flipped vertically due to different coordinate system)

Fixed by e2fefc82a44bda17b5c91de208947185f36ecaad

baoilleach commented 1 year ago

Thanks @kienerj.

baoilleach commented 1 year ago

Just a note to say that I've generated 15 orderings of the same small structure, to see whether there was any problem and it looks fine to me. That is to say, all 15 have a problem before, and all 15 are fixed afterwards (apart from the stereo problem reported in #17). testcases.sdf.txt