ideaconsult / cdk

The Chemistry Development Kit
https://cdk.github.io/
GNU Lesser General Public License v2.1
0 stars 0 forks source link

@Ignored some test cases #16

Closed uli-f closed 1 year ago

uli-f commented 1 year ago

I added @Ignore annotation to some of the test cases that fail in PR #15.

@ntk73 Once PR #15 is merged, I would appreciate it if you could have a look. Seems to have something to do with stereochemistry.

ntk73 commented 1 year ago

OK, I will check these

uli-f commented 1 year ago

These test cases are now in PR #18.

ntk73 commented 1 year ago

@uli-f , I suppose that a possible reason for failures is that RAuxInfo with the coordinates is not given. The stereo information is treated fully only on level 2D or 3D. From the original RInChI examples there are some double conversion tests with stereo info which do not fail, but they have RAuxInfo (e.g. testExample_Inverted_stereochemistry()).

BTW: I do not like this at all, but this is the approach implemented in native code - stereo handling is linked with MDL 2D/3D coordinates.

uli-f commented 1 year ago

Okay, thank you for clarifying.

If there is a clearly defined reason why a test case fails, I see two ways to resolve this:

Any thoughts on your end?

ntk73 commented 1 year ago

In order to get these tests running, somehow the RAuxInfo should be generated. Some manual work would be needed, perhaps. For example: (1) draw the molecules and generate the 2D coordinates; (2) Store the molecules as MDL files; (3) Make a RXN file with 2D coordinates for the reaction components; (4) Convert the RXN to a couple: (RInChI, RAuxInfo) and then finally and hopefully the tests should run.

uli-f commented 1 year ago

Thank you for listing the necessary steps.

I'll have a look on Friday and see if I can do this for one or two of them. If this fixes the issue we might just drop the other few test cases.

ntk73 commented 1 year ago

Few additional tips that might be useful in a practical aspect (I applied these for preparing some of the extra test units)

  1. I have used OpenBabel to generate MDL MOL file with 2D coordinates from a SMILES string with stereo information;
  2. The separate MDL MOL file text fragments were manually combined to make a single RXN file (the RXN header was made manually as well).

Also, we could use OpenBabel to convert separate InChIs (from the RInChI) to SMILES and eventually to add manually stereo in the SMILES if it is lost during the conversion

If you need some assistance in this regards I could help.

uli-f commented 1 year ago

Okay, I did this for a single test and it clearly shows that the addition of the Rauxinfo with 2D coordinates solves the issue. Would you mind merging PR #20.

I would consider this important information for the user of the RInChI functionality in CDK. The best I came up with where to put this was the package.html in the package org.openscience.cdk.rinchi.

I left you a TODO there with regard to the potential loss of stereochemical information. I'd appreciate it if you could amend any documentation there that might be useful in this context.

ntk73 commented 1 year ago

PR #20 is merged.

uli-f commented 1 year ago

I added some documentation for this to the package.html, PR #23.

I'd appreciate it if you could carry out any corrections of the documentation in package.html re the stereochemical information and add if there is something missing (tetrahedral stereochemical information?).

Once this is done we can close this issue.

ntk73 commented 1 year ago

Meanwhile, I have also added some html documentation :-). It will be needed to resolve some conflicts and then merge your commit from PR #23.

uli-f commented 1 year ago

Meanwhile, I have also added some html documentation :-). It will be needed to resolve some conflicts and then merge your commit from PR #23.

I am sorry, I was on a documentation-writing-roll yesterday! 🏃🏼

I hope it is not too much work to get this merged. There can never be too much documentation! 😄

uli-f commented 1 year ago

Given the merge of PR #23 I am happy to close this one.