qcscine / molassembler

Chemoinformatics toolkit with support for inorganic molecules
https://scine.ethz.ch/download/molassembler
BSD 3-Clause "New" or "Revised" License
33 stars 7 forks source link

Isomorphism check failure #10

Closed ViktoriiaBaib closed 1 year ago

ViktoriiaBaib commented 1 year ago

Hello! Could you explain why two graphs are drawn as the same structures but are considered different during the isomorphism check?

I generate graphs, canonicalize them and want to filter out isomorphic structures generated at different iterations. But I noticed that isomorphism check does not always remove the same structures. Thus, in this example, the structure from the first iteration:

Screen Shot 2023-02-06 at 7 41 05 PM

looks exactly like the structure from the second iteration of my algorihtm:

Screen Shot 2023-02-06 at 7 42 43 PM

But the isomorphism check says that they are different:

Screen Shot 2023-02-06 at 7 43 18 PM

Could you help with this issue? Is there something else to consider?

I attach str.zip with two pickle files for lists listone and listtwo with 6 molassembler Molecules in each; they all repeat each other but are not filtered out. To open them, you can do:

import pickle
with open("listone.pkl", "rb") as file:
  listone = pickle.load(file)
with open("listtwo.pkl", "rb") as file:
  listtwo = pickle.load(file)

Then you can try: listone[0] == listtwo[0]

moritzBens commented 1 year ago

Hi Viktoriia, the molecules differ in the local geometry at O15 (at least this is the index I see when loading your pickle files) . In the figures above this should be O12. One is assigned a linear geometry while the other is considered bent. You can check the local geometries of the atoms by hovering with your mouse over them in the figure in your notebook. If you are certain that these molecules should be the same, you should make sure that you tightly optimized the geometries. I would guess that one structure is not a minimum on the potential energy surface.

molassembler_molecules

Best, Moritz

ViktoriiaBaib commented 1 year ago

Hi Moritz,

Thank you for pointing that out! Understanding this helps a lot. However, I did not do geometry optimizations. I was only breaking and forming bonds within molassembler.

I have read two molassembler molecules from an optimized XYZ structures: one was a MOE ligand and another was Bi with other ligands. In MOE ligand, the Oxygen you talked about [O8] is bent:

Screen Shot 2023-02-09 at 2 34 17 PM

Then, within Bi-structure, I broke off 2 bonds and connected MOE ligand to it. I saved such structures in listone. So, in listone, this Oxygen [O15] is still bent:

Screen Shot 2023-02-09 at 2 41 05 PM

Next, I created a bond between Bi and that Oxygen [O24]. It became triangle:

Screen Shot 2023-02-09 at 2 43 33 PM

Then I broke the bond I just formed, and the Oxygen [O15] became line:

Screen Shot 2023-02-09 at 2 45 28 PM
  1. I performed two cancelling operations and my output differs from the input. Why is that correct?
  2. As far as I know from chemistry, when Oxygen is between 2 atoms with single bonds, it can only be bent and never linear. Why does it become line at all?
  3. Is it only for Oxygen? Or is it the default for every atom after bond breaking?
  4. Can I set this property for my molecule manually?

P.S. When I broke off the water, Oxygen there also became line.

Screen Shot 2023-02-09 at 3 03 58 PM
moritzBens commented 1 year ago

At first glance, this should not be the intended behavior. I can reproduce it and will have a look.

Thanks for reporting this issue!

moritzBens commented 1 year ago

I had a closer look at these odd shapes. By default, Molassembler tries to derive the new shapes such that they do not distort the remaining angles between bonds on the given atom too much. If this fails, it defaults to the most symmetric shape with the given number of vertices. For the transition from a triangle down, this is, of course, a line. Luckily, you can change this behavior to ignore the angle check and always perform transitions between shapes if they are unique. A triangle should always become bent if you remove one vertex. The required option is:

import scine_molassembler as masm
masm.Options.chiral_state_preservation = masm.ChiralStatePreservation.Unique

You can have a look on the different options by typing

help(masm.ChiralStatePreservation)

An alternative would be to change the angle threshold in the code: https://github.com/qcscine/molassembler/blob/5677b8b7acf929d90afa3422a2c8d114ace35fed/src/Molassembler/Stereopermutators/AtomStereopermutatorImpl.cpp#L79 For an idealized triangle, you could change it to 0.3, and it should work.

ViktoriiaBaib commented 1 year ago

Thanks a lot!

Setting Options.chiral_state_preservation = ChiralStatePreservation.Unique resolved my issue.