epam / Indigo

Universal cheminformatics toolkit, utilities and database search tools
http://lifescience.opensource.epam.com
Apache License 2.0
295 stars 100 forks source link

Molecule Structure Is Different in Smart Layout Since Fix of Ketcher issues 486 and 487 #1514

Open gdfsk2 opened 7 months ago

gdfsk2 commented 7 months ago

Environment details (please complete the following information):

Describe the bug I'm a C++ developer, we include "Indigo.dll" in our software to convert SMILES as molecule structure image, and recently want to update it from 1.2.3 to 1.14.0. We found the result image for some long molecules are different from before, and it seems the old display is better? image

Steps to Reproduce

  1. Use API indigoLayout() to convert "c1ccccccccc1", smart layout mode.

Expected behavior image

Actual behavior image

Attachments

Indigo/Bingo version
1.14.0

Additional context This issue might from Indigo 1.4.0-beta (I cannot get 1.3.0 to confirm) Compare source code with 1.2.3, the reason should be:

  1. In MoleculeLayoutGraphSmart::_assignEveryCycle(), search "_segment_smoothing_prepearing", we can find it's comment out and the reason is:

    /*
     * A patrial fix for Ketcher issues 486 and 487
     *
     * The _segment_smoothing_prepearing() contains an error which leads to
     * a crash when cycles with double bonds are being processed. Currently,
     * we skip smoothing altogether. This seems not to create any regressions
     */
    
    ...
    
    segment.clear();
    rotation_point.zerofill();
    rotation_vertex.zerofill();
    
    //   _segment_smoothing_prepearing(cycle, rotation_vertex, rotation_point, segment, layout);
  2. This caused MoleculeLayoutMacrocyclesLattice::setTargetAngle() did not executed, and MoleculeLayoutMacrocyclesLattice::_target_angle[] was all 0 as default.
  3. Then in MoleculeLayoutMacrocyclesLattice::doLayout(), preliminary_layout() will always get "inf" value as result since _target_angle[] is a denominator in the calculation. double best_rating = preliminary_layout(cl);
  4. So 100 times of best_rating comparing with current_rating after that was failed, they are always both "inf".

Call stack:

indigo.dll!indigo::MoleculeLayoutMacrocyclesLattice::preliminary_layout(indigo::MoleculeLayoutMacrocyclesLattice::CycleLayout & cl) indigo.dll!indigo::MoleculeLayoutMacrocyclesLattice::doLayout() indigo.dll!indigo::MoleculeLayoutGraphSmart::_assignEveryCycle(const indigo::MoleculeLayoutGraph::Cycle & cycle) indigo.dll!indigo::MoleculeLayoutGraphSmart::_assignRelativeCoordinates(int & fixed_component, const indigo::MoleculeLayoutGraph & supergraph) indigo.dll!indigo::MoleculeLayoutGraph::_assignComponentsRelativeCoordinates(indigo::PtrArray & bc_components, indigo::Array & fixed_components, indigo::BiconnectedDecomposer & bc_decom) indigo.dll!indigo::MoleculeLayoutGraphSmart::_assignAbsoluteCoordinates(float bond_length) indigo.dll!indigo::MoleculeLayoutGraphSmart::_layoutSingleComponent(indigo::BaseMolecule & molecule, bool respect_existing, const indigo::Filter filter, float bond_length) indigo.dll!indigo::MoleculeLayoutGraphSmart::layout(indigo::BaseMolecule & molecule, float bond_length, const indigo::Filter filter, bool respect_existing) indigo.dll!indigo::MoleculeLayout::_makeLayout() indigo.dll!indigo::MoleculeLayout::_make() indigo.dll!indigo::MoleculeLayout::make() indigo.dll!indigoLayout(int object)

Should Ketcher issues 486 and 487 be refixed in other way? Or should MoleculeLayoutMacrocyclesLattice::_target_angle[] be initialized as other values then 0? Or do you think current looking is good, no need to take care of it?

gdfsk2 commented 7 months ago

Here are some SMILES for testing: CS(=O)(=O)c1ccc(cc1)c2cc(CO)nn2C3CCS(=O)(=O)CC3 Cc1ccc(cc1)c2cc(nn2c3ccc(cc3)S(=O)(=O)N)C(F)(F)F CC\1CC(OC(=O)CC(NC(=O)C(N(C(=O)C(NC(=O)C(C/C(=C1)/C)C)C)C)CC2=C(NC3=CC=CC=C32)Br)C4=CC=C(C=C4)O)C InChI=1S/C66H75Cl2N9O24/c1-23(2)12-34(71-5)58(88)76-49-51(83)26-7-10-38(32(67)14-26)97-40-16-28-17-41(55(40)101-65-56(54(86)53(85)42(22-78)99-65)100-44-21-66(4,70)57(87)24(3)96-44)98-39-11-8-27(15-33(39)68)52(84)50-63(93)75-48(64(94)95)31-18-29(79)19-37(81)45(31)30-13-25(6-9-36(30)80)46(60(90)77-50)74-61(91)47(28)73-59(89)35(20-43(69)82)72-62(49)92/h6-11,13-19,23-24,34-35,42,44,46-54,56-57,65,71,78-81,83-87H,12,20-22,70H2,1-5H3,(H2,69,82)(H,72,92)(H,73,89)(H,74,91)(H,75,93)(H,76,88)(H,77,90)(H,94,95)/t24-,34+,35-,42+,44-,46+,47+,48-,49+,50-,51+,52+,53+,54-,56+,57+,65-,66-/m0/s1 COC1=CC2=C(OC)C=C1C#CC1=CC(C)=CC(=C1)C#CC#CC1=CC(=CC(C)=C1)C#CC1=CC(OC)=C(C=C1OC)C#CC1=CC(=CC=C1)C#CC1=C(OC)C=C(C#CC3=CC(C)=CC(=C3)C#CC#CC3=CC(=CC(C)=C3)C#CC3=CC(OC)=C(C=C3OC)C#CC3=CC=CC(=C3)C#C2)C(OC)=C1 CC1=CC2=C(O)C(C[NH2+]CCN3CC[NH2+]CCN(CC[NH2+]CC4=CC(C)=CC(C[NH2+]CCN5CC[NH2+]CCN(CC[NH2+]C2)CC5)=C4O)CC3)=C1 CCC@HC1NC(=O)C2CCCN2C(=O)C2CCCN2C(=O)C(NC(=O)C@HNC(=O)C@HNC(=O)C(NC(=O)C2CSCSCC(NC1=O)C(=O)NC@@HC(=O)N1CCCC1C(=O)NC@@HC(=O)NCC(=O)NC@@HC(=O)N2)C@@HO)C@@HCC