drugdata / D3R

Drug Design Data Resource is a suite of software to enable filtering, docking, and scoring of new sequences from wwpdb.

In chimera_proteinligprep.py - Babel does not correctly generate 3D stereochemistry given a valid SMILES string #51

Closed j-wags closed 7 years ago

j-wags commented 8 years ago

Given the SMILES CC(=O)[C@@]1(O)CC[C@H]2[C@@H]3CCC4=CC(=O)CC[C@]4(C)[C@H]3CC[C@]12C, the babel-generated 3D molecules differ at two stereocenters from the Schrodinger LigPrep-generated molecule.

j-wags commented 8 years ago

Investigation notes: Running identical babel 3D-generation commands 10 times on this input SMILES generates stereochemically different mol2 files. Taking these babel-generated mol2 files (with visibly different stereochemistries, even though they were all generated from "lig_new_smiles.smi") and using babel to convert them back to SMILES yields different stereochemistry markers.
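A repeat-and-compare check like the one described here can be sketched in a few lines of Python. This is only an illustration: `distinct_results` and `fake_round_trip` are hypothetical names, and the stand-in converter below just alternates between two made-up outputs to mimic an unstable round trip. In practice `convert` would wrap whatever SMILES -> 3D -> SMILES command is being tested.

```python
from collections import Counter
from itertools import cycle


def distinct_results(convert, smiles, runs=10):
    """Call convert(smiles) `runs` times and count each distinct output."""
    return Counter(convert(smiles) for _ in range(runs))


# Stand-in converter for illustration only (not real babel output):
# it alternates between two stereo assignments on every call.
_fake_outputs = cycle(["C[C@H](N)O", "C[C@@H](N)O"])


def fake_round_trip(smiles):
    return next(_fake_outputs)


counts = distinct_results(fake_round_trip, "C[C@H](N)O", runs=10)
# More than one key means the round trip is not reproducible.
print(len(counts) > 1, dict(counts))
```

A deterministic converter would give a single key with a count equal to `runs`.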

Babel knows that it's messing up:

j5wagner@jaws$ babel -ismi lig_new_smiles.smi -omol2 xx.mol2 --gen3D
==============================
*** Open Babel Warning  in CorrectStereoAtoms
  Could not correct 2 stereocenter(s) in this molecule ()
  with Atom Ids as follows: 7 9
1 molecule converted
1 warnings 353 audit log messages 1 debugging messages 
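
Since the warning names the atoms it could not fix, a prep script could catch it instead of silently accepting the output. Below is a minimal sketch that pulls the atom IDs out of captured log text with a regular expression. The function name is hypothetical, and the pattern is based only on the message layout shown above, so it may need adjusting for other Open Babel versions.

```python
import re

# Pattern written against the warning text shown in this issue; the exact
# wording in other Open Babel versions is an assumption.
_WARNING = re.compile(
    r"Could not correct (\d+) stereocenter\(s\).*?Atom Ids as follows:([\d ]+)",
    re.DOTALL,
)


def unresolved_stereo_atoms(log_text):
    """Return the atom IDs listed in CorrectStereoAtoms warnings."""
    ids = []
    for count, id_field in _WARNING.findall(log_text):
        found = [int(tok) for tok in id_field.split()]
        # The count in the message should agree with the IDs listed.
        if len(found) != int(count):
            raise ValueError("warning count does not match listed atom IDs")
        ids.extend(found)
    return ids


log = """\
*** Open Babel Warning  in CorrectStereoAtoms
  Could not correct 2 stereocenter(s) in this molecule ()
  with Atom Ids as follows: 7 9
1 molecule converted
"""
print(unresolved_stereo_atoms(log))  # [7, 9]
```

An empty list means no such warning was found in the log text.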

Taking a maestro-generated mol2 (ligprepped from "lig_new_smiles.smi", in which the 3d structure has proper stereochemistry) and converting it to smiles using babel yields the same smiles as we had originally. Therefore babel is capable of working with a molecule with the desired stereochemistry.

Letting maestro generate many different stereochemistries for undefined centers during prep only yields one molecule. This means that maestro can read the SMILES string without any ambiguity. Thus I think that all stereocenters are defined/properly formatted in the "lig_new_smiles.smi" file.

Having babel convert the SMILES to an InChI and then to mol2 doesn't help.

Having babel convert the SMILES to a 2D mol2 (which, I note, does not contain any info about chirality) and then to a 3D mol2 doesn't help.

Therefore, I'm concluding that babel knows it's messing up, and can recognize when the proper stereochemistry is in a given structure, but is incapable of generating 3D coordinates with the proper stereochemistry. For that reason I've made a RDKit coordinate generation script and am attempting to replace babel with that.

j-wags commented 7 years ago

Converted final generated ligands from RDKit/Chimera prep and Maestro Ligprep to SMILES, then compared to the original pdb pre-release SMILES. I verified that this conversion to SMILES yields the same result whether I use babel or schrodinger structconvert. Since chimera and ligprep can choose to charge/not charge resulting ligands, I removed charge using an RDKit SMARTS replacement pattern. Finally, I converted all SMILES to canonical/isomeric form and compared them to the original (from the pdb pre-release). I've found that:

5a5d - both fail - no cis/trans markers in original smiles
5c5a - schrodinger workflow disagrees at 1 chiral center
4ym9 - both fail - no cis/trans markers in original smiles
5h9r - schrodinger workflow disagrees at many, but not all, chiral centers
5bzo - both fail - no cis/trans markers in original smiles
5h9p - schrodinger workflow disagrees at many, but not all, chiral centers
5kix - schrodinger workflow disagrees at 1 chiral center
5f9b - both fail - molecule with 10 chiral centers and numerous ring closures
4zk6 - chimera/rdkit workflow makes a hydroxyl radical (?) and crashes my analysis script
1fcz - both fail - no cis/trans markers in original smiles

So basically there are 6 real failed cases - 1 both, 1 just chimera/rdkit, and 4 just schrodinger
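
To put a number on "disagrees at N chiral centers", one rough approach is to compare the `@`/`@@` tags of two canonical SMILES position by position. The sketch below uses only the standard library and hypothetical function names. It assumes both strings were canonicalized by the same tool and list atoms in the same order; it checks this by comparing the strings with the tags stripped, and refuses to compare otherwise (as would happen for the 5f9b pair, whose two canonical strings are ordered differently). The inputs are the 5kix canonical strings from the full output.

```python
import re

# Matches a bracket atom and captures its chirality tag ("@" or "@@").
_CHIRAL_ATOM = re.compile(r"\[[^\]@]*(@{1,2})[^\]]*\]")


def skeleton(smiles):
    """SMILES with chirality tags removed, used to check atom ordering."""
    return re.sub(r"@+", "", smiles)


def count_chiral_mismatches(a, b):
    """Return (mismatched centers, total centers) for two same-order SMILES."""
    if skeleton(a) != skeleton(b):
        raise ValueError("atom ordering differs; tag-by-tag comparison is not valid")
    tags_a = _CHIRAL_ATOM.findall(a)
    tags_b = _CHIRAL_ATOM.findall(b)
    return sum(x != y for x, y in zip(tags_a, tags_b)), len(tags_a)


chi_5kix = "Cc1cn([C@H]2C[C@H](OP(=O)(O)O)[C@@H](COP(=O)(O)O)O2)c(=O)nc1O"
mae_5kix = "Cc1cn([C@H]2C[C@@H](OP(=O)(O)O)[C@@H](COP(=O)(O)O)O2)c(=O)nc1O"
print(count_chiral_mismatches(chi_5kix, mae_5kix))  # (1, 3)
```

This only looks at tetrahedral tags; cis/trans markers (`/` and `\`) would need separate handling.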

The next inquiry on this ticket should look into...

=============================
SUMMARY
=============================
CHIMERA MATCHES: 24
MAESTRO MATCHES: 20
MOLECULES ATTEMPTED: 29
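
The tallies in this summary can be cross-checked against the per-ligand verdict lines of the full output. Here is a minimal sketch, assuming the report layout used in this thread (an ID line, then SMILES lines, then `CHIMERA ...` / `MAESTRO ...` verdicts, then a line of `=` characters). The function name is hypothetical and the sample text is a trimmed excerpt, not the whole report.

```python
import re


def failures(report):
    """Map each workflow to the IDs whose verdict line says DIFFERS."""
    out = {"CHIMERA": [], "MAESTRO": []}
    # Entries are separated by lines made up only of "=" characters.
    for block in re.split(r"^=+[ \t]*$", report, flags=re.MULTILINE):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        if not lines:
            continue
        entry_id = lines[0]  # first non-blank line of an entry is its ID
        for ln in lines[1:]:
            m = re.fullmatch(r"(CHIMERA|MAESTRO) DIFFERS", ln)
            if m:
                out[m.group(1)].append(entry_id)
    return out


sample = """\
5c5a

CHIMERA MATCHES
MAESTRO DIFFERS
==============================

5bzl

CHIMERA MATCHES
MAESTRO MATCHES
==============================

4ym9

CHIMERA DIFFERS
MAESTRO DIFFERS
==============================
"""
print(failures(sample))
```

Entries that were skipped with an error (such as 4zk6) carry no verdict line, so they appear in neither list and would have to be counted separately.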

Full output follows. Generated on nif1 by: /var/home/j5wagner/2016_06_30_bakerLabOHPTest/week26Test/stage.5.chimeraproteinligprep/compareToMaestro.py

5a5d

CHI: c1ccc(O)c(O)c1/C(O)=N/CCCCC\N=C(O)\c2c(O)c(O)ccc2 lig_5LC_unprep_step2.mol2
MAE: c1ccc(O)c(O)c1/C(O)=N\CCCCC/N=C(O)\c2c(O)c(O)ccc2

UNCHARGED
CHI: c1ccc(O)c(O)c1/C(O)=N/CCCCC\N=C(O)\c2c(O)c(O)ccc2
MAE: c1ccc(O)c(O)c1/C(O)=N\CCCCC/N=C(O)\c2c(O)c(O)ccc2

UNCHARGED CANONICAL/ISOMERIC
CHI: O/C(=N\CCCCC/N=C(\O)c1cccc(O)c1O)c1cccc(O)c1O
MAE: O/C(=N/CCCCC/N=C(/O)c1cccc(O)c1O)c1cccc(O)c1O

ORG: OC1=CC=CC(C(O)=NCCCCCN=C(O)C2=C(O)C(O)=CC=C2)=C1O
CAN: OC(=NCCCCCN=C(O)c1cccc(O)c1O)c1cccc(O)c1O

CHIMERA DIFFERS
MAESTRO DIFFERS
==============================

5c5a

CHI: OC1=NCCN(C1)C(=O)N2[C@H](c3ccc(Cl)cc3)[C@H](c4ccc(Cl)cc4)N=C2c5c(OC(C)C)cc(cc5)OC lig_NUT_unprep_step2.mol2
MAE: OC1=NCCN(C1)C(=O)N2[C@H](c3ccc(Cl)cc3)[C@@H](c4ccc(Cl)cc4)N=C2c5c(OC(C)C)cc(cc5)OC

UNCHARGED
CHI: OC1=NCCN(C1)C(=O)N2[C@H](c3ccc(Cl)cc3)[C@H](c4ccc(Cl)cc4)N=C2c5c(OC(C)C)cc(cc5)OC
MAE: OC1=NCCN(C1)C(=O)N2[C@H](c3ccc(Cl)cc3)[C@@H](c4ccc(Cl)cc4)N=C2c5c(OC(C)C)cc(cc5)OC

UNCHARGED CANONICAL/ISOMERIC
CHI: COc1ccc(C2=N[C@@H](c3ccc(Cl)cc3)[C@@H](c3ccc(Cl)cc3)N2C(=O)N2CCN=C(O)C2)c(OC(C)C)c1
MAE: COc1ccc(C2=N[C@H](c3ccc(Cl)cc3)[C@@H](c3ccc(Cl)cc3)N2C(=O)N2CCN=C(O)C2)c(OC(C)C)c1

ORG: [H][C@@]1(C2=CC=C(Cl)C=C2)N=C(C2=C(OC(C)C)C=C(OC)C=C2)N(C(=O)N2CCN=C(O)C2)[C@]1([H])C1=CC=C(Cl)C=C1
CAN: COc1ccc(C2=N[C@@H](c3ccc(Cl)cc3)[C@@H](c3ccc(Cl)cc3)N2C(=O)N2CCN=C(O)C2)c(OC(C)C)c1

CHIMERA MATCHES
MAESTRO DIFFERS
==============================

4ym9

CHI: CCC(CC)(CO)/C(O)=N/c(cn1)ccc1C lig_4E4_unprep_step2.mol2
MAE: CCC(CC)(CO)/C(O)=N\c(cn1)ccc1C

UNCHARGED
CHI: CCC(CC)(CO)/C(O)=N/c(cn1)ccc1C
MAE: CCC(CC)(CO)/C(O)=N\c(cn1)ccc1C

UNCHARGED CANONICAL/ISOMERIC
CHI: CCC(CC)(CO)/C(O)=N/c1ccc(C)nc1
MAE: CCC(CC)(CO)/C(O)=N\c1ccc(C)nc1

ORG: CCC(CC)(CO)C(O)=NC1=CN=C(C)C=C1
CAN: CCC(CC)(CO)C(O)=Nc1ccc(C)nc1

CHIMERA DIFFERS
MAESTRO DIFFERS
==============================

5h9r

CHI: c1ccc(F)cc1-c2cn(nn2)[C@@H]([C@@H](O)[C@H]3CO)[C@@H](O)[C@@H](O3)S[C@H](O4)[C@H](O)[C@@H](O)[C@@H](O)[C@H]4CO lig_TGZ_unprep_step2.mol2
MAE: c1ccc(F)cc1-c2cn(nn2)[C@H]([C@@H](O)[C@H]3CO)[C@@H](O)[C@H](O3)S[C@@H](O4)[C@H](O)[C@H](O)[C@@H](O)[C@H]4CO

UNCHARGED
CHI: c1ccc(F)cc1-c2cn(nn2)[C@@H]([C@@H](O)[C@H]3CO)[C@@H](O)[C@@H](O3)S[C@H](O4)[C@H](O)[C@@H](O)[C@@H](O)[C@H]4CO
MAE: c1ccc(F)cc1-c2cn(nn2)[C@H]([C@@H](O)[C@H]3CO)[C@@H](O)[C@H](O3)S[C@@H](O4)[C@H](O)[C@H](O)[C@@H](O)[C@H]4CO

UNCHARGED CANONICAL/ISOMERIC
CHI: OC[C@H]1O[C@@H](S[C@@H]2O[C@H](CO)[C@H](O)[C@H](n3cc(-c4cccc(F)c4)nn3)[C@H]2O)[C@H](O)[C@@H](O)[C@H]1O
MAE: OC[C@H]1O[C@H](S[C@H]2O[C@H](CO)[C@H](O)[C@@H](n3cc(-c4cccc(F)c4)nn3)[C@H]2O)[C@H](O)[C@H](O)[C@H]1O

ORG: [H][C@@]1(S[C@]2([H])O[C@]([H])(CO)[C@]([H])(O)[C@]([H])(N3C=C(C4=CC(F)=CC=C4)N=N3)[C@@]2([H])O)O[C@]([H])(CO)[C@]([H])(O)[C@]([H])(O)[C@@]1([H])O
CAN: OC[C@H]1O[C@@H](S[C@@H]2O[C@H](CO)[C@H](O)[C@H](n3cc(-c4cccc(F)c4)nn3)[C@H]2O)[C@H](O)[C@@H](O)[C@H]1O

CHIMERA MATCHES
MAESTRO DIFFERS
==============================

5bzl

CHI: COC(=O)CC[N@H+](C1)CCc(c12)cccc2N lig_4WO_unprep_step2.mol2
MAE: COC(=O)CCN(C1)CCc(c12)cccc2N

UNCHARGED
CHI: COC(=O)CCN1CCc2cccc(N)c2C1
MAE: COC(=O)CCN(C1)CCc(c12)cccc2N

UNCHARGED CANONICAL/ISOMERIC
CHI: COC(=O)CCN1CCc2cccc(N)c2C1
MAE: COC(=O)CCN1CCc2cccc(N)c2C1

ORG: COC(=O)CCN1CCC2=C(C1)C(N)=CC=C2
CAN: COC(=O)CCN1CCc2cccc(N)c2C1

CHIMERA MATCHES
MAESTRO MATCHES
==============================

5bzi

CHI: C1[NH2+]CCc(c12)cccc2N lig_4WU_unprep_step2.mol2
MAE: C1NCCc(c12)cccc2N

UNCHARGED
CHI: Nc1cccc2c1CNCC2
MAE: C1NCCc(c12)cccc2N

UNCHARGED CANONICAL/ISOMERIC
CHI: Nc1cccc2c1CNCC2
MAE: Nc1cccc2c1CNCC2

ORG: NC1=CC=CC2=C1CNCC2
CAN: Nc1cccc2c1CNCC2

CHIMERA MATCHES
MAESTRO MATCHES
==============================

5bzk

CHI: COC(=O)CC[N@H+](CC1)Cc(c12)cccc2 lig_4WP_unprep_step2.mol2
MAE: COC(=O)CCN(CC1)Cc(c12)cccc2

UNCHARGED
CHI: COC(=O)CCN1CCc2ccccc2C1
MAE: COC(=O)CCN(CC1)Cc(c12)cccc2

UNCHARGED CANONICAL/ISOMERIC
CHI: COC(=O)CCN1CCc2ccccc2C1
MAE: COC(=O)CCN1CCc2ccccc2C1

ORG: COC(=O)CCN1CCC2=CC=CC=C2C1
CAN: COC(=O)CCN1CCc2ccccc2C1

CHIMERA MATCHES
MAESTRO MATCHES
==============================

5bzt

CHI: COCC(COC)[N@H+](C1)CCc(c12)cccc2N lig_4XJ_unprep_step2.mol2
MAE: COCC(COC)N(C1)CCc(c12)cccc2N

UNCHARGED
CHI: COCC(COC)N1CCc2cccc(N)c2C1
MAE: COCC(COC)N(C1)CCc(c12)cccc2N

UNCHARGED CANONICAL/ISOMERIC
CHI: COCC(COC)N1CCc2cccc(N)c2C1
MAE: COCC(COC)N1CCc2cccc(N)c2C1

ORG: COCC(COC)N1CCC2=C(C1)C(N)=CC=C2
CAN: COCC(COC)N1CCc2cccc(N)c2C1

CHIMERA MATCHES
MAESTRO MATCHES
==============================

5hvt

CHI: COc(cc1)ccc1N(C2)C(=O)Oc(c23)cc(O)cc3 lig_NVS_unprep_step2.mol2
MAE: COc(cc1)ccc1N(C2)C(=O)Oc(c23)cc(O)cc3

UNCHARGED
CHI: COc(cc1)ccc1N(C2)C(=O)Oc(c23)cc(O)cc3
MAE: COc(cc1)ccc1N(C2)C(=O)Oc(c23)cc(O)cc3

UNCHARGED CANONICAL/ISOMERIC
CHI: COc1ccc(N2Cc3ccc(O)cc3OC2=O)cc1
MAE: COc1ccc(N2Cc3ccc(O)cc3OC2=O)cc1

ORG: COC1=CC=C(N2CC3=C(C=C(O)C=C3)OC2=O)C=C1
CAN: COc1ccc(N2Cc3ccc(O)cc3OC2=O)cc1

CHIMERA MATCHES
MAESTRO MATCHES
==============================

5bzo

CHI: C\N=C(O)\CC[N@@H+](C1)CCc(c12)cccc2N lig_4XC_unprep_step2.mol2
MAE: C/N=C(O)\CCN(C1)CCc(c12)cccc2N

UNCHARGED
CHI: C/N=C(\O)CCN1CCc2cccc(N)c2C1
MAE: C/N=C(O)\CCN(C1)CCc(c12)cccc2N

UNCHARGED CANONICAL/ISOMERIC
CHI: C/N=C(\O)CCN1CCc2cccc(N)c2C1
MAE: C/N=C(/O)CCN1CCc2cccc(N)c2C1

ORG: CN=C(O)CCN1CCC2=C(C1)C(N)=CC=C2
CAN: CN=C(O)CCN1CCc2cccc(N)c2C1

CHIMERA DIFFERS
MAESTRO DIFFERS
==============================

5bzs

CHI: C1COCCC1OCCC[N@@H+](CC2)Cc(c23)cccc3N lig_4XK_unprep_step2.mol2
MAE: C1COCCC1OCCCN(CC2)Cc(c23)cccc3N

UNCHARGED
CHI: Nc1cccc2c1CCN(CCCOC1CCOCC1)C2
MAE: C1COCCC1OCCCN(CC2)Cc(c23)cccc3N

UNCHARGED CANONICAL/ISOMERIC
CHI: Nc1cccc2c1CCN(CCCOC1CCOCC1)C2
MAE: Nc1cccc2c1CCN(CCCOC1CCOCC1)C2

ORG: NC1=CC=CC2=C1CCN(CCCOC1CCOCC1)C2
CAN: Nc1cccc2c1CCN(CCCOC1CCOCC1)C2

CHIMERA MATCHES
MAESTRO MATCHES
==============================

5h9p

CHI: c1ccc(F)cc1-c2cn(nn2)[C@@H]([C@@H](O)[C@H]3CO)[C@@H](O)[C@@H](O3)S[C@H](O4)[C@H](O)[C@H]([C@@H](O)[C@H]4CO)n(nn5)cc5-c6cc(F)ccc6 lig_TD2_unprep_step2.mol2
MAE: c1ccc(F)cc1-c2cn(nn2)[C@H]([C@@H](O)[C@H]3CO)[C@@H](O)[C@H](O3)S[C@@H](O4)[C@H](O)[C@@H]([C@@H](O)[C@H]4CO)n(nn5)cc5-c6cc(F)ccc6

UNCHARGED
CHI: c1ccc(F)cc1-c2cn(nn2)[C@@H]([C@@H](O)[C@H]3CO)[C@@H](O)[C@@H](O3)S[C@H](O4)[C@H](O)[C@H]([C@@H](O)[C@H]4CO)n(nn5)cc5-c6cc(F)ccc6
MAE: c1ccc(F)cc1-c2cn(nn2)[C@H]([C@@H](O)[C@H]3CO)[C@@H](O)[C@H](O3)S[C@@H](O4)[C@H](O)[C@@H]([C@@H](O)[C@H]4CO)n(nn5)cc5-c6cc(F)ccc6

UNCHARGED CANONICAL/ISOMERIC
CHI: OC[C@H]1O[C@@H](S[C@@H]2O[C@H](CO)[C@H](O)[C@H](n3cc(-c4cccc(F)c4)nn3)[C@H]2O)[C@H](O)[C@@H](n2cc(-c3cccc(F)c3)nn2)[C@H]1O
MAE: OC[C@H]1O[C@H](S[C@H]2O[C@H](CO)[C@H](O)[C@@H](n3cc(-c4cccc(F)c4)nn3)[C@H]2O)[C@H](O)[C@H](n2cc(-c3cccc(F)c3)nn2)[C@H]1O

ORG: [H][C@@]1(S[C@]2([H])O[C@]([H])(CO)[C@]([H])(O)[C@]([H])(N3C=C(C4=CC(F)=CC=C4)N=N3)[C@@]2([H])O)O[C@]([H])(CO)[C@]([H])(O)[C@]([H])(N2C=C(C3=CC(F)=CC=C3)N=N2)[C@@]1([H])O
CAN: OC[C@H]1O[C@@H](S[C@@H]2O[C@H](CO)[C@H](O)[C@H](n3cc(-c4cccc(F)c4)nn3)[C@H]2O)[C@H](O)[C@@H](n2cc(-c3cccc(F)c3)nn2)[C@H]1O

CHIMERA MATCHES
MAESTRO DIFFERS
==============================

5ii2

CHI: Oc1cc(O)cc(c12)oc(cc2=O)-c3cc(O)c(O)cc3 lig_LU2_unprep_step2.mol2
MAE: Oc1cc(O)cc(c12)oc(cc2=O)-c3cc(O)c(O)cc3

UNCHARGED
CHI: Oc1cc(O)cc(c12)oc(cc2=O)-c3cc(O)c(O)cc3
MAE: Oc1cc(O)cc(c12)oc(cc2=O)-c3cc(O)c(O)cc3

UNCHARGED CANONICAL/ISOMERIC
CHI: O=c1cc(-c2ccc(O)c(O)c2)oc2cc(O)cc(O)c12
MAE: O=c1cc(-c2ccc(O)c(O)c2)oc2cc(O)cc(O)c12

ORG: O=C1C=C(C2=CC(O)=C(O)C=C2)OC2=CC(O)=CC(O)=C12
CAN: O=c1cc(-c2ccc(O)c(O)c2)oc2cc(O)cc(O)c12

CHIMERA MATCHES
MAESTRO MATCHES
==============================

5kcv

CHI: C1CCC1([NH3+])c2ccc(cc2)-n(c(n3)-c4c(N)nccc4)c(c35)nc(cc5)-c6ccccc6 lig_6S1_unprep_step2.mol2
MAE: C1CCC1(N)c2ccc(cc2)-n(c(n3)-c4c(N)nccc4)c(c35)nc(cc5)-c6ccccc6

UNCHARGED
CHI: Nc1ncccc1-c1nc2ccc(-c3ccccc3)nc2n1-c1ccc(C2(N)CCC2)cc1
MAE: C1CCC1(N)c2ccc(cc2)-n(c(n3)-c4c(N)nccc4)c(c35)nc(cc5)-c6ccccc6

UNCHARGED CANONICAL/ISOMERIC
CHI: Nc1ncccc1-c1nc2ccc(-c3ccccc3)nc2n1-c1ccc(C2(N)CCC2)cc1
MAE: Nc1ncccc1-c1nc2ccc(-c3ccccc3)nc2n1-c1ccc(C2(N)CCC2)cc1

ORG: NC1=C(C2=NC3=C(N=C(C4=CC=CC=C4)C=C3)N2C2=CC=C(C3(N)CCC3)C=C2)C=CC=N1
CAN: Nc1ncccc1-c1nc2ccc(-c3ccccc3)nc2n1-c1ccc(C2(N)CCC2)cc1

CHIMERA MATCHES
MAESTRO MATCHES
==============================

5bzm

CHI: Nc1ccnc(c1C)C[N@H+](C2)CCc(c23)cccc3N lig_4X8_unprep_step2.mol2
MAE: Nc1ccnc(c1C)CN(C2)CCc(c23)cccc3N

UNCHARGED
CHI: Cc1c(N)ccnc1CN1CCc2cccc(N)c2C1
MAE: Nc1ccnc(c1C)CN(C2)CCc(c23)cccc3N

UNCHARGED CANONICAL/ISOMERIC
CHI: Cc1c(N)ccnc1CN1CCc2cccc(N)c2C1
MAE: Cc1c(N)ccnc1CN1CCc2cccc(N)c2C1

ORG: CC1=C(N)C=CN=C1CN1CCC2=C(C1)C(N)=CC=C2
CAN: Cc1c(N)ccnc1CN1CCc2cccc(N)c2C1

CHIMERA MATCHES
MAESTRO MATCHES
==============================

5bzp

CHI: C[NH+](C)CCC[N@@H+](C1)CCc(c12)cccc2N lig_4XG_unprep_step2.mol2
MAE: CN(C)CCCN(C1)CCc(c12)cccc2N

UNCHARGED
CHI: CN(C)CCCN1CCc2cccc(N)c2C1
MAE: CN(C)CCCN(C1)CCc(c12)cccc2N

UNCHARGED CANONICAL/ISOMERIC
CHI: CN(C)CCCN1CCc2cccc(N)c2C1
MAE: CN(C)CCCN1CCc2cccc(N)c2C1

ORG: CN(C)CCCN1CCC2=C(C1)C(N)=CC=C2
CAN: CN(C)CCCN1CCc2cccc(N)c2C1

CHIMERA MATCHES
MAESTRO MATCHES
==============================

5kix

CHI: O=P(O)(O)OC[C@@H]1[C@@H](OP(=O)(O)O)C[C@@H](O1)n(c2)c(=O)nc(O)c2C lig_THP_unprep_step2.mol2
MAE: O=P(O)(O)OC[C@@H]1[C@H](OP(=O)(O)O)C[C@@H](O1)n(c2)c(=O)nc(O)c2C

UNCHARGED
CHI: O=P(O)(O)OC[C@@H]1[C@@H](OP(=O)(O)O)C[C@@H](O1)n(c2)c(=O)nc(O)c2C
MAE: O=P(O)(O)OC[C@@H]1[C@H](OP(=O)(O)O)C[C@@H](O1)n(c2)c(=O)nc(O)c2C

UNCHARGED CANONICAL/ISOMERIC
CHI: Cc1cn([C@H]2C[C@H](OP(=O)(O)O)[C@@H](COP(=O)(O)O)O2)c(=O)nc1O
MAE: Cc1cn([C@H]2C[C@@H](OP(=O)(O)O)[C@@H](COP(=O)(O)O)O2)c(=O)nc1O

ORG: [H][C@]1(OP(=O)(O)O)C[C@]([H])(N2C=C(C)C(O)=NC2=O)O[C@]1([H])COP(=O)(O)O
CAN: Cc1cn([C@H]2C[C@H](OP(=O)(O)O)[C@@H](COP(=O)(O)O)O2)c(=O)nc1O

CHIMERA MATCHES
MAESTRO DIFFERS
==============================

5bzr

CHI: C1COCCC1OCCC[N@H+](C2)CCc(c23)cccc3N lig_4XM_unprep_step2.mol2
MAE: C1COCCC1OCCCN(C2)CCc(c23)cccc3N

UNCHARGED
CHI: Nc1cccc2c1CN(CCCOC1CCOCC1)CC2
MAE: C1COCCC1OCCCN(C2)CCc(c23)cccc3N

UNCHARGED CANONICAL/ISOMERIC
CHI: Nc1cccc2c1CN(CCCOC1CCOCC1)CC2
MAE: Nc1cccc2c1CN(CCCOC1CCOCC1)CC2

ORG: NC1=CC=CC2=C1CN(CCCOC1CCOCC1)CC2
CAN: Nc1cccc2c1CN(CCCOC1CCOCC1)CC2

CHIMERA MATCHES
MAESTRO MATCHES
==============================

5bzf

CHI: CC[N@@H+](CC1)Cc(c12)cccc2N lig_4X3_unprep_step2.mol2
MAE: CCN(CC1)Cc(c12)cccc2N

UNCHARGED
CHI: CCN1CCc2c(N)cccc2C1
MAE: CCN(CC1)Cc(c12)cccc2N

UNCHARGED CANONICAL/ISOMERIC
CHI: CCN1CCc2c(N)cccc2C1
MAE: CCN1CCc2c(N)cccc2C1

ORG: CCN1CCC2=C(C=CC=C2N)C1
CAN: CCN1CCc2c(N)cccc2C1

CHIMERA MATCHES
MAESTRO MATCHES
==============================

5bzj

CHI: Cn1ccnc1CC[N@@H+](C2)CCc(c23)cccc3N lig_4WN_unprep_step2.mol2
MAE: Cn1ccnc1CCN(C2)CCc(c23)cccc3N

UNCHARGED
CHI: Cn1ccnc1CCN1CCc2cccc(N)c2C1
MAE: Cn1ccnc1CCN(C2)CCc(c23)cccc3N

UNCHARGED CANONICAL/ISOMERIC
CHI: Cn1ccnc1CCN1CCc2cccc(N)c2C1
MAE: Cn1ccnc1CCN1CCc2cccc(N)c2C1

ORG: CN1C=CN=C1CCN1CCC2=C(C1)C(N)=CC=C2
CAN: Cn1ccnc1CCN1CCc2cccc(N)c2C1

CHIMERA MATCHES
MAESTRO MATCHES
==============================

5bzh

CHI: [NH3+]CCC[N@@H+](CC1)Cc(c12)cccc2 lig_4X1_unprep_step2.mol2
MAE: NCCCN(CC1)Cc(c12)cccc2

UNCHARGED
CHI: NCCCN1CCc2ccccc2C1
MAE: NCCCN(CC1)Cc(c12)cccc2

UNCHARGED CANONICAL/ISOMERIC
CHI: NCCCN1CCc2ccccc2C1
MAE: NCCCN1CCc2ccccc2C1

ORG: NCCCN1CCC2=CC=CC=C2C1
CAN: NCCCN1CCc2ccccc2C1

CHIMERA MATCHES
MAESTRO MATCHES
==============================

5bzc

CHI: CC1(C)C[NH2+]Cc(c12)cccc2 lig_4WT_unprep_step2.mol2
MAE: CC1(C)CNCc(c12)cccc2

UNCHARGED
CHI: CC1(C)CNCc2ccccc21
MAE: CC1(C)CNCc(c12)cccc2

UNCHARGED CANONICAL/ISOMERIC
CHI: CC1(C)CNCc2ccccc21
MAE: CC1(C)CNCc2ccccc21

ORG: CC1(C)CNCC2=CC=CC=C21
CAN: CC1(C)CNCc2ccccc21

CHIMERA MATCHES
MAESTRO MATCHES
==============================

5f9b

CHI: C1CC(C)(C)C[C@H]([C@]12C(=O)O)C=3[C@@](C)(C[C@H]2O)[C@]4(C)[C@H](CC3)[C@]5(C)[C@@H](CC4)[C@](C)(CO)[C@@H](O)CC5 lig_5VN_unprep_step2.mol2
MAE: C1CC(C)(C)C[C@@H]([C@]12C(=O)O)C=3[C@](C)(C[C@H]2O)[C@@]4(C)[C@H](CC3)[C@]5(C)[C@@H](CC4)[C@](C)(CO)[C@H](O)CC5

UNCHARGED
CHI: C1CC(C)(C)C[C@H]([C@]12C(=O)O)C=3[C@@](C)(C[C@H]2O)[C@]4(C)[C@H](CC3)[C@]5(C)[C@@H](CC4)[C@](C)(CO)[C@@H](O)CC5
MAE: C1CC(C)(C)C[C@@H]([C@]12C(=O)O)C=3[C@](C)(C[C@H]2O)[C@@]4(C)[C@H](CC3)[C@]5(C)[C@@H](CC4)[C@](C)(CO)[C@H](O)CC5

UNCHARGED CANONICAL/ISOMERIC
CHI: CC1(C)CC[C@@]2(C(=O)O)[C@@H](C1)C1=CC[C@H]3[C@](C)(CC[C@H]4[C@](C)(CO)[C@@H](O)CC[C@]34C)[C@]1(C)C[C@H]2O
MAE: CC1(C)CC[C@]2(C(=O)O)[C@H](O)C[C@@]3(C)C(=CC[C@@H]4[C@@]5(C)CC[C@@H](O)[C@@](C)(CO)[C@@H]5CC[C@]43C)[C@H]2C1

ORG: [H][C@]1(O)CC[C@@]2(C)[C@@]([H])(CC[C@@]3(C)[C@]4(C)C[C@@]([H])(O)[C@@]5(C(=O)O)CCC(C)(C)C[C@@]5([H])C4=CC[C@@]32[H])[C@]1(C)CO
CAN: CC1(C)CC[C@@]2(C(=O)O)[C@@H](C1)C1=CC[C@@H]3[C@@]4(C)CC[C@H](O)[C@@](C)(CO)[C@@H]4CC[C@@]3(C)[C@]1(C)C[C@H]2O

CHIMERA DIFFERS
MAESTRO DIFFERS
==============================

5iid

CHI: Oc1cccc(c12)oc(cc2=O)-c3cc(O)c(O)cc3 lig_6BK_unprep_step2.mol2
MAE: Oc1cccc(c12)oc(cc2=O)-c3cc(O)c(O)cc3

UNCHARGED
CHI: Oc1cccc(c12)oc(cc2=O)-c3cc(O)c(O)cc3
MAE: Oc1cccc(c12)oc(cc2=O)-c3cc(O)c(O)cc3

UNCHARGED CANONICAL/ISOMERIC
CHI: O=c1cc(-c2ccc(O)c(O)c2)oc2cccc(O)c12
MAE: O=c1cc(-c2ccc(O)c(O)c2)oc2cccc(O)c12

ORG: O=C1C=C(C2=CC(O)=C(O)C=C2)OC2=CC=CC(O)=C12
CAN: O=c1cc(-c2ccc(O)c(O)c2)oc2cccc(O)c12

CHIMERA MATCHES
MAESTRO MATCHES
==============================

4zk6

CHI: O=C(O)c1c(C(=O)[OH-])nccc1 lig_NTM_unprep_step2.mol2
MAE: O=C(O)c1c(C(=O)O)nccc1

UNCHARGED
[23:35:43] Explicit valence for atom # 7 O, 3, is greater than permitted

ERROR. SKIPPING MOL

=============================

1fcz

CHI: O=C(O)c1ccc(cc1)\C=C\C(=O)c(c2)ccc(c23)C(C)(C)CCC3(C)C lig_156_unprep_step2.mol2
MAE: O=C(O)c1ccc(cc1)\C=C\C(=O)c(c2)ccc(c23)C(C)(C)CCC3(C)C

UNCHARGED
CHI: O=C(O)c1ccc(cc1)\C=C\C(=O)c(c2)ccc(c23)C(C)(C)CCC3(C)C
MAE: O=C(O)c1ccc(cc1)\C=C\C(=O)c(c2)ccc(c23)C(C)(C)CCC3(C)C

UNCHARGED CANONICAL/ISOMERIC
CHI: CC1(C)CCC(C)(C)c2cc(C(=O)/C=C/c3ccc(C(=O)O)cc3)ccc21
MAE: CC1(C)CCC(C)(C)c2cc(C(=O)/C=C/c3ccc(C(=O)O)cc3)ccc21

ORG: [H]/C(C(=O)C1=CC2=C(C=C1)C(C)(C)CCC2(C)C)=C(/[H])C1=CC=C(C(=O)O)C=C1
CAN: CC1(C)CCC(C)(C)c2cc(C(=O)C=Cc3ccc(C(=O)O)cc3)ccc21

CHIMERA DIFFERS
MAESTRO DIFFERS
==============================

5bzg

CHI: C1C[N@H+](C)Cc(c12)cccc2N lig_4X6_unprep_step2.mol2
MAE: C1CN(C)Cc(c12)cccc2N

UNCHARGED
CHI: CN1CCc2c(N)cccc2C1
MAE: C1CN(C)Cc(c12)cccc2N

UNCHARGED CANONICAL/ISOMERIC
CHI: CN1CCc2c(N)cccc2C1
MAE: CN1CCc2c(N)cccc2C1

ORG: CN1CCC2=C(C=CC=C2N)C1
CAN: CN1CCc2c(N)cccc2C1

CHIMERA MATCHES
MAESTRO MATCHES
==============================

5bzq

CHI: C1COCC[NH+]1CCC[N@H+](C2)CCc(c23)cccc3N lig_4XL_unprep_step2.mol2
MAE: C1COCCN1CCCN(C2)CCc(c23)cccc3N

UNCHARGED
CHI: Nc1cccc2c1CN(CCCN1CCOCC1)CC2
MAE: C1COCCN1CCCN(C2)CCc(c23)cccc3N

UNCHARGED CANONICAL/ISOMERIC
CHI: Nc1cccc2c1CN(CCCN1CCOCC1)CC2
MAE: Nc1cccc2c1CN(CCCN1CCOCC1)CC2

ORG: NC1=CC=CC2=C1CN(CCCN1CCOCC1)CC2
CAN: Nc1cccc2c1CN(CCCN1CCOCC1)CC2

CHIMERA MATCHES
MAESTRO MATCHES
==============================

5ii1

CHI: Cc1[nH]nc(c12)oc(=O)c3c2cccc3 lig_6BL_unprep_step2.mol2
MAE: Cc1[nH]nc(c12)oc(=O)c3c2cccc3

UNCHARGED
CHI: Cc1[nH]nc(c12)oc(=O)c3c2cccc3
MAE: Cc1[nH]nc(c12)oc(=O)c3c2cccc3

UNCHARGED CANONICAL/ISOMERIC
CHI: Cc1[nH]nc2oc(=O)c3ccccc3c12
MAE: Cc1[nH]nc2oc(=O)c3ccccc3c12

ORG: CC1=C2C3=CC=CC=C3C(=O)OC2=NN1
CAN: Cc1[nH]nc2oc(=O)c3ccccc3c12

CHIMERA MATCHES
MAESTRO MATCHES
==============================

5bzn

CHI: COCC[N@H+](C1)CCc(c12)cccc2N lig_4XD_unprep_step2.mol2
MAE: COCCN(C1)CCc(c12)cccc2N

UNCHARGED
CHI: COCCN1CCc2cccc(N)c2C1
MAE: COCCN(C1)CCc(c12)cccc2N

UNCHARGED CANONICAL/ISOMERIC
CHI: COCCN1CCc2cccc(N)c2C1
MAE: COCCN1CCc2cccc(N)c2C1

ORG: COCCN1CCC2=C(C1)C(N)=CC=C2
CAN: COCCN1CCc2cccc(N)c2C1

CHIMERA MATCHES
MAESTRO MATCHES
==============================

=============================
SUMMARY
=============================
CHIMERA MATCHES: 24
MAESTRO MATCHES: 20

mkgilson commented 7 years ago

Hi Jeff,

Thanks for this analysis. It's concerning that neither program seems to be very faithful to the input structures. I have a few questions:

You say the pre-release SMILES lacks cis/trans markers, but aren't we starting with InChIs rather than SMILES?

Also, what is the purpose of generating these SMILES strings? My understanding is we need to go from InChI to 3D, where 3D could be an SD file, molfile, mol2, etc. I'm not sure why we need to go through SMILES at all. Sorry for probably forgetting some key step!

If we do need SMILES, it may be worth looking at ChemAxon.

Best,
Mike

On 7/13/2016 12:04 AM, j-wags wrote:

Converted final generated ligands from RDKit/Chimera prep and Maestro Ligprep to SMILES, then compared to the original pdb pre-release SMILES. I verified that this conversion to SMILES yields the same result whether I use babel or schrodinger structconvert. Since chimera and ligprep can choose to charge/not charge resulting ligands, I removed charge using an RDKit SMARTS replacement pattern. Finally, I converted all SMILES to canonical/isomeric form and compared them to the original (from the pdb pre-release). I've found that:

  • The pre-release SMILES string doesn't contain cis/trans isomer markers
  • The RDKit/Chimera tag team is pretty gung-ho about adding hydrogens to protonatable groups
  • The schrodinger proteinprep workflow has fewer exact SMILES matches than chimera. Both fail sometimes, though (e.g. 5f9b). Failures are broken down as follows:

5a5d - both fail - no cis/trans markers in original smiles
5c5a - schrodinger workflow disagrees at 1 chiral center
4ym9 - both fail - no cis/trans markers in original smiles
5h9r - schrodinger workflow disagrees at many, but not all, chiral centers
5bzo - both fail - no cis/trans markers in original smiles
5h9p - schrodinger workflow disagrees at many, but not all, chiral centers
5kix - schrodinger workflow disagrees at 1 chiral center
5f9b - both fail - molecule with 10 chiral centers and numerous ring closures
4zk6 - chimera/rdkit workflow makes a hydroxyl radical (?) and crashes my analysis script
1fcz - both fail - no cis/trans markers in original smiles

So basically there are 6 real failed cases - 1 both, 1 just chimera/rdkit, and 4 just schrodinger

CANONICAL/ISOMERIC
CHI: COCCN1CCc2cccc(N)c2C1
MAE: COCCN1CCc2cccc(N)c2C1

ORG: COCCN1CCC2=C(C1)C(N)=CC=C2
CAN: COCCN1CCc2cccc(N)c2C1

CHIMERA MATCHES
MAESTRO MATCHES
==============================

=============================
SUMMARY
=============================
CHIMERA MATCHES: 24
MAESTRO MATCHES: 20

j-wags commented 7 years ago

The purpose of this test was originally to ensure that the chimera/rdkit pipeline was creating molecules with the correct 3d geometry. We check by taking the 3d results of each prep and converting them to SMILES, which contains their stereo/structural information. Then we compare the SMILES of the 3d molecule to the original to ensure that the right chiralities came out. Since it's trivial to include the schrodinger-prepped molecules, I put them in too.

There may be an easier way to check for chemical identity, but the SMILES comparison route was the first that came to mind. I'm still only 75% sure that there isn't some weird technicality hiding in the conversion routines we're using, but I get a little more assurance from the fact that babel and schrodinger structconvert always yield the same SMILES for each molecule in this set (data not shown). We could further eliminate doubt by testing RDKit as a third contender for 3d-->smiles conversion.

Good call on the inchis - we may be losing something when we do our initial conversion of them to smiles. They'd have their stereo information at the end of the inchi string, after the atoms and before the isotope block. Checking that now.

j-wags commented 7 years ago

(still in progress)

Some inchis on the pdb do have defined cis/trans stereochemistry. The errors that we're seeing are a bit complicated. I'll go through case by case:

5bzo's ligand, 4XC
PDB structure: http://www.rcsb.org/pdb/images/4XC_600.gif
PDB inchi: 1S/C13H19N3O/c1-15-13(17)6-8-16-7-5-10-3-2-4-12(14)11(10)9-16/h2-4H,5-9,14H2,1H3,(H,15,17)
(note: no cis/trans markers at the end)

In this case, the peptide bond on the molecule's tail (between CAM and NAL) becomes double upon any sort of molecule processing (literally any - even CELPP's inchi-->smiles routine moves the bond), and therefore requires a cis/trans definition.
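The "no cis/trans markers" observation can be checked mechanically. The following is a minimal sketch, assuming only that InChI layers are separated by "/" and that the stereo layers carry the prefixes b (double bond) and t/m/s (tetrahedral). The helper names are hypothetical, and the but-2-ene string is an outside illustration, not a ligand from this thread.

```python
# Minimal sketch: split an InChI into layers and pick out the stereo ones.
# Simplification (assumption): only the first occurrence of each layer prefix
# is kept, so repeated layers after /i or /f are ignored.

STEREO_PREFIXES = ("b", "t", "m", "s")  # b = double bond, t/m/s = tetrahedral


def inchi_layers(inchi):
    """Return a dict mapping layer name (or prefix letter) to layer text."""
    body = inchi[len("InChI="):] if inchi.startswith("InChI=") else inchi
    parts = body.split("/")
    layers = {"version": parts[0]}
    if len(parts) > 1:
        layers["formula"] = parts[1]
    for part in parts[2:]:
        if part:
            layers.setdefault(part[0], part[1:])
    return layers


def stereo_layers(inchi):
    """Return only the stereo layers that are present in the InChI."""
    layers = inchi_layers(inchi)
    return {key: layers[key] for key in STEREO_PREFIXES if key in layers}


# The PDB InChI for ligand 4XC, copied from the comment above.
PDB_4XC = ("1S/C13H19N3O/c1-15-13(17)6-8-16-7-5-10-3-2-4-12(14)11(10)9-16"
           "/h2-4H,5-9,14H2,1H3,(H,15,17)")

# Outside illustration: (E)-but-2-ene, which does carry a /b layer.
E_BUT_2_ENE = "InChI=1S/C4H8/c1-3-4-2/h3-4H,1-2H3/b4-3+"

print(stereo_layers(PDB_4XC))
print(stereo_layers(E_BUT_2_ENE))
```

An empty result for 4XC agrees with the note above: the PDB InChI defines no double-bond stereo, so any cis/trans marker in a later SMILES was introduced downstream.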

When exactly did this bond move?

We downloaded the inchi correctly - the one in stage.3.blastnfilter/5bzo.txt matches exactly that on the pdb ligand page. The smiles in stage.4.challengedata/celpp_week26_2016/5bzo/lig_4XC.smi has the bond moved, however the inchi in stage.4.challengedata/celpp_week26_2016/5bzo/lig_4XC.inchi is still identical to the pdb one. Therefore the bond moved in the inchi --> smiles conversion in stage 4.

Was there an error/warning message?

No - The output from this conversion step wasn't piped to log (+1 for the issue ticket for switching out commands.getoutput)
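For reference, a minimal sketch of what capturing tool output into the log could look like once commands.getoutput is replaced. It assumes Python 3.5+ and the standard subprocess and logging modules; run_and_log is a hypothetical helper, not something taken from the D3R code.

```python
import logging
import subprocess
import sys

logger = logging.getLogger(__name__)


def run_and_log(cmd):
    """Run cmd (a list of arguments), log its stdout/stderr, return the result."""
    result = subprocess.run(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE, universal_newlines=True)
    if result.stdout:
        logger.info("stdout from %s:\n%s", cmd[0], result.stdout)
    if result.stderr:
        # Converter warnings usually arrive on stderr, so log them louder.
        logger.warning("stderr from %s:\n%s", cmd[0], result.stderr)
    return result


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    # Stand-in command that writes to both streams.
    run_and_log([sys.executable, "-c",
                 "import sys; print('1 molecule converted'); "
                 "sys.stderr.write('warning text')"])
```

Keeping stdout and stderr separate means a warning from a conversion step ends up in the log even when the step itself reports success.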

What was the exact code that did this step? In genchallengedata.py:

def generate_ligand (inchi, ligand_title, ):
    valid_inchi = "InChI="+inchi
    rd_mol = Chem.MolFromInchi(format(valid_inchi), removeHs=False, sanitize=False, treatWarningAsError=True)
    smiles = Chem.MolToSmiles(rd_mol, isomericSmiles=True)

Currently playing with this code to see where exactly the switch happens.

j-wags commented 7 years ago

Followup thought - Does this even matter? A good molecule interpreter should see the resonance forms in this situation, so the docking FF shouldn't care. A conformer generator might though. But the bond is partially double, so it should be planar. This may be a good discussion point in the celpp meeting.

mkgilson commented 7 years ago

Hi Jeff

What do you mean when you say the bond was "moved"?

Also, it looks like Chem.MolFromInchi is a library call of some sort. If so, what is the library?

thanks Mike

Michael K. Gilson, M.D., Ph.D. Professor, Skaggs School of Pharmacy and Pharmaceutical Sciences Co-Director, UCSD Center for Drug Discovery Innovation U. C. San Diego 9500 Gilman Drive Pharmaceutical Sciences Building, Room 3224 La Jolla, CA 92093-0736 Voice: 858-822-0622 Fax: 858-822-7726 http://gilson.ucsd.edu http://www.bindingdb.org http://drugdiscovery.ucsd.edu http://drugdesigndata.org

shuail commented 7 years ago

Hi all,

I wrote the code below:

def generate_ligand (inchi, ligand_title, ):
    valid_inchi = "InChI="+inchi
    rd_mol = Chem.MolFromInchi(format(valid_inchi), removeHs=False, sanitize=False, treatWarningAsError=True)
    smiles = Chem.MolToSmiles(rd_mol, isomericSmiles=True)

It uses RDKit to convert the inchi to an RDKit mol object and then write out the SMILES string. Yes, I saw some cases where the final SMILES string has undefined stereo information. I am now trying to find a way to check whether the given inchi string itself also has this undefined stereo information.

Thanks, Shuai

mkgilson commented 7 years ago

Thanks, Shuai. Mike

j-wags commented 7 years ago

Shuai and I just looked into this. It comes down to how the inchi and smiles conventions consider conformational isomerism around peptide bonds. Essentially, the question comes down to where the double bond is drawn (C=O or C=N) in a peptide bond's smiles string.

If the peptide bond smiles were written with the double bond at C=O, there is no need (according to the smiles format) to define cis/trans stereochemistry. If the peptide bond smiles is written as C=N, then it is possible to indicate stereochemistry.

The original PDB inchi (by convention) does not contain explicit bond order information, and it chooses not to put a cis/trans identifier on the peptide bond. The smiles we release (for some reason) puts the double bond on C=N, but does not put cis/trans markers on the bond, so this should be equivalent.

Since this test was generating smiles from 3d structures, and both of our 3d-->smiles converters chose to put the double bond at C=N, they DID contain a cis/trans identifier. That's why the exact smiles-matching test failed for the 5bzo ligands.

This does raise the question of what to do if we run into undefined stereochemistry in pdb pre-releases (or maybe this will never happen). I'll look into the other cis/trans failures in these test cases.

mkgilson commented 7 years ago

Though peptides don't like to flip cis/trans, they can do it. So, maybe instead of trying to resolve this at the cheminformatics level, we could include peptide cis/trans isomerism in any conformational analysis.

A somewhat related issue is: what are we doing with ring puckers? Of course these wouldn't, and shouldn't, be defined in Inchi or smiles, but they can require careful attention when one is preparing ligands to be docked.

Best, Mike

j-wags commented 7 years ago

Agreed. A good SMILES/molecule interpreter should recognize and know what to do with peptide bonds, regardless of how they're written. I think that the 5bzo problem is closed as "not our problem".

WRT ring puckers and conformer generation: right now we're generating one state of each molecule for docking input. That is, one conformation and one charge state. Schrodinger/GLIDE may do some limited conf. gen during docking, and I'm pretty sure that ligprep has flags if we want to start docking multiple conformers/ionizations. Both charge state and conformer generation for the vina workflow are still an outstanding issue - the docking is flexible (as far as it can identify rotatable torsions), but we do not generate diverse initial conformers or more than one charge state.

Organizationally, we may want to start keeping conf. gen notes under another issue ticket. More broadly, we may consider if we want this issue tracker to continue being a discussion place for scientific questions.

mkgilson commented 7 years ago

Makes sense.

Regarding the ticket-- your call!

Thanks, Mike

j-wags commented 7 years ago

5a5d/5LC fails because it defined stereo around a peptide bond (cosmetic difference)
4ym9/4E4 fails because it defined stereo around a peptide bond (cosmetic difference)
1fcz/156 is an interesting case. It has a defined cis/trans stereocenter (in the middle of a linker with resonance... ugh!). Will look into this after the meeting.
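A "cosmetic" mismatch of this kind can be flagged automatically. The sketch below is a plain string check, not a chemistry-aware one: it calls a mismatch cis/trans-only when the two SMILES become identical once the "/" and "\" bond-direction markers are stripped. That is only meaningful when both strings came out of the same canonicalizer, so the atom order is the same. The function names are hypothetical; the strings are the 5a5d CHI and CAN lines from the comparison output.

```python
def strip_bond_direction(smiles):
    """Remove the '/' and '\\' cis/trans markers from a SMILES string."""
    return smiles.replace("/", "").replace("\\", "")


def is_cis_trans_only_difference(prepared, reference):
    """True if the strings differ but match once direction markers are gone."""
    return (prepared != reference and
            strip_bond_direction(prepared) == strip_bond_direction(reference))


# 5a5d, canonical/isomeric block: CHI = chimera/rdkit prep, CAN = reference.
chi = r"O/C(=N\CCCCC/N=C(\O)c1cccc(O)c1O)c1cccc(O)c1O"
can = "OC(=NCCCCCN=C(O)c1cccc(O)c1O)c1cccc(O)c1O"

print(is_cis_trans_only_difference(chi, can))
```

A check like this does not say whether the extra markers are chemically right, only that they are the sole source of the mismatch.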

mkgilson commented 7 years ago

Hi Jeff,

All these stereo centers are cis/trans?

thx Mike

j-wags commented 7 years ago

Summary: The 1fcz rabbit hole goes kinda deep. It is ultimately due to a problem with my testing framework, caused by a bug in rdkit that was fixed in the March 2016 release. The rdkit on nif1 is the 2015.03.1 version. We need to update rdkit to 2016.03.1 or .2.
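A guard against running with the older release could look like the sketch below. It assumes release strings of the form YYYY.MM.patch, as in the two versions named above, and treats 2016.03.1 as the minimum. The helper names are hypothetical; in practice the installed string would presumably come from rdkit.__version__, which is not imported here.

```python
import re

MINIMUM_RDKIT = "2016.03.1"  # first release named above as containing the fix


def parse_release(version):
    """Turn '2015.03.1' into (2015, 3, 1), stopping at a non-numeric field."""
    fields = []
    for field in version.split("."):
        match = re.match(r"\d+", field)
        if match is None:
            break
        fields.append(int(match.group()))
    return tuple(fields)


def is_new_enough(installed, minimum=MINIMUM_RDKIT):
    """Compare two release strings field by field."""
    return parse_release(installed) >= parse_release(minimum)


print(is_new_enough("2015.03.1"))  # the version reported on nif1
print(is_new_enough("2016.03.2"))
```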

Details: In the case of the 1fcz lig "156" (http://www.rcsb.org/pdb/ligand/ligandsummary.do?hetId=156&sid=1FCZ), our internal standard, the original inchi has defined stereochemistry about a double bond.

(I know, looking at the structure, that entire linker is probably resonant, but let's just pretend that we want to get an accurate smiles string of the pdb website molecule)

How did we lose track of the bond?

After much investigation, I've concluded that this was caused by hydrogen removal. Hydrogen removal, in itself, shouldn't have deleted the stereobond information. However, the stereobond information in a smiles string is attached to the 1 and 4 atoms (if we index the stereobond like a dihedral, defined by atoms 1-2-3-4). Our inchi-->smiles conversion (done in rdkit) decided to put the stereobond information on a pair of explicit H's. This isn't wrong - It's equally valid to put the cis/trans labels ("/" or "\") on any 1-4 atoms about the double bond.

156, representing trans conformation with "/"s on explicit hydrogens:
[H]/C(C(=O)c1ccc2c(c1)C(C)(C)CCC2(C)C)=C(/[H])c1ccc(C(=O)O)cc1

156 (the same molecule), representing the trans conformation with "/"s on the bordering heavy atoms:
CC1(C)CCC(C)(C)c2cc(C(=O)/C=C/c3ccc(C(=O)O)cc3)ccc21

Here's where things went wrong, though - in my testing framework, and in chimera_proteinligprep using the 2015.03.1 version of rdkit, sanitization and canonicalization of the smiles (for more complicated reasons) would lose these hydrogens, and throw the stereobond information out with them.

This problem was noted in February of 2016 on the rdkit bug tracker and fixed. https://github.com/rdkit/rdkit/issues/754

In this case, chimera_proteinligprep still decided to generate that double bond as trans, but it didn't have the explicit instructions to do so. Until we update rdkit, we may lose information there.
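Until the update lands, SMILES that are exposed to this failure mode can be spotted with a textual heuristic. The sketch below counts bond-direction markers that sit directly next to an explicit [H]; if any exist, dropping the hydrogens could take cis/trans information with them. It is a regex-based assumption about how such SMILES are written, not an RDKit call, and the function names are hypothetical.

```python
import re

# A direction marker written directly between an explicit [H] and its neighbour.
_H_MARKER = re.compile(r"\[H\][/\\]|[/\\]\[H\]")


def markers_on_explicit_h(smiles):
    """Count '/' or '\\' markers that are attached to an explicit [H]."""
    return len(_H_MARKER.findall(smiles))


def stereo_depends_on_h(smiles):
    """True if at least one cis/trans marker would vanish with the hydrogens."""
    return markers_on_explicit_h(smiles) > 0


# The two representations of ligand 156 quoted above.
with_h = "[H]/C(C(=O)c1ccc2c(c1)C(C)(C)CCC2(C)C)=C(/[H])c1ccc(C(=O)O)cc1"
heavy = "CC1(C)CCC(C)(C)c2cc(C(=O)/C=C/c3ccc(C(=O)O)cc3)ccc21"

print(markers_on_explicit_h(with_h), stereo_depends_on_h(with_h))
print(markers_on_explicit_h(heavy), stereo_depends_on_h(heavy))
```

The first form is flagged and the second is not, which matches the account above of where the stereobond information was attached.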

j-wags commented 7 years ago

Mike, I've been thinking about that question and what the results would imply.

The three peptide bonds in our tests were trans. Energetically speaking, I think that something like 99.9% of peptide bonds should be trans. This doesn't exclude the possibility that a ligand could be cis (and the weirdness of a cis peptide ligand may attract more scientific attention), so there may be a handful of such ligands in the pdb. We would want to pass on that information in the smiles if it were the case.
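To put the 99.9% figure in energy terms, a two-state Boltzmann estimate gives the free-energy gap that such a population split would imply. This is a back-of-the-envelope sketch, assuming two states at 298.15 K and taking the 99.9% guess at face value; it is not a measured value for any ligand here.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def free_energy_gap(trans_fraction, temperature=298.15):
    """Free-energy difference (kcal/mol) implied by a two-state population."""
    ratio = trans_fraction / (1.0 - trans_fraction)
    return R_KCAL * temperature * math.log(ratio)


print(round(free_energy_gap(0.999), 2))
```

A 99.9/0.1 split corresponds to a gap of roughly 4 kcal/mol, which is small enough that a binding site could plausibly pay for a cis ligand.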

Without an example molecule, I can't say whether our workflow will gracefully handle a cis peptide bond. I'm going to move on to other tickets right now, but we may want to reopen this issue later.

mkgilson commented 7 years ago

who woulda thunk cheminformatics could be so complicated. Thanks for figuring this out!

Mike

j-wags commented 7 years ago

Taking a break from this ticket for now. Conclusions for when I pick this back up are:

Therefore, of the original list, the remaining to-do list on this ticket is:

5f9b - disagreement in stereochemistry for molecule with 10 chiral centers and numerous ring closures
4zk6 - chimera/rdkit workflow makes a hydroxyl radical (?) and crashes my analysis script

j-wags commented 7 years ago

Summary: I've re-tested using the most current versions of the code. The new version of rdkit/chimera_proteinligprep.py gets 28(-1)/29 cases correct: it generates incorrect steroid chirality on 5f9b, and the (-1) is because it messes up so badly on 4zk6 that the scorer throws it out of the contest, so really that's another failure. Maestro now appears to get 29/29 ligand generation cases right. I'm closing this ticket for now and opening a more concise ticket for 4zk6 and 5f9b.
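Tallies like these can be recomputed from the comparison report rather than counted by hand. The sketch below assumes the report layout seen in this thread, where each record ends with a "CHIMERA MATCHES" or "CHIMERA DIFFERS" line and the same for MAESTRO. compareToMaestro.py itself is not shown here, so tally_verdicts is a hypothetical helper.

```python
from collections import Counter


def tally_verdicts(report_text):
    """Count per-tool MATCHES/DIFFERS verdict lines in a comparison report."""
    counts = Counter()
    for line in report_text.splitlines():
        words = line.split()
        # Verdict lines are exactly two words; summary lines such as
        # "CHIMERA MATCHES: 24" have three and are skipped.
        if (len(words) == 2 and words[0] in ("CHIMERA", "MAESTRO")
                and words[1] in ("MATCHES", "DIFFERS")):
            counts[(words[0], words[1])] += 1
    return counts


sample = """5a5d
CHIMERA DIFFERS
MAESTRO DIFFERS
==============================
5c5a
CHIMERA MATCHES
MAESTRO MATCHES
==============================
"""
for (tool, verdict), n in sorted(tally_verdicts(sample).items()):
    print(tool, verdict, n)
```

Records that the scorer skips, as with 4zk6, produce no verdict line and so have to be counted separately.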

Detailed data dump: Note that 3 of the "failures" reported below are not real failures; they are due to peptide bonds. Also note that chimera_proteinligprep.py messes up so badly on 4zk6 that the scorer skips it (that's why only 29 molecules are attempted).

j5wagner@nif1$ pwd
/var/home/j5wagner/2016_06_30_bakerLabOHPTest/week26Test_jul27
j5wagner@nif1$ /var/home/j5wagner/miniconda2/bin/python compareToMaestro.py 
5a5d

CHI: c1ccc(O)c(O)c1/C(O)=N/CCCCC\N=C(O)\c2c(O)c(O)ccc2 lig_5LC_unprep_step2.mol2
MAE: c1ccc(O)c(O)c1/C(O)=N\CCCCC/N=C(O)\c2c(O)c(O)ccc2

UNCHARGED
CHI: c1ccc(O)c(O)c1/C(O)=N/CCCCC\N=C(O)\c2c(O)c(O)ccc2
MAE: c1ccc(O)c(O)c1/C(O)=N\CCCCC/N=C(O)\c2c(O)c(O)ccc2

UNCHARGED CANONICAL/ISOMERIC
CHI: O/C(=N\CCCCC/N=C(\O)c1cccc(O)c1O)c1cccc(O)c1O
MAE: O/C(=N/CCCCC/N=C(/O)c1cccc(O)c1O)c1cccc(O)c1O

ORG: OC1=CC=CC(C(O)=NCCCCCN=C(O)C2=C(O)C(O)=CC=C2)=C1O
CAN: OC(=NCCCCCN=C(O)c1cccc(O)c1O)c1cccc(O)c1O

CHIMERA DIFFERS
MAESTRO DIFFERS
==============================

5c5a

CHI: OC1=NCCN(C1)C(=O)N2[C@H](c3ccc(Cl)cc3)[C@H](c4ccc(Cl)cc4)N=C2c5c(OC(C)C)cc(cc5)OC lig_NUT_unprep_step2.mol2
MAE: OC1=NCCN(C1)C(=O)N2[C@H](c3ccc(Cl)cc3)[C@H](c4ccc(Cl)cc4)N=C2c5c(OC(C)C)cc(cc5)OC

UNCHARGED
CHI: OC1=NCCN(C1)C(=O)N2[C@H](c3ccc(Cl)cc3)[C@H](c4ccc(Cl)cc4)N=C2c5c(OC(C)C)cc(cc5)OC
MAE: OC1=NCCN(C1)C(=O)N2[C@H](c3ccc(Cl)cc3)[C@H](c4ccc(Cl)cc4)N=C2c5c(OC(C)C)cc(cc5)OC

UNCHARGED CANONICAL/ISOMERIC
CHI: COc1ccc(C2=N[C@@H](c3ccc(Cl)cc3)[C@@H](c3ccc(Cl)cc3)N2C(=O)N2CCN=C(O)C2)c(OC(C)C)c1
MAE: COc1ccc(C2=N[C@@H](c3ccc(Cl)cc3)[C@@H](c3ccc(Cl)cc3)N2C(=O)N2CCN=C(O)C2)c(OC(C)C)c1

ORG: [H][C@@]1(C2=CC=C(Cl)C=C2)N=C(C2=C(OC(C)C)C=C(OC)C=C2)N(C(=O)N2CCN=C(O)C2)[C@]1([H])C1=CC=C(Cl)C=C1
CAN: COc1ccc(C2=N[C@@H](c3ccc(Cl)cc3)[C@@H](c3ccc(Cl)cc3)N2C(=O)N2CCN=C(O)C2)c(OC(C)C)c1

CHIMERA MATCHES
MAESTRO MATCHES
==============================

4ym9

CHI: CCC(CC)(CO)/C(O)=N/c(cn1)ccc1C lig_4E4_unprep_step2.mol2
MAE: CCC(CC)(CO)/C(O)=N\c(cn1)ccc1C

UNCHARGED
CHI: CCC(CC)(CO)/C(O)=N/c(cn1)ccc1C
MAE: CCC(CC)(CO)/C(O)=N\c(cn1)ccc1C

UNCHARGED CANONICAL/ISOMERIC
CHI: CCC(CC)(CO)/C(O)=N/c1ccc(C)nc1
MAE: CCC(CC)(CO)/C(O)=N\c1ccc(C)nc1

ORG: CCC(CC)(CO)C(O)=NC1=CN=C(C)C=C1
CAN: CCC(CC)(CO)C(O)=Nc1ccc(C)nc1

CHIMERA DIFFERS
MAESTRO DIFFERS
==============================

5h9r

CHI: c1ccc(F)cc1-c2cn(nn2)[C@@H]([C@@H](O)[C@H]3CO)[C@@H](O)[C@@H](O3)S[C@H](O4)[C@H](O)[C@@H](O)[C@@H](O)[C@H]4CO lig_TGZ_unprep_step2.mol2
MAE: c1ccc(F)cc1-c2cn(nn2)[C@@H]([C@@H](O)[C@H]3CO)[C@@H](O)[C@@H](O3)S[C@H](O4)[C@H](O)[C@@H](O)[C@@H](O)[C@H]4CO

UNCHARGED
CHI: c1ccc(F)cc1-c2cn(nn2)[C@@H]([C@@H](O)[C@H]3CO)[C@@H](O)[C@@H](O3)S[C@H](O4)[C@H](O)[C@@H](O)[C@@H](O)[C@H]4CO
MAE: c1ccc(F)cc1-c2cn(nn2)[C@@H]([C@@H](O)[C@H]3CO)[C@@H](O)[C@@H](O3)S[C@H](O4)[C@H](O)[C@@H](O)[C@@H](O)[C@H]4CO

UNCHARGED CANONICAL/ISOMERIC
CHI: OC[C@H]1O[C@@H](S[C@@H]2O[C@H](CO)[C@H](O)[C@H](n3cc(-c4cccc(F)c4)nn3)[C@H]2O)[C@H](O)[C@@H](O)[C@H]1O
MAE: OC[C@H]1O[C@@H](S[C@@H]2O[C@H](CO)[C@H](O)[C@H](n3cc(-c4cccc(F)c4)nn3)[C@H]2O)[C@H](O)[C@@H](O)[C@H]1O

ORG: [H][C@@]1(S[C@]2([H])O[C@]([H])(CO)[C@]([H])(O)[C@]([H])(N3C=C(C4=CC(F)=CC=C4)N=N3)[C@@]2([H])O)O[C@]([H])(CO)[C@]([H])(O)[C@]([H])(O)[C@@]1([H])O
CAN: OC[C@H]1O[C@@H](S[C@@H]2O[C@H](CO)[C@H](O)[C@H](n3cc(-c4cccc(F)c4)nn3)[C@H]2O)[C@H](O)[C@@H](O)[C@H]1O

CHIMERA MATCHES
MAESTRO MATCHES
==============================

5bzl

CHI: COC(=O)CC[N@H+](C1)CCc(c12)cccc2N lig_4WO_unprep_step2.mol2
MAE: COC(=O)CCN(C1)CCc(c12)cccc2N

UNCHARGED
CHI: COC(=O)CCN1CCc2cccc(N)c2C1
MAE: COC(=O)CCN(C1)CCc(c12)cccc2N

UNCHARGED CANONICAL/ISOMERIC
CHI: COC(=O)CCN1CCc2cccc(N)c2C1
MAE: COC(=O)CCN1CCc2cccc(N)c2C1

ORG: COC(=O)CCN1CCC2=C(C1)C(N)=CC=C2
CAN: COC(=O)CCN1CCc2cccc(N)c2C1

CHIMERA MATCHES
MAESTRO MATCHES
==============================

5bzi

CHI: C1[NH2+]CCc(c12)cccc2N lig_4WU_unprep_step2.mol2
MAE: C1NCCc(c12)cccc2N

UNCHARGED
CHI: Nc1cccc2c1CNCC2
MAE: C1NCCc(c12)cccc2N

UNCHARGED CANONICAL/ISOMERIC
CHI: Nc1cccc2c1CNCC2
MAE: Nc1cccc2c1CNCC2

ORG: NC1=CC=CC2=C1CNCC2
CAN: Nc1cccc2c1CNCC2

CHIMERA MATCHES
MAESTRO MATCHES
==============================

5bzk

CHI: COC(=O)CC[N@H+](CC1)Cc(c12)cccc2 lig_4WP_unprep_step2.mol2
MAE: COC(=O)CCN(CC1)Cc(c12)cccc2

UNCHARGED
CHI: COC(=O)CCN1CCc2ccccc2C1
MAE: COC(=O)CCN(CC1)Cc(c12)cccc2

UNCHARGED CANONICAL/ISOMERIC
CHI: COC(=O)CCN1CCc2ccccc2C1
MAE: COC(=O)CCN1CCc2ccccc2C1

ORG: COC(=O)CCN1CCC2=CC=CC=C2C1
CAN: COC(=O)CCN1CCc2ccccc2C1

CHIMERA MATCHES
MAESTRO MATCHES
==============================

5bzt

CHI: COCC(COC)[N@H+](C1)CCc(c12)cccc2N lig_4XJ_unprep_step2.mol2
MAE: COCC(COC)N(C1)CCc(c12)cccc2N

UNCHARGED
CHI: COCC(COC)N1CCc2cccc(N)c2C1
MAE: COCC(COC)N(C1)CCc(c12)cccc2N

UNCHARGED CANONICAL/ISOMERIC
CHI: COCC(COC)N1CCc2cccc(N)c2C1
MAE: COCC(COC)N1CCc2cccc(N)c2C1

ORG: COCC(COC)N1CCC2=C(C1)C(N)=CC=C2
CAN: COCC(COC)N1CCc2cccc(N)c2C1

CHIMERA MATCHES
MAESTRO MATCHES
==============================

5hvt

CHI: COc(cc1)ccc1N(C2)C(=O)Oc(c23)cc(O)cc3 lig_NVS_unprep_step2.mol2
MAE: COc(cc1)ccc1N(C2)C(=O)Oc(c23)cc(O)cc3

UNCHARGED
CHI: COc(cc1)ccc1N(C2)C(=O)Oc(c23)cc(O)cc3
MAE: COc(cc1)ccc1N(C2)C(=O)Oc(c23)cc(O)cc3

UNCHARGED CANONICAL/ISOMERIC
CHI: COc1ccc(N2Cc3ccc(O)cc3OC2=O)cc1
MAE: COc1ccc(N2Cc3ccc(O)cc3OC2=O)cc1

ORG: COC1=CC=C(N2CC3=C(C=C(O)C=C3)OC2=O)C=C1
CAN: COc1ccc(N2Cc3ccc(O)cc3OC2=O)cc1

CHIMERA MATCHES
MAESTRO MATCHES
==============================

5bzo

CHI: C\N=C(O)\CC[N@@H+](C1)CCc(c12)cccc2N lig_4XC_unprep_step2.mol2
MAE: C/N=C(O)\CCN(C1)CCc(c12)cccc2N

UNCHARGED
CHI: C/N=C(\O)CCN1CCc2cccc(N)c2C1
MAE: C/N=C(O)\CCN(C1)CCc(c12)cccc2N

UNCHARGED CANONICAL/ISOMERIC
CHI: C/N=C(\O)CCN1CCc2cccc(N)c2C1
MAE: C/N=C(/O)CCN1CCc2cccc(N)c2C1

ORG: CN=C(O)CCN1CCC2=C(C1)C(N)=CC=C2
CAN: CN=C(O)CCN1CCc2cccc(N)c2C1

CHIMERA DIFFERS
MAESTRO DIFFERS
==============================

5bzs

CHI: C1COCCC1OCCC[N@@H+](CC2)Cc(c23)cccc3N lig_4XK_unprep_step2.mol2
MAE: C1COCCC1OCCCN(CC2)Cc(c23)cccc3N

UNCHARGED
CHI: Nc1cccc2c1CCN(CCCOC1CCOCC1)C2
MAE: C1COCCC1OCCCN(CC2)Cc(c23)cccc3N

UNCHARGED CANONICAL/ISOMERIC
CHI: Nc1cccc2c1CCN(CCCOC1CCOCC1)C2
MAE: Nc1cccc2c1CCN(CCCOC1CCOCC1)C2

ORG: NC1=CC=CC2=C1CCN(CCCOC1CCOCC1)C2
CAN: Nc1cccc2c1CCN(CCCOC1CCOCC1)C2

CHIMERA MATCHES
MAESTRO MATCHES
==============================

5h9p

CHI: c1ccc(F)cc1-c2cn(nn2)[C@@H]([C@@H](O)[C@H]3CO)[C@@H](O)[C@@H](O3)S[C@H](O4)[C@H](O)[C@H]([C@@H](O)[C@H]4CO)n(nn5)cc5-c6cc(F)ccc6 lig_TD2_unprep_step2.mol2
MAE: c1ccc(F)cc1-c2cn(nn2)[C@@H]([C@@H](O)[C@H]3CO)[C@@H](O)[C@@H](O3)S[C@H](O4)[C@H](O)[C@H]([C@@H](O)[C@H]4CO)n(nn5)cc5-c6cc(F)ccc6

UNCHARGED
CHI: c1ccc(F)cc1-c2cn(nn2)[C@@H]([C@@H](O)[C@H]3CO)[C@@H](O)[C@@H](O3)S[C@H](O4)[C@H](O)[C@H]([C@@H](O)[C@H]4CO)n(nn5)cc5-c6cc(F)ccc6
MAE: c1ccc(F)cc1-c2cn(nn2)[C@@H]([C@@H](O)[C@H]3CO)[C@@H](O)[C@@H](O3)S[C@H](O4)[C@H](O)[C@H]([C@@H](O)[C@H]4CO)n(nn5)cc5-c6cc(F)ccc6

UNCHARGED CANONICAL/ISOMERIC
CHI: OC[C@H]1O[C@@H](S[C@@H]2O[C@H](CO)[C@H](O)[C@H](n3cc(-c4cccc(F)c4)nn3)[C@H]2O)[C@H](O)[C@@H](n2cc(-c3cccc(F)c3)nn2)[C@H]1O
MAE: OC[C@H]1O[C@@H](S[C@@H]2O[C@H](CO)[C@H](O)[C@H](n3cc(-c4cccc(F)c4)nn3)[C@H]2O)[C@H](O)[C@@H](n2cc(-c3cccc(F)c3)nn2)[C@H]1O

ORG: [H][C@@]1(S[C@]2([H])O[C@]([H])(CO)[C@]([H])(O)[C@]([H])(N3C=C(C4=CC(F)=CC=C4)N=N3)[C@@]2([H])O)O[C@]([H])(CO)[C@]([H])(O)[C@]([H])(N2C=C(C3=CC(F)=CC=C3)N=N2)[C@@]1([H])O
CAN: OC[C@H]1O[C@@H](S[C@@H]2O[C@H](CO)[C@H](O)[C@H](n3cc(-c4cccc(F)c4)nn3)[C@H]2O)[C@H](O)[C@@H](n2cc(-c3cccc(F)c3)nn2)[C@H]1O

CHIMERA MATCHES
MAESTRO MATCHES
==============================

5ii2

CHI: Oc1cc(O)cc(c12)oc(cc2=O)-c3cc(O)c(O)cc3 lig_LU2_unprep_step2.mol2
MAE: Oc1cc(O)cc(c12)oc(cc2=O)-c3cc(O)c(O)cc3

UNCHARGED
CHI: Oc1cc(O)cc(c12)oc(cc2=O)-c3cc(O)c(O)cc3
MAE: Oc1cc(O)cc(c12)oc(cc2=O)-c3cc(O)c(O)cc3

UNCHARGED CANONICAL/ISOMERIC
CHI: O=c1cc(-c2ccc(O)c(O)c2)oc2cc(O)cc(O)c12
MAE: O=c1cc(-c2ccc(O)c(O)c2)oc2cc(O)cc(O)c12

ORG: O=C1C=C(C2=CC(O)=C(O)C=C2)OC2=CC(O)=CC(O)=C12
CAN: O=c1cc(-c2ccc(O)c(O)c2)oc2cc(O)cc(O)c12

CHIMERA MATCHES
MAESTRO MATCHES
==============================

5kcv

CHI: C1CCC1([NH3+])c2ccc(cc2)-n(c(n3)-c4c(N)nccc4)c(c35)nc(cc5)-c6ccccc6 lig_6S1_unprep_step2.mol2
MAE: C1CCC1(N)c2ccc(cc2)-n(c(n3)-c4c(N)nccc4)c(c35)nc(cc5)-c6ccccc6

UNCHARGED
CHI: Nc1ncccc1-c1nc2ccc(-c3ccccc3)nc2n1-c1ccc(C2(N)CCC2)cc1
MAE: C1CCC1(N)c2ccc(cc2)-n(c(n3)-c4c(N)nccc4)c(c35)nc(cc5)-c6ccccc6

UNCHARGED CANONICAL/ISOMERIC
CHI: Nc1ncccc1-c1nc2ccc(-c3ccccc3)nc2n1-c1ccc(C2(N)CCC2)cc1
MAE: Nc1ncccc1-c1nc2ccc(-c3ccccc3)nc2n1-c1ccc(C2(N)CCC2)cc1

ORG: NC1=C(C2=NC3=C(N=C(C4=CC=CC=C4)C=C3)N2C2=CC=C(C3(N)CCC3)C=C2)C=CC=N1
CAN: Nc1ncccc1-c1nc2ccc(-c3ccccc3)nc2n1-c1ccc(C2(N)CCC2)cc1

CHIMERA MATCHES
MAESTRO MATCHES
==============================

5bzm

CHI: Nc1ccnc(c1C)C[N@H+](C2)CCc(c23)cccc3N lig_4X8_unprep_step2.mol2
MAE: Nc1ccnc(c1C)CN(C2)CCc(c23)cccc3N

UNCHARGED
CHI: Cc1c(N)ccnc1CN1CCc2cccc(N)c2C1
MAE: Nc1ccnc(c1C)CN(C2)CCc(c23)cccc3N

UNCHARGED CANONICAL/ISOMERIC
CHI: Cc1c(N)ccnc1CN1CCc2cccc(N)c2C1
MAE: Cc1c(N)ccnc1CN1CCc2cccc(N)c2C1

ORG: CC1=C(N)C=CN=C1CN1CCC2=C(C1)C(N)=CC=C2
CAN: Cc1c(N)ccnc1CN1CCc2cccc(N)c2C1

CHIMERA MATCHES
MAESTRO MATCHES
==============================

5bzp

CHI: C[NH+](C)CCC[N@@H+](C1)CCc(c12)cccc2N lig_4XG_unprep_step2.mol2
MAE: CN(C)CCCN(C1)CCc(c12)cccc2N

UNCHARGED
CHI: CN(C)CCCN1CCc2cccc(N)c2C1
MAE: CN(C)CCCN(C1)CCc(c12)cccc2N

UNCHARGED CANONICAL/ISOMERIC
CHI: CN(C)CCCN1CCc2cccc(N)c2C1
MAE: CN(C)CCCN1CCc2cccc(N)c2C1

ORG: CN(C)CCCN1CCC2=C(C1)C(N)=CC=C2
CAN: CN(C)CCCN1CCc2cccc(N)c2C1

CHIMERA MATCHES
MAESTRO MATCHES
==============================

5kix

CHI: O=P(O)(O)OC[C@@H]1[C@@H](OP(=O)(O)O)C[C@@H](O1)n(c2)c(=O)nc(O)c2C lig_THP_unprep_step2.mol2
MAE: O=P(O)(O)OC[C@@H]1[C@@H](OP(=O)(O)O)C[C@@H](O1)n(c2)c(=O)nc(O)c2C

UNCHARGED
CHI: O=P(O)(O)OC[C@@H]1[C@@H](OP(=O)(O)O)C[C@@H](O1)n(c2)c(=O)nc(O)c2C
MAE: O=P(O)(O)OC[C@@H]1[C@@H](OP(=O)(O)O)C[C@@H](O1)n(c2)c(=O)nc(O)c2C

UNCHARGED CANONICAL/ISOMERIC
CHI: Cc1cn([C@H]2C[C@H](OP(=O)(O)O)[C@@H](COP(=O)(O)O)O2)c(=O)nc1O
MAE: Cc1cn([C@H]2C[C@H](OP(=O)(O)O)[C@@H](COP(=O)(O)O)O2)c(=O)nc1O

ORG: [H][C@]1(OP(=O)(O)O)C[C@]([H])(N2C=C(C)C(O)=NC2=O)O[C@]1([H])COP(=O)(O)O
CAN: Cc1cn([C@H]2C[C@H](OP(=O)(O)O)[C@@H](COP(=O)(O)O)O2)c(=O)nc1O

CHIMERA MATCHES
MAESTRO MATCHES
==============================

5bzr

CHI: C1COCCC1OCCC[N@H+](C2)CCc(c23)cccc3N lig_4XM_unprep_step2.mol2
MAE: C1COCCC1OCCCN(C2)CCc(c23)cccc3N

UNCHARGED
CHI: Nc1cccc2c1CN(CCCOC1CCOCC1)CC2
MAE: C1COCCC1OCCCN(C2)CCc(c23)cccc3N

UNCHARGED CANONICAL/ISOMERIC
CHI: Nc1cccc2c1CN(CCCOC1CCOCC1)CC2
MAE: Nc1cccc2c1CN(CCCOC1CCOCC1)CC2

ORG: NC1=CC=CC2=C1CN(CCCOC1CCOCC1)CC2
CAN: Nc1cccc2c1CN(CCCOC1CCOCC1)CC2

CHIMERA MATCHES
MAESTRO MATCHES
==============================

5bzf

CHI: CC[N@@H+](CC1)Cc(c12)cccc2N lig_4X3_unprep_step2.mol2
MAE: CCN(CC1)Cc(c12)cccc2N

UNCHARGED
CHI: CCN1CCc2c(N)cccc2C1
MAE: CCN(CC1)Cc(c12)cccc2N

UNCHARGED CANONICAL/ISOMERIC
CHI: CCN1CCc2c(N)cccc2C1
MAE: CCN1CCc2c(N)cccc2C1

ORG: CCN1CCC2=C(C=CC=C2N)C1
CAN: CCN1CCc2c(N)cccc2C1

CHIMERA MATCHES
MAESTRO MATCHES
==============================

5bzj

CHI: Cn1ccnc1CC[N@@H+](C2)CCc(c23)cccc3N lig_4WN_unprep_step2.mol2
MAE: Cn1ccnc1CCN(C2)CCc(c23)cccc3N

UNCHARGED
CHI: Cn1ccnc1CCN1CCc2cccc(N)c2C1
MAE: Cn1ccnc1CCN(C2)CCc(c23)cccc3N

UNCHARGED CANONICAL/ISOMERIC
CHI: Cn1ccnc1CCN1CCc2cccc(N)c2C1
MAE: Cn1ccnc1CCN1CCc2cccc(N)c2C1

ORG: CN1C=CN=C1CCN1CCC2=C(C1)C(N)=CC=C2
CAN: Cn1ccnc1CCN1CCc2cccc(N)c2C1

CHIMERA MATCHES
MAESTRO MATCHES
==============================

5bzh

CHI: [NH3+]CCC[N@@H+](CC1)Cc(c12)cccc2 lig_4X1_unprep_step2.mol2
MAE: NCCCN(CC1)Cc(c12)cccc2

UNCHARGED
CHI: NCCCN1CCc2ccccc2C1
MAE: NCCCN(CC1)Cc(c12)cccc2

UNCHARGED CANONICAL/ISOMERIC
CHI: NCCCN1CCc2ccccc2C1
MAE: NCCCN1CCc2ccccc2C1

ORG: NCCCN1CCC2=CC=CC=C2C1
CAN: NCCCN1CCc2ccccc2C1

CHIMERA MATCHES
MAESTRO MATCHES
==============================

5bzc

CHI: CC1(C)C[NH2+]Cc(c12)cccc2 lig_4WT_unprep_step2.mol2
MAE: CC1(C)CNCc(c12)cccc2

UNCHARGED
CHI: CC1(C)CNCc2ccccc21
MAE: CC1(C)CNCc(c12)cccc2

UNCHARGED CANONICAL/ISOMERIC
CHI: CC1(C)CNCc2ccccc21
MAE: CC1(C)CNCc2ccccc21

ORG: CC1(C)CNCC2=CC=CC=C21
CAN: CC1(C)CNCc2ccccc21

CHIMERA MATCHES
MAESTRO MATCHES
==============================

5f9b

CHI: C1CC(C)(C)C[C@H]([C@]12C(=O)O)C=3[C@@](C)(C[C@H]2O)[C@]4(C)[C@H](CC3)[C@]5(C)[C@@H](CC4)[C@](C)(CO)[C@@H](O)CC5 lig_5VN_unprep_step2.mol2
MAE: C1CC(C)(C)C[C@H]([C@]12C(=O)O)C=3[C@@](C)(C[C@H]2O)[C@@]4(C)[C@H](CC3)[C@]5(C)[C@@H](CC4)[C@](C)(CO)[C@@H](O)CC5

UNCHARGED
CHI: C1CC(C)(C)C[C@H]([C@]12C(=O)O)C=3[C@@](C)(C[C@H]2O)[C@]4(C)[C@H](CC3)[C@]5(C)[C@@H](CC4)[C@](C)(CO)[C@@H](O)CC5
MAE: C1CC(C)(C)C[C@H]([C@]12C(=O)O)C=3[C@@](C)(C[C@H]2O)[C@@]4(C)[C@H](CC3)[C@]5(C)[C@@H](CC4)[C@](C)(CO)[C@@H](O)CC5

UNCHARGED CANONICAL/ISOMERIC
CHI: CC1(C)CC[C@@]2(C(=O)O)[C@@H](C1)C1=CC[C@H]3[C@](C)(CC[C@H]4[C@](C)(CO)[C@@H](O)CC[C@]34C)[C@]1(C)C[C@H]2O
MAE: CC1(C)CC[C@@]2(C(=O)O)[C@@H](C1)C1=CC[C@@H]3[C@@]4(C)CC[C@H](O)[C@@](C)(CO)[C@@H]4CC[C@@]3(C)[C@]1(C)C[C@H]2O

ORG: [H][C@]1(O)CC[C@@]2(C)[C@@]([H])(CC[C@@]3(C)[C@]4(C)C[C@@]([H])(O)[C@@]5(C(=O)O)CCC(C)(C)C[C@@]5([H])C4=CC[C@@]32[H])[C@]1(C)CO
CAN: CC1(C)CC[C@@]2(C(=O)O)[C@@H](C1)C1=CC[C@@H]3[C@@]4(C)CC[C@H](O)[C@@](C)(CO)[C@@H]4CC[C@@]3(C)[C@]1(C)C[C@H]2O

CHIMERA DIFFERS
MAESTRO MATCHES
==============================

5iid

CHI: Oc1cccc(c12)oc(cc2=O)-c3cc(O)c(O)cc3 lig_6BK_unprep_step2.mol2
MAE: Oc1cccc(c12)oc(cc2=O)-c3cc(O)c(O)cc3

UNCHARGED
CHI: Oc1cccc(c12)oc(cc2=O)-c3cc(O)c(O)cc3
MAE: Oc1cccc(c12)oc(cc2=O)-c3cc(O)c(O)cc3

UNCHARGED CANONICAL/ISOMERIC
CHI: O=c1cc(-c2ccc(O)c(O)c2)oc2cccc(O)c12
MAE: O=c1cc(-c2ccc(O)c(O)c2)oc2cccc(O)c12

ORG: O=C1C=C(C2=CC(O)=C(O)C=C2)OC2=CC=CC(O)=C12
CAN: O=c1cc(-c2ccc(O)c(O)c2)oc2cccc(O)c12

CHIMERA MATCHES
MAESTRO MATCHES
==============================

4zk6

CHI: O=C(O)c1c(C(=O)[OH-])nccc1 lig_NTM_unprep_step2.mol2
MAE: O=C(O)c1c(C(=O)O)nccc1

UNCHARGED
[18:28:29] Explicit valence for atom # 7 O, 3, is greater than permitted

ERROR. SKIPPING MOL

=============================
1fcz

CHI: O=C(O)c1ccc(cc1)\C=C\C(=O)c(c2)ccc(c23)C(C)(C)CCC3(C)C lig_156_unprep_step2.mol2
MAE: O=C(O)c1ccc(cc1)\C=C\C(=O)c(c2)ccc(c23)C(C)(C)CCC3(C)C

UNCHARGED
CHI: O=C(O)c1ccc(cc1)\C=C\C(=O)c(c2)ccc(c23)C(C)(C)CCC3(C)C
MAE: O=C(O)c1ccc(cc1)\C=C\C(=O)c(c2)ccc(c23)C(C)(C)CCC3(C)C

UNCHARGED CANONICAL/ISOMERIC
CHI: CC1(C)CCC(C)(C)c2cc(C(=O)/C=C/c3ccc(C(=O)O)cc3)ccc21
MAE: CC1(C)CCC(C)(C)c2cc(C(=O)/C=C/c3ccc(C(=O)O)cc3)ccc21

ORG: [H]/C(C(=O)C1=CC2=C(C=C1)C(C)(C)CCC2(C)C)=C(/[H])C1=CC=C(C(=O)O)C=C1
CAN: CC1(C)CCC(C)(C)c2cc(C(=O)/C=C/c3ccc(C(=O)O)cc3)ccc21

CHIMERA MATCHES
MAESTRO MATCHES
==============================

5bzg

CHI: C1C[N@H+](C)Cc(c12)cccc2N lig_4X6_unprep_step2.mol2
MAE: C1CN(C)Cc(c12)cccc2N

UNCHARGED
CHI: CN1CCc2c(N)cccc2C1
MAE: C1CN(C)Cc(c12)cccc2N

UNCHARGED CANONICAL/ISOMERIC
CHI: CN1CCc2c(N)cccc2C1
MAE: CN1CCc2c(N)cccc2C1

ORG: CN1CCC2=C(C=CC=C2N)C1
CAN: CN1CCc2c(N)cccc2C1

CHIMERA MATCHES
MAESTRO MATCHES
==============================

5bzq

CHI: C1COCC[NH+]1CCC[N@H+](C2)CCc(c23)cccc3N lig_4XL_unprep_step2.mol2
MAE: C1COCCN1CCCN(C2)CCc(c23)cccc3N

UNCHARGED
CHI: Nc1cccc2c1CN(CCCN1CCOCC1)CC2
MAE: C1COCCN1CCCN(C2)CCc(c23)cccc3N

UNCHARGED CANONICAL/ISOMERIC
CHI: Nc1cccc2c1CN(CCCN1CCOCC1)CC2
MAE: Nc1cccc2c1CN(CCCN1CCOCC1)CC2

ORG: NC1=CC=CC2=C1CN(CCCN1CCOCC1)CC2
CAN: Nc1cccc2c1CN(CCCN1CCOCC1)CC2

CHIMERA MATCHES
MAESTRO MATCHES
==============================

5ii1

CHI: Cc1[nH]nc(c12)oc(=O)c3c2cccc3 lig_6BL_unprep_step2.mol2
MAE: Cc1[nH]nc(c12)oc(=O)c3c2cccc3

UNCHARGED
CHI: Cc1[nH]nc(c12)oc(=O)c3c2cccc3
MAE: Cc1[nH]nc(c12)oc(=O)c3c2cccc3

UNCHARGED CANONICAL/ISOMERIC
CHI: Cc1[nH]nc2oc(=O)c3ccccc3c12
MAE: Cc1[nH]nc2oc(=O)c3ccccc3c12

ORG: CC1=C2C3=CC=CC=C3C(=O)OC2=NN1
CAN: Cc1[nH]nc2oc(=O)c3ccccc3c12

CHIMERA MATCHES
MAESTRO MATCHES
==============================

5bzn

CHI: COCC[N@H+](C1)CCc(c12)cccc2N lig_4XD_unprep_step2.mol2
MAE: COCCN(C1)CCc(c12)cccc2N

UNCHARGED
CHI: COCCN1CCc2cccc(N)c2C1
MAE: COCCN(C1)CCc(c12)cccc2N

UNCHARGED CANONICAL/ISOMERIC
CHI: COCCN1CCc2cccc(N)c2C1
MAE: COCCN1CCc2cccc(N)c2C1

ORG: COCCN1CCC2=C(C1)C(N)=CC=C2
CAN: COCCN1CCc2cccc(N)c2C1

CHIMERA MATCHES
MAESTRO MATCHES
==============================

=============================
SUMMARY
=============================
CHIMERA MATCHES: 25
MAESTRO MATCHES: 26
MOLECULES ATTEMPTED: 29
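
For anyone re-checking these totals, here is a rough sketch of how the report above could be tallied and how the peptide-bond "failures" might be told apart from a real one. This is not compareToMaestro.py; the function names and the heuristic are assumptions for illustration. The heuristic treats a DIFFERS verdict as spurious when the tool's canonical SMILES equals the canonicalized input (the CAN line) once the `/` and `\` markers are removed, i.e. the tool only assigned a geometry to a double bond the input left unspecified.

```python
import re

PDB_ID = re.compile(r"[0-9][0-9a-z]{3}")  # e.g. 5a5d, 4zk6, 1fcz


def tally_report(report_text):
    """Count per-tool verdicts in report text laid out like the dump above."""
    result = {
        "attempted": 0,
        "skipped": [],
        "matches": {"CHIMERA": 0, "MAESTRO": 0},
        "differs": {"CHIMERA": [], "MAESTRO": []},
    }
    current_id = None
    for raw_line in report_text.splitlines():
        line = raw_line.strip()
        if PDB_ID.fullmatch(line):
            current_id = line
        elif line == "ERROR. SKIPPING MOL":
            result["skipped"].append(current_id)
        else:
            parts = line.split()
            # Verdict lines are exactly two words; "CHIMERA MATCHES: 25" in
            # the summary has three and is ignored.
            if (len(parts) == 2 and parts[0] in result["matches"]
                    and parts[1] in ("MATCHES", "DIFFERS")):
                tool, verdict = parts
                if tool == "CHIMERA":
                    result["attempted"] += 1  # one CHIMERA verdict per molecule
                if verdict == "MATCHES":
                    result["matches"][tool] += 1
                else:
                    result["differs"][tool].append(current_id)
    return result


def only_adds_bond_stereo(tool_smiles, reference_smiles):
    """True if tool_smiles equals the reference once '/' and '\\' are removed
    and the reference itself specifies no double-bond geometry."""
    if "/" in reference_smiles or "\\" in reference_smiles:
        return False
    stripped = tool_smiles.replace("/", "").replace("\\", "")
    return stripped == reference_smiles


# Verdict lines for four entries, copied from the report above.
SAMPLE = """\
4ym9

CHIMERA DIFFERS
MAESTRO DIFFERS
==============================

5f9b

CHIMERA DIFFERS
MAESTRO MATCHES
==============================

4zk6

ERROR. SKIPPING MOL

=============================
1fcz

CHIMERA MATCHES
MAESTRO MATCHES
==============================
"""

if __name__ == "__main__":
    print(tally_report(SAMPLE))
    # 4ym9: canonical CHI line vs. CAN line from the report above.
    print(only_adds_bond_stereo(r"CCC(CC)(CO)/C(O)=N/c1ccc(C)nc1",
                                "CCC(CC)(CO)C(O)=Nc1ccc(C)nc1"))
```

Applied to the full dump, a check like `only_adds_bond_stereo` would be expected to flag 5a5d, 4ym9, and 5bzo (the C(O)=N cases) while leaving 5f9b as a real chirality mismatch, since its CHI and CAN strings differ in their `@` markers rather than in `/` and `\`.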

mkgilson commented 7 years ago

Hi Jeff,

What is your plan for the open-source flow, then? Sorry if I missed this.

Is it worth emailing the chimera folks? I know Scooter Morris there, and he might be able to fix it.

Mike

On 7/28/2016 6:38 PM, j-wags wrote:

Summary: I've re-tested using the most current versions of the code. The new version of rdkit/chimera_proteinligprep.py gets 27/29 cases correct (it continues to pull a hydroxyl radical off of the 4zk6 ligand, and generates incorrect steroid chirality on 5f9b). Maestro now appears to get 29/29 ligand generation cases right. I'm closing this ticket for now and opening a more concise ticket for 4zk6 and 5f9b

