openforcefield / openff-toolkit

The Open Forcefield Toolkit provides implementations of the SMIRNOFF format, parameterization engine, and other tools. Documentation available at http://open-forcefield-toolkit.readthedocs.io
http://openforcefield.org
MIT License

Proton transfer in antechamber AM1 partial charge calculation #924

Open j-wags opened 3 years ago

j-wags commented 3 years ago

Describe the bug (originally posted by Bill Swope)

Alberto and I have been looking at dipole moments for our neutral benchmark molecules and have discovered several issues that we feel should be examined more closely. We compared b3lyp-d3bj dipole moments with those implied by the charge model of the openff-1.3.0 force field. One generally expects the force field dipole moment to be greater than the gas-phase value. However, looking at the magnitude of the dipole moment, about 30% of the molecules have force field charge models that are depolarized relative to b3lyp, with 8% significantly so (dipole moment magnitude less than that of b3lyp by more than 1 Debye). In terms of direction, about 8% of the molecules have force field dipole vectors that deviate from the b3lyp dipoles by at least 40 degrees. We saw a number of surprises with zwitterions: in one case, b3lyp has a dipole of 30 D, whereas the force field has 13 D. In this case the charge on a -(NH2)- group (protonated nitrogen) is negative instead of positive. The same molecule with a charge model from MMFF94X, as implemented in MOE, gives a dipole that is very close to the b3lyp result and has a positive charge on that -(NH2)- group.

Methodology details: The b3lyp dipole vectors for each conformer are in the corresponding JSON files (thanks David Dotson), as are the SMILES strings that are used with the openff-toolkit (thanks Jeff Wagner) to obtain the openff-1.3.0 charge models. We place these force field charges on the atomic sites of the quantum-optimized structure and compute the dipole vectors. Because the force field dipole and the b3lyp dipole are computed on the same molecular structure, we can compare them directly without having to align structures. The SDF file follows:

ZwitterIon.sdf.gz
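
As a rough illustration of the dipole calculation described above (not the script actually used for the benchmark), the sketch below sums q_i * r_i over the atomic sites, with charges in units of e and coordinates in bohr, and converts the result to Debye. The function name `point_charge_dipole` and the constant name are hypothetical; the conversion factor 1 e*bohr ≈ 2.541746 D is the standard one. The coordinate input may be a flat list, as in the `geometry` field of the JSON files, or an (N, 3) array.

```python
import numpy as np

# 1 e*bohr expressed in Debye (standard conversion factor).
E_BOHR_TO_DEBYE = 2.541746


def point_charge_dipole(charges_e, coords_bohr):
    """Dipole vector (Debye) of point charges placed on atomic sites.

    charges_e   : N partial charges in units of e
    coords_bohr : N*3 flat list or (N, 3) array of coordinates in bohr
    """
    q = np.asarray(charges_e, dtype=float)
    r = np.asarray(coords_bohr, dtype=float).reshape(-1, 3)
    if q.shape[0] != r.shape[0]:
        raise ValueError("number of charges and number of atoms differ")
    # mu = sum_i q_i * r_i, converted from e*bohr to Debye
    return E_BOHR_TO_DEBYE * (q[:, None] * r).sum(axis=0)


# Toy check: +1 e and -1 e separated by 1 bohr along x.
dipole = point_charge_dipole([1.0, -1.0], [1.0, 0.0, 0.0, 0.0, 0.0, 0.0])
print(dipole, np.linalg.norm(dipole))
```

For a neutral molecule the result does not depend on the choice of origin, which is why the same coordinates can be used for both the force field and the quantum dipole.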

j-wags commented 3 years ago

I'm debugging this, starting by checking the OpenEye (OE) and AmberTools (AT) outputs for consistency. To get the charges using the OpenEye and AmberTools backends:

from openff.toolkit.topology import Molecule
mol = Molecule.from_file('ZwitterIon.sdf')

from openff.toolkit.utils.toolkits import OpenEyeToolkitWrapper, AmberToolsToolkitWrapper
import copy
mol.assign_partial_charges(partial_charge_method='AM1BCC', toolkit_registry=OpenEyeToolkitWrapper())
oe_charges = copy.deepcopy(mol.partial_charges)
mol.assign_partial_charges(partial_charge_method='AM1BCC', toolkit_registry=AmberToolsToolkitWrapper())
at_charges = copy.deepcopy(mol.partial_charges)

from simtk import unit
tot_oe_charge = 0 * unit.elementary_charge
tot_at_charge = 0 * unit.elementary_charge

for atom, oe_chg, at_chg in zip(mol.atoms, oe_charges, at_charges):
    print(f'{atom.element.symbol}\t{oe_chg}\t{at_chg}')
    tot_oe_charge += oe_chg
    tot_at_charge += at_chg
print(mol.total_charge, tot_oe_charge, tot_at_charge)

returns

C   -0.0794299989938736 e   -0.082 e
C   -0.14079999923706055 e  -0.155 e
C   -0.10553999990224838 e  -0.092 e
C   -0.15211999416351318 e  -0.149 e
H   0.12118999660015106 e   0.12 e
C   -0.12355999648571014 e  -0.1206 e
C   -0.16052000224590302 e  -0.1613 e
C   0.9110100269317627 e    0.9102 e
C   -0.11248999834060669 e  -0.1084 e
C   -0.11248999834060669 e  -0.1084 e
C   0.11087000370025635 e   0.1108 e
C   0.11087000370025635 e   0.1108 e
C   0.0017099999822676182 e -0.0034 e
N   -0.7648699879646301 e   -0.763 e
O   -0.8147000074386597 e   -0.8113 e
O   -0.8147000074386597 e   -0.8113 e
H   0.17139999568462372 e   0.167 e
H   0.10400000214576721 e   0.093 e
H   0.13977999985218048 e   0.156 e
H   0.09928999841213226 e   0.09395 e
H   0.09928999841213226 e   0.09395 e
H   0.09928999841213226 e   0.09395 e
H   0.09928999841213226 e   0.09395 e
H   0.09339000284671783 e   0.09345 e
H   0.09339000284671783 e   0.09345 e
H   0.09339000284671783 e   0.09345 e
H   0.09339000284671783 e   0.09345 e
H   0.04786999896168709 e   0.0547 e
H   0.44589999318122864 e   0.4468 e
H   0.44589999318122864 e   0.4468 e
0.0 e 2.8405338525772095e-08 e -2.220446049250313e-16 e

So, these look consistent.
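
To put a number on "consistent", one option is to report the largest per-atom difference between the two charge sets alongside the two net charges. This is only a sketch on plain floats (no unit objects); the helper name `compare_charges` is hypothetical, and the example uses just the first three atoms from the table above, rounded.

```python
def compare_charges(charges_a, charges_b):
    """Return (max |a_i - b_i|, sum(a), sum(b)) for two equal-length charge lists."""
    if len(charges_a) != len(charges_b):
        raise ValueError("charge lists have different lengths")
    max_diff = max(abs(a - b) for a, b in zip(charges_a, charges_b))
    return max_diff, sum(charges_a), sum(charges_b)


# First three atoms from the OpenEye and AmberTools columns above (rounded).
oe = [-0.07943, -0.14080, -0.10554]
at = [-0.082, -0.155, -0.092]
max_diff, tot_oe, tot_at = compare_charges(oe, at)
print(f"max |OE - AT| = {max_diff:.4f} e")
```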

To visualize the charges:

from rdkit import Chem
from rdkit.Chem.Draw import IPythonConsole
from rdkit.Chem import Draw
from rdkit.Chem.Draw.MolDrawing import DrawingOptions
mol_copy = copy.deepcopy(mol)
mol_copy._conformers = None
rdmol = mol_copy.to_rdkit()
for rdatom, chg in zip(rdmol.GetAtoms(), at_charges):
    rdatom.SetProp('atomLabel', f'{chg/unit.elementary_charge:.2f}')
opts = DrawingOptions()
opts.atomLabelFontSize=5
Draw.MolToImage(rdmol, 
                 options=opts
               )

[Image: 2D depiction of the molecule with the AmberTools charges drawn as atom labels]

jchodera commented 3 years ago

Converting the SDF file to PDB and running it through antechamber manually also gives the same negative charge on the nitrogen.

This appears to be a true failure of AM1-BCC, rather than an issue with our plumbing accidentally misperceiving the bond orders. I suggest we add this molecule to a set of "interesting molecules" for charge model construction and validation for partial charge model studies.

BillSwope commented 3 years ago

Jeff, first off, these are not the charges I am seeing and we should resolve this right away. My charges come from extracting a smiles string from the json file that was produced during the openff-benchmark optimize execute step. This smiles string is passed to openff-toolkit using the code snippet you gave me.

import sys
import json
from openff.toolkit.topology import Molecule
from openff.toolkit.typing.engines.smirnoff import ForceField

print("Opening json file ", sys.argv[1])

with open(sys.argv[1]) as json_file:
    data = json.load(json_file)

nstep = len(data['trajectory'])
print('Number of steps in trajectory ', nstep)

# Dipole at the end of the last step in the optimization trajectory
last_step = data['trajectory'][nstep-1]
qcvars = last_step['extras']['qcvars']
print('Dipole at last step ', qcvars['CURRENT DIPOLE X'], ',',
      qcvars['CURRENT DIPOLE Y'], ',', qcvars['CURRENT DIPOLE Z'])

# Data from the end of the optimization trajectory
print('Formula  ', last_step['molecule']['name'])
print('Charge   ', last_step['molecule']['molecular_charge'])
print('Smiles   ', last_step['molecule']['extras']['canonical_isomeric_explicit_hydrogen_mapped_smiles'])

final_mol = data['final_molecule']
print('FinalMolecule Formula ', final_mol['name'])
print('FinalMolecule Charge  ', final_mol['molecular_charge'])
print('Smiles   ', final_mol['extras']['canonical_isomeric_explicit_hydrogen_mapped_smiles'])

atname = final_mol['symbols']
atcoor = final_mol['geometry']
natom = len(atname)
print('Number of atoms ', natom)
print('Coordinates')
for i in range(natom):
    print(atname[i], '      ', atcoor[3*i], atcoor[3*i+1], atcoor[3*i+2])

smiles_string = final_mol['extras']['canonical_isomeric_explicit_hydrogen_mapped_smiles']
molchg = final_mol['molecular_charge']
if molchg != 0.0:
    print('Leaving without charges because molecule is an ion.  Charge = ', molchg)
else:
    mol = Molecule.from_smiles(smiles_string)
    print('mol : ', mol)
    ff = ForceField('openff-1.3.0.offxml')
    # Named omm_system so that the imported sys module is not shadowed
    omm_system, returned_top = ff.create_openmm_system(mol.to_topology(), return_topology=True)
    chges = [*returned_top.reference_molecules][0].partial_charges
    print('Charges: ', chges)
    print('charges.type', type(chges))
    ncharge = len(chges)
    print('Number atoms ', natom, ' Number charges ', ncharge)
    print('Coordinates and charges')
    # Caution: chges is not guaranteed to be in the same atom order as the geometry
    for i in range(ncharge):
        print(atname[i], '      ', atcoor[3*i], atcoor[3*i+1], atcoor[3*i+2], '  ', chges[i])

jchodera commented 3 years ago

@BillSwope : What SMILES string (or JSON file) are you using as input?

BillSwope commented 3 years ago

The json file is made at the end of the openff-benchmark optimize execute step. I would attach one, but it is extremely long. The code snippet I posted reads that json file, extracts the appropriate smiles string and feeds it to the toolkit.

jchodera commented 3 years ago

@BillSwope Is it possible to extract just the entry for this problematic zwitterionic molecule from the JSON?

BillSwope commented 3 years ago

John, here is the output of the above mentioned program. The charge model is obtained entirely from the smiles string.

Warning: Unable to load toolkit 'OpenEye Toolkit'. The Open Force Field Toolkit does not require the OpenEye Toolkits, and can use RDKit/AmberTools instead. However, if you have a valid license for the OpenEye Toolkits, consider installing them for faster performance and additional file format support: https://docs.eyesopen.com/toolkits/python/quickstart-python/linuxosx.html OpenEye offers free Toolkit licenses for academics: https://www.eyesopen.com/academic-licensing
Opening json file  4-compute-qm/b3lyp-d3bj/dzvp/TST-00000-00.json
Number of steps in trajectory  9
Dipole at last step  25.147027929565652 , -2.901440936395268 , 19.57486037496916
Formula   C12H15NO2
Charge    0.0
Smiles    [c:1]1([H:17])[c:4]([H:5])[c:2]([H:18])[c:7]([C:13]2([H:28])[C:9]([H:20])([H:21])[C:11]([H:24])([H:25])[N+:14]([H:29])([H:30])[C:12]([H:26])([H:27])[C:10]2([H:22])[H:23])[c:3]([H:19])[c:6]1[C:8]([O-:15])=[O:16]
FinalMolecule Formula  C12H15NO2
FinalMolecule Charge   0.0
Smiles    [c:1]1([H:17])[c:4]([H:5])[c:2]([H:18])[c:7]([C:13]2([H:28])[C:9]([H:20])([H:21])[C:11]([H:24])([H:25])[N+:14]([H:29])([H:30])[C:12]([H:26])([H:27])[C:10]2([H:22])[H:23])[c:3]([H:19])[c:6]1[C:8]([O-:15])=[O:16]
Number of atoms  30
Coordinates
C        -5.8180005715160545 -2.754407170052259 -7.031274905600262
C        -2.1805608988088774 -3.131237345997304 -4.260248705882403
C        -6.372296328156536 -2.5597854584836366 -2.521585178853355
C        -3.212494388817736 -3.067487910172059 -6.694053195497136
H        -1.9745161091404255 -3.265534871090415 -8.324484929182868
C        -7.42555867632331 -2.499465459121885 -4.94549404298432
C        -3.77633731581622 -2.8723879862776758 -2.1524728594830087
C        -10.323203278902813 -2.1630080389734183 -5.260860595890214
C        -1.5949087061120817 -5.544494143668891 1.1624019489938078
C        -0.8159847588510307 -0.8365711469150879 1.0243015161739586
C        -0.5978395905101674 -5.644106982573854 3.860375075092555
C        0.18468923492524755 -0.9123418804474829 3.7220817967449413
C        -2.7496325502938244 -2.9513698689784955 0.5159050501009312
N        1.297514974123022 -3.5134481903149726 4.251481071284254
O        -11.517354166282267 -1.9628660860940876 -3.207634196065447
O        -11.085846264936013 -2.138769086159081 -7.508954407799398
H        -6.676327188667401 -2.7008426252503615 -8.89590987857077
H        -0.1474051593992076 -3.381600045805474 -4.029550681152021
H        -7.683972998014265 -2.3540090127947226 -0.9497364516188664
H        -0.07468597677338693 -6.010673362217021 -0.16869772572604835
H        -3.02985036405744 -7.016100106377006 0.9465957733825625
H        -1.7005611014816853 1.0051878506818175 0.7129547796606667
H        0.763545098578647 -0.9598331965037347 -0.31326285054893627
H        -2.093873894832454 -5.330490954499666 5.249578346130126
H        0.38137301019087444 -7.406422355139553 4.309513975377626
H        1.6827844033158252 0.46373062666056264 4.080611240808984
H        -1.3198526683753837 -0.6491204818342753 5.112528384742475
H        -4.351343918771352 -2.6491216856353197 1.7996680307345598
H        2.0130596672437306 -3.57903972636637 6.050282765334052
H        2.7888357264625943 -3.795704499598268 3.044048560293557
Mol is  Molecule with name '' and SMILES '[H][c]1[c]([H])[c]([C](=[O])[O-])[c]([H])[c]([C]2([H])[C]([H])([H])[C]([H])([H])[N+]([H])([H])[C]([H])([H])[C]2([H])[H])[c]1[H]'
mol :  Molecule with name '' and SMILES '[H][c]1[c]([H])[c]([C](=[O])[O-])[c]([H])[c]([C]2([H])[C]([H])([H])[C]([H])([H])[N+]([H])([H])[C]([H])([H])[C]2([H])[H])[c]1[H]'
Charges:  [-0.071    0.158   -0.139    0.141   -0.101    0.136   -0.0793  -0.0144
  0.0517  -0.0819   0.05095  0.05095  0.1623   0.0332   0.0332  -1.031
  0.3988   0.3988   0.1623   0.0332   0.0332  -0.0819   0.05095  0.05095
 -0.099    0.154   -0.1746   0.8922  -0.5603  -0.5603 ] e
charges.type <class 'simtk.unit.quantity.Quantity'>
Number atoms  30  Number charges  30
Coordinates and charges
C        -5.8180005715160545 -2.754407170052259 -7.031274905600262    -0.071 e
C        -2.1805608988088774 -3.131237345997304 -4.260248705882403    0.158 e
C        -6.372296328156536 -2.5597854584836366 -2.521585178853355    -0.139 e
C        -3.212494388817736 -3.067487910172059 -6.694053195497136    0.141 e
H        -1.9745161091404255 -3.265534871090415 -8.324484929182868    -0.101 e
C        -7.42555867632331 -2.499465459121885 -4.94549404298432    0.136 e
C        -3.77633731581622 -2.8723879862776758 -2.1524728594830087    -0.0793 e
C        -10.323203278902813 -2.1630080389734183 -5.260860595890214    -0.0144 e
C        -1.5949087061120817 -5.544494143668891 1.1624019489938078    0.0517 e
C        -0.8159847588510307 -0.8365711469150879 1.0243015161739586    -0.0819 e
C        -0.5978395905101674 -5.644106982573854 3.860375075092555    0.05095 e
C        0.18468923492524755 -0.9123418804474829 3.7220817967449413    0.05095 e
C        -2.7496325502938244 -2.9513698689784955 0.5159050501009312    0.1623 e
N        1.297514974123022 -3.5134481903149726 4.251481071284254    0.0332 e
O        -11.517354166282267 -1.9628660860940876 -3.207634196065447    0.0332 e
O        -11.085846264936013 -2.138769086159081 -7.508954407799398    -1.031 e
H        -6.676327188667401 -2.7008426252503615 -8.89590987857077    0.3988 e
H        -0.1474051593992076 -3.381600045805474 -4.029550681152021    0.3988 e
H        -7.683972998014265 -2.3540090127947226 -0.9497364516188664    0.1623 e
H        -0.07468597677338693 -6.010673362217021 -0.16869772572604835    0.0332 e
H        -3.02985036405744 -7.016100106377006 0.9465957733825625    0.0332 e
H        -1.7005611014816853 1.0051878506818175 0.7129547796606667    -0.0819 e
H        0.763545098578647 -0.9598331965037347 -0.31326285054893627    0.05095 e
H        -2.093873894832454 -5.330490954499666 5.249578346130126    0.05095 e
H        0.38137301019087444 -7.406422355139553 4.309513975377626    -0.099 e
H        1.6827844033158252 0.46373062666056264 4.080611240808984    0.154 e
H        -1.3198526683753837 -0.6491204818342753 5.112528384742475    -0.1746 e
H        -4.351343918771352 -2.6491216856353197 1.7996680307345598    0.8922 e
H        2.0130596672437306 -3.57903972636637 6.050282765334052    -0.5603 e
H        2.7888357264625943 -3.795704499598268 3.044048560293557    -0.5603 e

Note that these charges are not in the correct order and had to be permuted. (David Dotson has since given me code to extract them in the correct order.)
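
For anyone hitting the same ordering problem, here is a minimal sketch of one way to do the permutation. It is not David's code. It assumes the charges come back in the order the atoms appear in the SMILES string (which is what the output above looks like) and that the atom-map indices in the mapped SMILES (`:1`, `:2`, ...) give the 1-based position of each atom in the geometry. The function name `reorder_by_atom_map` and the water charges in the example are hypothetical. The toolkit also provides `Molecule.from_mapped_smiles`, which is intended to build the molecule in map-index order directly; it is not used here.

```python
import re


def reorder_by_atom_map(mapped_smiles, values_in_smiles_order):
    """Reorder per-atom values from SMILES order to atom-map (geometry) order."""
    # Atom-map indices in the order the atoms are written in the SMILES.
    map_indices = [int(m) for m in re.findall(r":(\d+)\]", mapped_smiles)]
    n = len(map_indices)
    if n != len(values_in_smiles_order):
        raise ValueError("number of mapped atoms and number of values differ")
    if sorted(map_indices) != list(range(1, n + 1)):
        raise ValueError("atom-map indices must be exactly 1..N")
    ordered = [None] * n
    for map_index, value in zip(map_indices, values_in_smiles_order):
        ordered[map_index - 1] = value
    return ordered


# Illustrative values only: a mapped water SMILES with made-up charges.
print(reorder_by_atom_map("[H:3][O:1][H:2]", [0.4, -0.8, 0.4]))
```

Applied to the mapped SMILES and the `Charges:` array printed above, this puts -1.031 on atom 14 (the nitrogen) and 0.3988 on atoms 29 and 30.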

Here are the coordinates (bohr) and charges (in units of e) in the correct order. At the end are the dipole moments (b3lyp and openff-1.3.0).

C        -5.81800  -2.75441  -7.03127    -0.07100
C        -2.18056  -3.13124  -4.26025    -0.10100
C        -6.37230  -2.55979  -2.52159    -0.09900
C        -3.21249  -3.06749  -6.69405    -0.13900
H        -1.97452  -3.26553  -8.32448     0.14100
C        -7.42556  -2.49947  -4.94549    -0.17460
C        -3.77634  -2.87239  -2.15247    -0.07930
C       -10.32320  -2.16301  -5.26086     0.89220
C        -1.59491  -5.54449   1.16240    -0.08190
C        -0.81598  -0.83657   1.02430    -0.08190
C        -0.59784  -5.64411   3.86038     0.16230
C         0.18469  -0.91234   3.72208     0.16230
C        -2.74963  -2.95137   0.51591    -0.01440
N         1.29751  -3.51345   4.25148    -1.03100
O       -11.51735  -1.96287  -3.20763    -0.56030
O       -11.08585  -2.13877  -7.50895    -0.56030
H        -6.67633  -2.70084  -8.89591     0.15800
H        -0.14741  -3.38160  -4.02955     0.13600
H        -7.68397  -2.35401  -0.94974     0.15400
H        -0.07469  -6.01067  -0.16870     0.05095
H        -3.02985  -7.01610   0.94660     0.05095
H        -1.70056   1.00519   0.71295     0.05095
H         0.76355  -0.95983  -0.31326     0.05095
H        -2.09387  -5.33049   5.24958     0.03320
H         0.38137  -7.40642   4.30951     0.03320
H         1.68278   0.46373   4.08061     0.03320
H        -1.31985  -0.64912   5.11253     0.03320
H        -4.35134  -2.64912   1.79967     0.05170
H         2.01306  -3.57904   6.05028     0.39880
H         2.78884  -3.79570   3.04405     0.39880
Net charge   -0.00200
Error - charge mismatch Charge    0.00000 Qtot   -0.00200
QMDipole(Debye)   25.15  -2.90  19.57 ; Mag   32.00 ; 00000-00
MMDipole(Debye)   11.43  -1.35   5.70 ; Mag   12.84 ; 00000-00
Angular deviation  11.35 (degrees); |delta D|  19.57 ; delta|D| -19.16 ; 00000-00
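
The three comparison numbers on the last line can be recomputed from the two dipole vectors. The sketch below assumes they are defined as the angle between the vectors, the magnitude of the difference vector, and the difference of the magnitudes (MM minus QM); the helper name `compare_dipoles` is hypothetical. With the rounded vectors printed above it gives values close to those reported.

```python
import math


def compare_dipoles(qm, mm):
    """Return (angle in degrees, |mm - qm|, |mm| - |qm|) for two 3-vectors."""
    qm_mag = math.sqrt(sum(x * x for x in qm))
    mm_mag = math.sqrt(sum(x * x for x in mm))
    dot = sum(a * b for a, b in zip(qm, mm))
    # Clamp to [-1, 1] to guard against rounding just outside acos's domain.
    cos_angle = max(-1.0, min(1.0, dot / (qm_mag * mm_mag)))
    angle_deg = math.degrees(math.acos(cos_angle))
    diff_mag = math.sqrt(sum((b - a) ** 2 for a, b in zip(qm, mm)))
    return angle_deg, diff_mag, mm_mag - qm_mag


qm_dipole = (25.15, -2.90, 19.57)  # QMDipole(Debye) from above
mm_dipole = (11.43, -1.35, 5.70)   # MMDipole(Debye) from above
angle, diff_mag, delta_mag = compare_dipoles(qm_dipole, mm_dipole)
print(f"angle {angle:.2f} deg; |delta D| {diff_mag:.2f}; delta|D| {delta_mag:.2f}")
```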
BillSwope commented 3 years ago

Here is the output of conda list for the openff-toolkit environment I used for the generation of the charges:

# packages in environment at /gstore/home/swopew/.conda/envs/openff-toolkit:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                       1_gnu    conda-forge
amberlite                 16.0                     pypi_0    pypi
ambertools                20.15                    pypi_0    pypi
arpack                    3.7.0                hdefa2d7_2    conda-forge
astunparse                1.6.3              pyhd8ed1ab_0    conda-forge
blosc                     1.21.0               h9c3ff4c_0    conda-forge
boost                     1.74.0           py39h5472131_3    conda-forge
boost-cpp                 1.74.0               hc6e9bd1_2    conda-forge
bzip2                     1.0.8                h7f98852_4    conda-forge
c-ares                    1.17.1               h7f98852_1    conda-forge
ca-certificates           2020.12.5            ha878542_0    conda-forge
cairo                     1.16.0            h6cf1ce9_1008    conda-forge
certifi                   2020.12.5        py39hf3d152e_1    conda-forge
cudatoolkit               11.2.2               he111cf0_8    conda-forge
curl                      7.76.1               h979ede3_0    conda-forge
cycler                    0.10.0                     py_2    conda-forge
cython                    0.29.23          py39he80948d_0    conda-forge
decorator                 5.0.7              pyhd8ed1ab_0    conda-forge
fftw                      3.3.9           nompi_h74d3f13_101    conda-forge
fontconfig                2.13.1            hba837de_1005    conda-forge
freetype                  2.10.4               h0708190_1    conda-forge
gettext                   0.19.8.1          h0b5b191_1005    conda-forge
greenlet                  1.0.0            py39he80948d_0    conda-forge
hdf4                      4.2.13            h10796ff_1005    conda-forge
hdf5                      1.10.6          nompi_h6a2412b_1114    conda-forge
icu                       68.1                 h58526e2_0    conda-forge
jpeg                      9d                   h36c2ea0_0    conda-forge
kiwisolver                1.3.1            py39h1a9c180_1    conda-forge
krb5                      1.17.2               h926e7f8_0    conda-forge
lcms2                     2.12                 hddcbb42_0    conda-forge
ld_impl_linux-64          2.35.1               hea4e1c9_2    conda-forge
libblas                   3.9.0                8_openblas    conda-forge
libcblas                  3.9.0                8_openblas    conda-forge
libcurl                   7.76.1               hc4aaa36_0    conda-forge
libedit                   3.1.20191231         he28a2e2_2    conda-forge
libev                     4.33                 h516909a_1    conda-forge
libffi                    3.3                  h58526e2_2    conda-forge
libgcc-ng                 9.3.0               h2828fa1_19    conda-forge
libgfortran-ng            9.3.0               hff62375_19    conda-forge
libgfortran5              9.3.0               hff62375_19    conda-forge
libglib                   2.68.1               h3e27bee_0    conda-forge
libgomp                   9.3.0               h2828fa1_19    conda-forge
libiconv                  1.16                 h516909a_0    conda-forge
liblapack                 3.9.0                8_openblas    conda-forge
libnetcdf                 4.7.4           nompi_h56d31a8_107    conda-forge
libnghttp2                1.43.0               h812cca2_0    conda-forge
libopenblas               0.3.12          pthreads_h4812303_1    conda-forge
libpng                    1.6.37               h21135ba_2    conda-forge
libssh2                   1.9.0                ha56f1ee_6    conda-forge
libstdcxx-ng              9.3.0               h6de172a_19    conda-forge
libtiff                   4.2.0                hdc55705_0    conda-forge
libuuid                   2.32.1            h7f98852_1000    conda-forge
libwebp-base              1.2.0                h7f98852_2    conda-forge
libxcb                    1.13              h7f98852_1003    conda-forge
libxml2                   2.9.10               h72842e0_4    conda-forge
llvm-openmp               11.1.0               h4bd325d_1    conda-forge
lz4-c                     1.9.3                h9c3ff4c_0    conda-forge
lzo                       2.10              h516909a_1000    conda-forge
matplotlib-base           3.4.1            py39h2fa2bec_0    conda-forge
mdtraj                    1.9.5            py39h138c130_1    conda-forge
mkl                       2020.4             h726a3e6_304    conda-forge
mmpbsa-py                 16.0                     pypi_0    pypi
mock                      4.0.3            py39hf3d152e_1    conda-forge
ncurses                   6.2                  h58526e2_4    conda-forge
netcdf-fortran            4.5.3           nompi_h996563d_103    conda-forge
networkx                  2.5                        py_0    conda-forge
numexpr                   2.7.3            py39hde0f152_0    conda-forge
numpy                     1.20.2           py39hdbf815f_0    conda-forge
ocl-icd                   2.3.0                h7f98852_0    conda-forge
ocl-icd-system            1.0.0                         1    conda-forge
olefile                   0.46               pyh9f0ad1d_1    conda-forge
openff-forcefields        1.3.0              pyh44b312d_0    conda-forge
openff-toolkit            0.9.1              pyhd8ed1ab_2    conda-forge
openff-toolkit-base       0.9.1              pyhd8ed1ab_2    conda-forge
openjpeg                  2.4.0                hf7af979_0    conda-forge
openmm                    7.5.0            py39hd1fbf24_6    conda-forge
openssl                   1.1.1k               h7f98852_0    conda-forge
packaging                 20.9               pyh44b312d_0    conda-forge
packmol-memgen            1.1.0rc0                 pypi_0    pypi
pandas                    1.2.4            py39hde0f152_0    conda-forge
parmed                    3.4.1            py39he80948d_0    conda-forge
pcre                      8.44                 he1b5a44_0    conda-forge
pdb4amber                 1.7.dev0                 pypi_0    pypi
perl                      5.32.0               h36c2ea0_0    conda-forge
pillow                    8.1.2            py39hf95b381_1    conda-forge
pip                       21.0.1             pyhd8ed1ab_0    conda-forge
pixman                    0.40.0               h36c2ea0_0    conda-forge
pthread-stubs             0.4               h36c2ea0_1001    conda-forge
pycairo                   1.20.0           py39hedcb9fc_1    conda-forge
pyparsing                 2.4.7              pyh9f0ad1d_0    conda-forge
pytables                  3.6.1            py39hf6dc253_3    conda-forge
python                    3.9.2           hffdb5ce_0_cpython    conda-forge
python-dateutil           2.8.1                      py_0    conda-forge
python_abi                3.9                      1_cp39    conda-forge
pytraj                    2.0.5                    pypi_0    pypi
pytz                      2021.1             pyhd8ed1ab_0    conda-forge
rdkit                     2021.03.1        py39hccf6a74_0    conda-forge
readline                  8.0                  he28a2e2_2    conda-forge
reportlab                 3.5.67           py39he59360d_0    conda-forge
sander                    16.0                     pypi_0    pypi
scipy                     1.6.2            py39hee8e79c_0    conda-forge
setuptools                49.6.0           py39hf3d152e_3    conda-forge
six                       1.15.0             pyh9f0ad1d_0    conda-forge
smirnoff99frosst          1.1.0              pyh44b312d_0    conda-forge
snappy                    1.1.8                he1b5a44_3    conda-forge
sqlalchemy                1.4.9            py39h3811e60_0    conda-forge
sqlite                    3.35.4               h74cdb3f_0    conda-forge
tk                        8.6.10               h21135ba_1    conda-forge
tornado                   6.1              py39h3811e60_1    conda-forge
tzdata                    2021a                he74cb21_0    conda-forge
wheel                     0.36.2             pyhd3deb0d_0    conda-forge
xmltodict                 0.12.0                     py_0    conda-forge
xorg-kbproto              1.0.7             h7f98852_1002    conda-forge
xorg-libice               1.0.10               h7f98852_0    conda-forge
xorg-libsm                1.2.3             hd9c2040_1000    conda-forge
xorg-libx11               1.7.0                h7f98852_0    conda-forge
xorg-libxau               1.0.9                h7f98852_0    conda-forge
xorg-libxdmcp             1.1.3                h7f98852_0    conda-forge
xorg-libxext              1.3.4                h7f98852_1    conda-forge
xorg-libxrender           0.9.10            h7f98852_1003    conda-forge
xorg-libxt                1.2.1                h7f98852_2    conda-forge
xorg-renderproto          0.11.1            h7f98852_1002    conda-forge
xorg-xextproto            7.3.0             h7f98852_1002    conda-forge
xorg-xproto               7.0.31            h7f98852_1007    conda-forge
xz                        5.2.5                h516909a_1    conda-forge
zlib                      1.2.11            h516909a_1010    conda-forge
zstd                      1.4.9                ha95c52a_0    conda-forge

BillSwope commented 3 years ago

Here is the output of conda list for the environment in which I ran the openff-benchmark optimize execute operation:

# packages in environment at /gstore/home/swopew/.conda/envs/openff-benchmark-optimization:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                      1_llvm    conda-forge
abseil-cpp                20210324.0           h9c3ff4c_0    conda-forge
alembic                   1.5.8              pyhd8ed1ab_0    conda-forge
alsa-lib                  1.2.3                h516909a_0    conda-forge
amberlite                 16.0                     pypi_0    pypi
ambertools                17.0                     pypi_0    pypi
ambit                     0.5.1                hbda204a_0    psi4/label/dev
argcomplete               1.12.3             pyhd8ed1ab_2    conda-forge
argon2-cffi               20.1.0           py37h5e8e339_2    conda-forge
arpack                    3.7.0                hc6cf775_2    conda-forge
arrow-cpp                 3.0.0           py37he4eac6b_11_cpu    conda-forge
astunparse                1.6.3              pyhd8ed1ab_0    conda-forge
async_generator           1.10                       py_0    conda-forge
attrs                     20.3.0             pyhd3deb0d_0    conda-forge
aws-c-cal                 0.4.5                h76129ab_8    conda-forge
aws-c-common              0.5.2                h7f98852_0    conda-forge
aws-c-event-stream        0.2.7                h6bac3ce_1    conda-forge
aws-c-io                  0.9.1                ha5b09cb_1    conda-forge
aws-checksums             0.1.11               h99e32c3_3    conda-forge
aws-sdk-cpp               1.8.151              hceb1b1e_1    conda-forge
backcall                  0.2.0              pyh9f0ad1d_0    conda-forge
backports                 1.0                        py_2    conda-forge
backports.functools_lru_cache 1.6.4              pyhd8ed1ab_0    conda-forge
basis_set_exchange        0.8.13                     py_0    conda-forge
bcrypt                    3.2.0            py37h5e8e339_1    conda-forge
blas                      1.0                         mkl    conda-forge
bleach                    3.3.0              pyh44b312d_0    conda-forge
blosc                     1.21.0               h9c3ff4c_0    conda-forge
boost                     1.72.0           py37h48f8a5e_1    conda-forge
boost-cpp                 1.72.0               h9d3c048_4    conda-forge
brotli                    1.0.9                h9c3ff4c_4    conda-forge
brotlipy                  0.7.0           py37h5e8e339_1001    conda-forge
bson                      0.5.9                      py_0    conda-forge
bzip2                     1.0.8                h7f98852_4    conda-forge
c-ares                    1.17.1               h7f98852_1    conda-forge
ca-certificates           2020.12.5            ha878542_0    conda-forge
cached-property           1.5.2                hd8ed1ab_1    conda-forge
cached_property           1.5.2              pyha770c72_1    conda-forge
cairo                     1.16.0            h6cf1ce9_1008    conda-forge
certifi                   2020.12.5        py37h89c1867_1    conda-forge
cffi                      1.14.5           py37hc58025e_0    conda-forge
chardet                   4.0.0            py37h89c1867_1    conda-forge
chemps2                   1.8.9                hbda204a_2    psi4/label/dev
click                     7.1.2              pyh9f0ad1d_0    conda-forge
cloudpickle               1.6.0                      py_0    conda-forge
codecov                   2.1.11             pyhd3deb0d_0    conda-forge
coverage                  5.5              py37h5e8e339_0    conda-forge
cryptography              3.4.7            py37h5d9358c_0    conda-forge
curl                      7.76.1               h979ede3_1    conda-forge
cycler                    0.10.0                     py_2    conda-forge
cython                    0.29.23          py37hcd2ae1e_0    conda-forge
cytoolz                   0.11.0           py37h5e8e339_3    conda-forge
dask-core                 2021.4.1           pyhd8ed1ab_0    conda-forge
dask-jobqueue             0.7.2              pyhd8ed1ab_1    conda-forge
dbus                      1.13.6               h48d8840_2    conda-forge
decorator                 5.0.7              pyhd8ed1ab_0    conda-forge
defusedxml                0.7.1              pyhd8ed1ab_0    conda-forge
dftd3                     3.2.1                h84218bc_2    psi4/label/dev
distributed               2021.4.1         py37h89c1867_0    conda-forge
dkh                       1.2                  h173d85e_2    psi4/label/dev
double-conversion         3.1.5                h9c3ff4c_2    conda-forge
entrypoints               0.3             pyhd8ed1ab_1003    conda-forge
expat                     2.3.0                h9c3ff4c_0    conda-forge
fftw                      3.3.8           nompi_hfc0cae8_1114    conda-forge
fftw3f                    3.3.4                         2    omnia
fontconfig                2.13.1            hba837de_1005    conda-forge
freetype                  2.10.4               h0708190_1    conda-forge
fsspec                    2021.4.0           pyhd8ed1ab_0    conda-forge
gau2grid                  2.0.3                h0dc56a0_0    psi4/label/dev
gcp                       2.0.2                he991be0_2    psi4/label/dev
gdma                      2.2.6                h0e1e685_6    psi4/label/dev
geometric                 0.9.7.2                    py_0    conda-forge
gettext                   0.19.8.1          h0b5b191_1005    conda-forge
gflags                    2.2.2             he1b5a44_1004    conda-forge
glib                      2.68.1               h9c3ff4c_0    conda-forge
glib-tools                2.68.1               h9c3ff4c_0    conda-forge
glog                      0.4.0                h49b9bf7_3    conda-forge
grpc-cpp                  1.37.0               h36de60a_1    conda-forge
gst-plugins-base          1.18.4               hf529b03_2    conda-forge
gstreamer                 1.18.4               h76c114f_2    conda-forge
h5py                      3.2.1           nompi_py37ha3df211_100    conda-forge
hdf4                      4.2.13            h10796ff_1005    conda-forge
hdf5                      1.10.6          nompi_h7c3c948_1111    conda-forge
heapdict                  1.0.1                      py_0    conda-forge
icu                       68.1                 h58526e2_0    conda-forge
idna                      2.10               pyh9f0ad1d_0    conda-forge
importlib-metadata        4.0.1            py37h89c1867_0    conda-forge
importlib_metadata        4.0.1                hd8ed1ab_0    conda-forge
importlib_resources       5.1.2            py37h89c1867_0    conda-forge
iniconfig                 1.1.1              pyh9f0ad1d_0    conda-forge
intel-openmp              2021.2.0           h06a4308_610  
ipykernel                 5.5.3            py37h085eea5_0    conda-forge
ipython                   7.22.0           py37h085eea5_0    conda-forge
ipython_genutils          0.2.0                      py_1    conda-forge
ipywidgets                7.6.3              pyhd3deb0d_0    conda-forge
jedi                      0.18.0           py37h89c1867_2    conda-forge
jinja2                    2.11.3             pyh44b312d_0    conda-forge
jpeg                      9d                   h36c2ea0_0    conda-forge
jsonschema                3.2.0              pyhd8ed1ab_3    conda-forge
jupyter_client            6.1.12             pyhd8ed1ab_0    conda-forge
jupyter_core              4.7.1            py37h89c1867_0    conda-forge
jupyterlab_pygments       0.1.2              pyh9f0ad1d_0    conda-forge
jupyterlab_widgets        1.0.0              pyhd8ed1ab_1    conda-forge
kiwisolver                1.3.1            py37h2527ec5_1    conda-forge
krb5                      1.17.2               h926e7f8_0    conda-forge
lcms2                     2.12                 hddcbb42_0    conda-forge
ld_impl_linux-64          2.35.1               hea4e1c9_2    conda-forge
libblas                   3.9.0                     8_mkl    conda-forge
libcblas                  3.9.0                     8_mkl    conda-forge
libclang                  11.1.0          default_ha53f305_0    conda-forge
libcurl                   7.76.1               hc4aaa36_1    conda-forge
libedit                   3.1.20191231         he28a2e2_2    conda-forge
libev                     4.33                 h516909a_1    conda-forge
libevent                  2.1.10               hcdb4288_3    conda-forge
libffi                    3.3                  h58526e2_2    conda-forge
libgcc-ng                 9.3.0               h2828fa1_19    conda-forge
libgfortran-ng            7.5.0               h14aa051_19    conda-forge
libgfortran4              7.5.0               h14aa051_19    conda-forge
libglib                   2.68.1               h3e27bee_0    conda-forge
libgomp                   9.3.0               h2828fa1_19    conda-forge
libiconv                  1.16                 h516909a_0    conda-forge
libint                    1.2.1                hb4a4fd4_6    psi4/label/dev
liblapack                 3.9.0                     8_mkl    conda-forge
libllvm11                 11.1.0               hf817b99_2    conda-forge
libnetcdf                 4.7.4           nompi_h56d31a8_107    conda-forge
libnghttp2                1.43.0               h812cca2_0    conda-forge
libogg                    1.3.4                h7f98852_1    conda-forge
libopus                   1.3.1                h7f98852_1    conda-forge
libpng                    1.6.37               h21135ba_2    conda-forge
libpq                     13.2                 hfd2b0eb_2    conda-forge
libprotobuf               3.15.8               h780b84a_0    conda-forge
libsodium                 1.0.18               h36c2ea0_1    conda-forge
libssh2                   1.9.0                ha56f1ee_6    conda-forge
libstdcxx-ng              9.3.0               h6de172a_19    conda-forge
libthrift                 0.14.1               he6d91bd_1    conda-forge
libtiff                   4.2.0                hdc55705_1    conda-forge
libutf8proc               2.6.1                h7f98852_0    conda-forge
libuuid                   2.32.1            h7f98852_1000    conda-forge
libvorbis                 1.3.7                h9c3ff4c_0    conda-forge
libwebp-base              1.2.0                h7f98852_2    conda-forge
libxc                     4.3.4                h6e990d7_2    conda-forge
libxcb                    1.13              h7f98852_1003    conda-forge
libxkbcommon              1.0.3                he3ba5ed_0    conda-forge
libxml2                   2.9.10               h72842e0_4    conda-forge
libxslt                   1.1.33               h15afd5d_2    conda-forge
llvm-openmp               11.1.0               h4bd325d_1    conda-forge
locket                    0.2.0                      py_2    conda-forge
lxml                      4.6.3            py37h77fd288_0    conda-forge
lz4-c                     1.9.3                h9c3ff4c_0    conda-forge
lzo                       2.10              h516909a_1000    conda-forge
mako                      1.1.4              pyh44b312d_0    conda-forge
markupsafe                1.1.1            py37h5e8e339_3    conda-forge
matplotlib                3.4.1            py37h89c1867_0    conda-forge
matplotlib-base           3.4.1            py37hdd32ed1_0    conda-forge
mdtraj                    1.9.5            py37hd0d7e5a_1    conda-forge
mistune                   0.8.4           py37h5e8e339_1003    conda-forge
mkl                       2020.4             h726a3e6_304    conda-forge
mkl-service               2.3.0            py37h8f50634_2    conda-forge
mmpbsa-py                 16.0                     pypi_0    pypi
mock                      4.0.3            py37h89c1867_1    conda-forge
more-itertools            8.7.0              pyhd8ed1ab_1    conda-forge
msgpack-python            1.0.2            py37h2527ec5_1    conda-forge
mysql-common              8.0.23               ha770c72_1    conda-forge
mysql-libs                8.0.23               h935591d_1    conda-forge
nbclient                  0.5.3              pyhd8ed1ab_0    conda-forge
nbconvert                 6.0.7            py37h89c1867_3    conda-forge
nbformat                  5.1.3              pyhd8ed1ab_0    conda-forge
ncurses                   6.2                  h58526e2_4    conda-forge
nest-asyncio              1.5.1              pyhd8ed1ab_0    conda-forge
netcdf-fortran            4.5.3           nompi_hfef6a68_101    conda-forge
networkx                  2.4                        py_1    conda-forge
nglview                   3.0.1              pyh59e0f4d_0    conda-forge
notebook                  6.3.0              pyha770c72_1    conda-forge
nspr                      4.30                 h9c3ff4c_0    conda-forge
nss                       3.64                 hb5efdd6_0    conda-forge
numexpr                   2.7.3            py37hdc94413_0    conda-forge
numpy                     1.20.2           py37h038b26d_0    conda-forge
olefile                   0.46               pyh9f0ad1d_1    conda-forge
openff-benchmark          2021.04.09.0               py_0    omnia/label/benchmark
openff-qcsubmit           0.2.1                      py_1    omnia/label/rc
openforcefield            0.8.4              pyh39e3cac_0    omnia
openforcefields           1.3.0                      py_0    omnia
openjpeg                  2.4.0                hf7af979_0    conda-forge
openmm                    7.4.2           py37_cuda101_rc_1    omnia
openmmforcefields         0.8.0                    py37_0    omnia
openssl                   1.1.1k               h7f98852_0    conda-forge
orc                       1.6.7                heec2584_1    conda-forge
packaging                 20.9               pyh44b312d_0    conda-forge
packmol-memgen            1.1.0rc0                 pypi_0    pypi
pandas                    1.2.4            py37h219a48f_0    conda-forge
pandoc                    2.12                 h7f98852_0    conda-forge
pandocfilters             1.4.2                      py_1    conda-forge
parmed                    at20RC5+54.g5702a232fe.dirty          pypi_0    pypi
parquet-cpp               1.5.1                         2    conda-forge
parso                     0.8.2              pyhd8ed1ab_0    conda-forge
partd                     1.2.0              pyhd8ed1ab_0    conda-forge
patsy                     0.5.1                      py_0    conda-forge
pcmsolver                 1.2.1.1          py37h6d17ec8_2    psi4/label/dev
pcre                      8.44                 he1b5a44_0    conda-forge
pdb4amber                 1.7.dev0                 pypi_0    pypi
perl                      5.32.0               h36c2ea0_0    conda-forge
pexpect                   4.8.0              pyh9f0ad1d_2    conda-forge
pickleshare               0.7.5                   py_1003    conda-forge
pillow                    8.1.2            py37h4600e1f_1    conda-forge
pint                      0.17               pyhd8ed1ab_0    conda-forge
pip                       21.1               pyhd8ed1ab_0    conda-forge
pixman                    0.40.0               h36c2ea0_0    conda-forge
plotly                    4.14.3             pyh44b312d_0    conda-forge
pluggy                    0.13.1           py37h89c1867_4    conda-forge
postgresql                13.2                 h6303168_2    conda-forge
prometheus_client         0.10.1             pyhd8ed1ab_0    conda-forge
prompt-toolkit            3.0.18             pyha770c72_0    conda-forge
psi4                      1.4a2.dev1058+670a850  py37hcd683b5_0    psi4/label/dev
psutil                    5.8.0            py37h5e8e339_1    conda-forge
psycopg2                  2.8.6            py37h5e8e339_2    conda-forge
pthread-stubs             0.4               h36c2ea0_1001    conda-forge
ptyprocess                0.7.0              pyhd3deb0d_0    conda-forge
py                        1.10.0             pyhd3deb0d_0    conda-forge
py-cpuinfo                8.0.0              pyhd8ed1ab_0    conda-forge
pyarrow                   3.0.0           py37he2832ee_11_cpu    conda-forge
pycairo                   1.20.0           py37hfff247e_1    conda-forge
pycparser                 2.20               pyh9f0ad1d_2    conda-forge
pydantic                  1.8.1            py37h5e8e339_1    conda-forge
pygments                  2.8.1              pyhd8ed1ab_0    conda-forge
pyopenssl                 20.0.1             pyhd8ed1ab_0    conda-forge
pyparsing                 2.4.7              pyh9f0ad1d_0    conda-forge
pyqt                      5.12.3           py37h89c1867_7    conda-forge
pyqt-impl                 5.12.3           py37he336c9b_7    conda-forge
pyqt5-sip                 4.19.18          py37hcd2ae1e_7    conda-forge
pyqtchart                 5.12             py37he336c9b_7    conda-forge
pyqtwebengine             5.12.1           py37he336c9b_7    conda-forge
pyrsistent                0.17.3           py37h5e8e339_2    conda-forge
pysocks                   1.7.1            py37h89c1867_3    conda-forge
pytables                  3.6.1            py37h0c4f3e0_3    conda-forge
pytest                    6.2.3            py37h89c1867_0    conda-forge
pytest-cov                2.11.1             pyh44b312d_0    conda-forge
python                    3.7.10          hffdb5ce_100_cpython    conda-forge
python-dateutil           2.8.1                      py_0    conda-forge
python-editor             1.0.4                      py_0    conda-forge
python_abi                3.7                     1_cp37m    conda-forge
pytraj                    2.0.5                    pypi_0    pypi
pytz                      2021.1             pyhd8ed1ab_0    conda-forge
pyyaml                    5.4.1            py37h5e8e339_0    conda-forge
pyzmq                     22.0.3           py37h336d617_1    conda-forge
qcelemental               0.19.0             pyhd8ed1ab_0    conda-forge
qcengine                  0.17.0                     py_0    conda-forge
qcfractal                 0.15.1           py37h89c1867_0    conda-forge
qcfractal-core            0.15.1           py37h89c1867_0    conda-forge
qcportal                  0.14.0                     py_1    conda-forge
qt                        5.12.9               hda022c4_4    conda-forge
rdkit                     2020.09.3        py37h400b6df_0    conda-forge
re2                       2021.04.01           h9c3ff4c_0    conda-forge
readline                  8.1                  h46c0cb4_0    conda-forge
reportlab                 3.5.67           py37h69800bb_0    conda-forge
requests                  2.25.1             pyhd3deb0d_0    conda-forge
retrying                  1.3.3                      py_2    conda-forge
s2n                       1.0.0                h9b69904_0    conda-forge
sander                    16.0                     pypi_0    pypi
scipy                     1.5.3            py37h8911b10_0    conda-forge
seaborn                   0.11.1               hd8ed1ab_1    conda-forge
seaborn-base              0.11.1             pyhd8ed1ab_1    conda-forge
send2trash                1.5.0                      py_0    conda-forge
setuptools                49.6.0           py37h89c1867_3    conda-forge
simint                    0.7                  h642920c_1    psi4/label/dev
six                       1.15.0             pyh9f0ad1d_0    conda-forge
smirnoff99frosst          1.1.0              pyh44b312d_0    conda-forge
snappy                    1.1.8                he1b5a44_3    conda-forge
sortedcontainers          2.3.0              pyhd8ed1ab_0    conda-forge
sqlalchemy                1.3.23           py37h5e8e339_0    conda-forge
sqlite                    3.35.5               h74cdb3f_0    conda-forge
statsmodels               0.12.2           py37h902c9e0_0    conda-forge
tblib                     1.7.0              pyhd8ed1ab_0    conda-forge
terminado                 0.9.4            py37h89c1867_0    conda-forge
testpath                  0.4.4                      py_0    conda-forge
tinydb                    4.4.0              pyh44b312d_0    conda-forge
tk                        8.6.10               h21135ba_1    conda-forge
toml                      0.10.2             pyhd8ed1ab_0    conda-forge
toolz                     0.11.1                     py_0    conda-forge
tornado                   6.1              py37h5e8e339_1    conda-forge
torsiondrive              0.9.8.1                    py_0    conda-forge
tqdm                      4.60.0             pyhd8ed1ab_0    conda-forge
traitlets                 5.0.5                      py_0    conda-forge
typing-extensions         3.7.4.3                       0    conda-forge
typing_extensions         3.7.4.3                    py_0    conda-forge
tzcode                    2021a                h7f98852_1    conda-forge
tzdata                    2021a                he74cb21_0    conda-forge
urllib3                   1.26.4             pyhd8ed1ab_0    conda-forge
wcwidth                   0.2.5              pyh9f0ad1d_2    conda-forge
webencodings              0.5.1                      py_1    conda-forge
wheel                     0.36.2             pyhd3deb0d_0    conda-forge
widgetsnbextension        3.5.1            py37h89c1867_4    conda-forge
xmltodict                 0.12.0                     py_0    conda-forge
xorg-kbproto              1.0.7             h7f98852_1002    conda-forge
xorg-libice               1.0.10               h7f98852_0    conda-forge
xorg-libsm                1.2.3             hd9c2040_1000    conda-forge
xorg-libx11               1.7.0                h7f98852_0    conda-forge
xorg-libxau               1.0.9                h7f98852_0    conda-forge
xorg-libxdmcp             1.1.3                h7f98852_0    conda-forge
xorg-libxext              1.3.4                h7f98852_1    conda-forge
xorg-libxrender           0.9.10            h7f98852_1003    conda-forge
xorg-libxt                1.2.1                h7f98852_2    conda-forge
xorg-renderproto          0.11.1            h7f98852_1002    conda-forge
xorg-xextproto            7.3.0             h7f98852_1002    conda-forge
xorg-xproto               7.0.31            h7f98852_1007    conda-forge
xz                        5.2.5                h516909a_1    conda-forge
yaml                      0.2.5                h516909a_0    conda-forge
zeromq                    4.3.4                h9c3ff4c_0    conda-forge
zict                      2.0.0                      py_0    conda-forge
zipp                      3.4.1              pyhd8ed1ab_0    conda-forge
zlib                      1.2.11            h516909a_1010    conda-forge
zstd                      1.4.9                ha95c52a_0    conda-forge
j-wags commented 3 years ago

Excellent. Thanks, @BillSwope -- Reviewing this now.

j-wags commented 3 years ago

Ok! I'm able to reproduce the problem. I don't know what's going on, but a workaround while I dig deeper would be to switch mol=Molecule.from_smiles(smiles_string) to mol=Molecule.from_mapped_smiles(smiles_string). This switches the final charges from the ones that you were getting to the ones that I was getting.

I'm really confused by the different charges depending on HOW the molecule is loaded from the mapped SMILES. This is almost certainly a bug and I'll keep digging into it.

Also, to get the quantities in a sane/unitless format, the following code snippet may help:

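from simtk import unit

# offmol: a Molecule whose partial charges have already been assigned
for atom in offmol.atoms:
    # Strip the simtk unit to get a plain float in elementary charges
    charge = atom.partial_charge.value_in_unit(unit.elementary_charge)
    print(charge)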
j-wags commented 3 years ago

Ok -- Here are the results of my investigation from today:

Minimal reproducing case of bad behavior

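# Import path for openforcefield 0.8.4 (the version in the environment above);
# newer releases use openff.toolkit.topology instead.
from openforcefield.topology import Molecule

smiles = '[c:1]1([H:17])[c:4]([H:5])[c:2]([H:18])[c:7]([C:13]2([H:28])[C:9]([H:20])([H:21])[C:11]([H:24])([H:25])[N+:14]([H:29])([H:30])[C:12]([H:26])([H:27])[C:10]2([H:22])[H:23])[c:3]([H:19])[c:6]1[C:8]([O-:15])=[O:16]'
mol1 = Molecule.from_smiles(smiles)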

mol1.assign_partial_charges(partial_charge_method='AM1BCC')
print(mol1.partial_charges)

[-0.071 0.158 -0.139 0.141 -0.101 0.136 -0.0793 -0.0144 0.0517 -0.0819 0.05095 0.05095 0.1623 0.0332 0.0332 -1.031 0.3988 0.3988 0.1623 0.0332 0.0332 -0.0819 0.05095 0.05095 -0.099 0.154 -0.1746 0.8922 -0.5603 -0.5603 ] e

I'll call the -1.031 (the 16th entry in the array) the "bad value", and we'll see where else it comes up.
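To pin down which atom carries the bad value, one can scan the printed charges for the most negative entry. This is a minimal sketch in plain Python over the numbers pasted above (no toolkit calls); the variable names are mine and not part of any API.

```python
# Charges printed for mol1 above (Molecule.from_smiles ordering), in units of e.
mol1_charges = [
    -0.071, 0.158, -0.139, 0.141, -0.101, 0.136, -0.0793, -0.0144, 0.0517,
    -0.0819, 0.05095, 0.05095, 0.1623, 0.0332, 0.0332, -1.031, 0.3988, 0.3988,
    0.1623, 0.0332, 0.0332, -0.0819, 0.05095, 0.05095, -0.099, 0.154, -0.1746,
    0.8922, -0.5603, -0.5603,
]

# 0-based index of the most negative charge in the list.
bad_index = min(range(len(mol1_charges)), key=mol1_charges.__getitem__)
bad_value = mol1_charges[bad_index]
print(bad_index, bad_value)
```

The index is the 0-based position in mol1's atom ordering. In the per-element listing further down this thread, that position belongs to the nitrogen.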

Loading using from_mapped_smiles doesn't give the "bad value"

mol2 = Molecule.from_mapped_smiles(smiles)

mol2.assign_partial_charges(partial_charge_method='AM1BCC')
print(mol2.partial_charges)

[-0.082 -0.155 -0.092 -0.149 0.12 -0.1206 -0.1613 0.9102 -0.1084 -0.1084 0.1108 0.1108 -0.0034 -0.763 -0.8113 -0.8113 0.167 0.093 0.156 0.09395 0.09395 0.09395 0.09395 0.09345 0.09345 0.09345 0.09345 0.0547 0.4468 0.4468 ] e

No instances of the bad value here, and the charges are less extreme (this is the same result as loading from SDF)

So the graphs are different?

print(mol1.to_smiles() == mol2.to_smiles())
print(mol1.is_isomorphic_with(mol2))

True
True

Nope.

So, something with atom maps?

Well, Molecule.from_smiles stores the atom maps in offmol.properties, so maybe something in the toolchain is misinterpreting those during conf gen or charge calc?

mol3 = Molecule.from_smiles(smiles)
del mol3.properties['atom_map']
mol3.assign_partial_charges(partial_charge_method='AM1BCC')
print(mol3.partial_charges)

[-0.071 0.158 -0.139 0.141 -0.101 0.136 -0.0793 -0.0144 0.0517 -0.0819 0.05095 0.05095 0.1623 0.0332 0.0332 -1.031 0.3988 0.3988 0.1623 0.0332 0.0332 -0.0819 0.05095 0.05095 -0.099 0.154 -0.1746 0.8922 -0.5603 -0.5603 ] e

This has the bad value (-1.031), even though atom maps have been erased before we initiated the AM1BCC calculation.

So, does the order of the atoms change the calculated charge?

Let's take the "good molecule" mol2 and remap its atoms to be in the same order as mol1:

atom_map = dict([(j-1,i) for i,j in mol1.properties['atom_map'].items()])
mol4 = mol2.remap(atom_map)
mol4.assign_partial_charges(partial_charge_method='AM1BCC')
print(mol4.partial_charges)

[-0.071 0.158 -0.139 0.141 -0.101 0.136 -0.0793 -0.0144 0.0517 -0.0819 0.05095 0.05095 0.1623 0.0332 0.0332 -1.031 0.3988 0.3988 0.1623 0.0332 0.0332 -0.0819 0.05095 0.05095 -0.099 0.154 -0.1746 0.8922 -0.5603 -0.5603 ] e

And now the remapped copy of mol2 (mol4) gives the bad value as well.
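The remapping step can be illustrated without the toolkit. This is a toy sketch under my own assumptions: `atom_map` stands in for mol1.properties['atom_map'] (atom index to 1-based map number), the mapped molecule's atoms are taken to be stored in map-number order, and `remap_list` is a hypothetical helper that mimics what I understand Molecule.remap to do with a current-index-to-new-index dictionary.

```python
def remap_list(items, current_to_new):
    """Return a reordered copy where items[cur] lands at position current_to_new[cur]."""
    out = [None] * len(items)
    for cur, new in current_to_new.items():
        out[new] = items[cur]
    return out

# Toy stand-in for mol1.properties['atom_map']: atom index -> 1-based map number.
atom_map = {0: 3, 1: 1, 2: 2}

# mol1-like ordering: atom i carries map number atom_map[i].
mol1_order = ["atom:3", "atom:1", "atom:2"]
# mol2-like ordering (as from_mapped_smiles): sorted by map number.
mol2_order = ["atom:1", "atom:2", "atom:3"]

# Same inversion as in the snippet above: (map number - 1) -> mol1 index.
current_to_new = {j - 1: i for i, j in atom_map.items()}

mol4_order = remap_list(mol2_order, current_to_new)
print(mol4_order)
```

After the remap, the toy "mol4" has the same atom order as the toy "mol1", which is the situation in which the bad value reappears.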

So, at this point I have two thoughts. Either:

j-wags commented 3 years ago

Reposting more breadcrumbs from @BillSwope from earlier today:

Looking at the charges, there are some interesting differences. For the 6-membered ring with the nitrogen in it, one set of charges has symmetrized charges on 8 of the hydrogens (all 8 are 0.094), but the other one has symmetrized them differently, as two sets of four (4 have 0.051, and the other 4 have 0.0332). These even sum to very different numbers. So, the charge symmetrization strategies are different in the two cases.
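The difference in sums can be checked directly from the two AM1BCC outputs printed earlier. This is a rough sketch in plain Python; picking out these eight values as the ring CH2 hydrogens (map numbers H:20 to H:27) is my reading of the atom maps, so treat the selection as an assumption.

```python
# Charges (in e) of the eight CH2 hydrogens on the nitrogen-containing ring,
# copied from the two AM1BCC outputs printed earlier in this thread.
good_h = [0.09395, 0.09395, 0.09395, 0.09395,
          0.09345, 0.09345, 0.09345, 0.09345]  # from_mapped_smiles (mol2)
bad_h = [0.05095, 0.05095, 0.0332, 0.0332,
         0.0332, 0.0332, 0.05095, 0.05095]     # from_smiles (mol1)

for label, charges in (("from_mapped_smiles", good_h), ("from_smiles", bad_h)):
    distinct = sorted(set(charges))  # how many symmetry classes show up
    total = sum(charges)             # total charge carried by the 8 hydrogens
    print(f"{label}: distinct values {distinct}, sum {total:.4f} e")
```

The two sums differ by roughly 0.4 e, so the hydrogens are not just grouped differently; a different amount of charge sits on them in the two cases.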

j-wags commented 3 years ago

Using the OpenEye backend to calculate partial charges for mol1 and mol2 above gives very close values (within 0.006 e of each other).

from simtk import unit
for atom_idx in range(mol1.n_atoms):
    mol1_chg = mol1.atoms[atom_idx].partial_charge.value_in_unit(unit.elementary_charge)
    mol2_chg = mol2.atoms[mol1.properties['atom_map'][atom_idx]-1].partial_charge.value_in_unit(unit.elementary_charge)
    symbol = mol1.atoms[atom_idx].element.symbol
    print(f'{symbol} {mol1_chg:.3f} {mol2_chg:.3f}')

C -0.078 -0.079
H 0.165 0.171
C -0.154 -0.152
H 0.121 0.121
C -0.142 -0.141
H 0.105 0.104
C -0.161 -0.161
C -0.000 0.002
H 0.048 0.048
C -0.110 -0.112
H 0.099 0.099
H 0.099 0.099
C 0.110 0.111
H 0.093 0.093
H 0.093 0.093
N -0.765 -0.765
H 0.446 0.446
H 0.446 0.446
C 0.110 0.111
H 0.093 0.093
H 0.093 0.093
C -0.110 -0.112
H 0.099 0.099
H 0.099 0.099
C -0.106 -0.106
H 0.141 0.140
C -0.118 -0.124
C 0.911 0.911
O -0.815 -0.815
O -0.815 -0.815
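As a quick check on the "within 0.006 e" figure, the largest per-atom difference can be computed from the printed values. This is a small sketch in plain Python over the rows copied from the output above; `rows` and `max_diff` are names I made up.

```python
# (element, mol1 charge, mol2 charge) rows copied from the output above, in e.
rows = [
    ("C", -0.078, -0.079), ("H", 0.165, 0.171), ("C", -0.154, -0.152),
    ("H", 0.121, 0.121), ("C", -0.142, -0.141), ("H", 0.105, 0.104),
    ("C", -0.161, -0.161), ("C", -0.000, 0.002), ("H", 0.048, 0.048),
    ("C", -0.110, -0.112), ("H", 0.099, 0.099), ("H", 0.099, 0.099),
    ("C", 0.110, 0.111), ("H", 0.093, 0.093), ("H", 0.093, 0.093),
    ("N", -0.765, -0.765), ("H", 0.446, 0.446), ("H", 0.446, 0.446),
    ("C", 0.110, 0.111), ("H", 0.093, 0.093), ("H", 0.093, 0.093),
    ("C", -0.110, -0.112), ("H", 0.099, 0.099), ("H", 0.099, 0.099),
    ("C", -0.106, -0.106), ("H", 0.141, 0.140), ("C", -0.118, -0.124),
    ("C", 0.911, 0.911), ("O", -0.815, -0.815), ("O", -0.815, -0.815),
]

# Largest absolute per-atom difference between the two orderings.
max_diff = max(abs(a - b) for _, a, b in rows)
print(f"max |mol1 - mol2| = {max_diff:.3f} e")
```

Note that the nitrogen gets -0.765 in both orderings here, which is close to the "good" AmberTools value rather than the -1.031 bad value.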

j-wags commented 3 years ago

Here are the RDKit-generated conformers for mol1 and mol2, which should be the same as the ones used for charge calculation. The unsaturated ring flips, but otherwise there's nothing unusual here.

[Image: RDKit-generated conformers for mol1 and mol2]
SimonBoothroyd commented 3 years ago

@cbayly13 may like this one... @j-wags the flipped ring actually seems to make a big difference. If you look at the final minimized structure that sqm uses to compute the final partial charges (note that this is AmberTools, not OpenEye, and AmberTools does not do a restrained minimization), you can see that a proton transfer has occurred:

Initial structure:

[Image: initial structure]

Minimized structure:

[Image: minimized structure]

SimonBoothroyd commented 3 years ago

the order of the atoms changes the resulting partial charges

This can be true - see #926.

BillSwope commented 3 years ago

Simon, Nice work!

j-wags commented 3 years ago

Great find, @SimonBoothroyd!

So fixing #926 will make it so that different orderings get the same output, which is good for reproducibility. But it is still free to pick an order that leads to a proton transfer. So the more proximal issue is the proton transfer itself. I'll update this issue title+description, since we've been talking about Antechamber/sqm AM1 restraints for a long time without an issue to track it.
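One way to flag this kind of failure automatically would be to compare bonded-hydrogen distances in the minimized structure against the input connectivity. The following is only a sketch with made-up coordinates, not something Antechamber, sqm, or the toolkit provides: the function name `stretched_bonds`, the 1.3 angstrom cutoff, and the three-atom toy geometry are all assumptions for illustration.

```python
import math

def distance(p, q):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def stretched_bonds(coords, bonds, cutoff=1.3):
    """Return the bonds (i, j) whose atoms are farther apart than cutoff (angstrom)."""
    return [(i, j) for i, j in bonds if distance(coords[i], coords[j]) > cutoff]

# Hypothetical toy geometry: N (0), one of its H atoms (1), a carboxylate O (2).
bonds = [(0, 1)]  # connectivity of the input molecule: only N-H is bonded
initial = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.6, 0.0, 0.0)]
minimized = [(0.0, 0.0, 0.0), (1.6, 0.0, 0.0), (2.6, 0.0, 0.0)]  # H moved toward O

print(stretched_bonds(initial, bonds))    # no bond exceeds the cutoff
print(stretched_bonds(minimized, bonds))  # the N-H bond is flagged
```

A check along these lines would only detect the transfer after the fact; it does not replace restraining the minimization.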