Assigning partial charges for polymers/parameterizing polymers

kcreemer commented 4 years ago

**Describe the bug**
To study the interaction between a polymer and an API, we want to parameterize polymers ranging from 50 to 600 monomer units in order to run molecular dynamics calculations with openff. However, once a molecule exceeds roughly 300 atoms, calculating the partial charges is no longer possible.

**To Reproduce**
The system is set up from SMILES strings using openbabel (and pybel). The call that fails is:

off_polymer_system = off_forcefield.create_openmm_system(off_polymer_topology)

**Output**

Exception                                 Traceback (most recent call last)

      6
      7 off_polymer_topology = Topology.from_openmm(polymer.topology,unique_molecules=uniq_molecules)
----> 8 off_polymer_system = off_forcefield.create_openmm_system(off_polymer_topology)

~/anaconda3/envs/myenv/lib/python3.7/site-packages/openforcefield/typing/engines/smirnoff/forcefield.py in create_openmm_system(self, topology, **kwargs)
   1136         # Add forces and parameters to the System
   1137         for parameter_handler in parameter_handlers:
-> 1138             parameter_handler.create_force(system, topology, **kwargs)
   1139
   1140         # Let force Handlers do postprocessing

~/anaconda3/envs/myenv/lib/python3.7/site-packages/openforcefield/typing/engines/smirnoff/parameters.py in create_force(self, system, topology, **kwargs)
   2937                 toolkit_registry = kwargs.get('toolkit_registry', GLOBAL_TOOLKIT_REGISTRY)
   2938                 temp_mol.generate_conformers(n_conformers=10, toolkit_registry=toolkit_registry)
-> 2939                 temp_mol.compute_partial_charges_am1bcc(toolkit_registry=toolkit_registry)
   2940
   2941                 # Assign charges to relevant atoms

~/anaconda3/envs/myenv/lib/python3.7/site-packages/openforcefield/topology/molecule.py in compute_partial_charges_am1bcc(self, toolkit_registry)
   2012             charges = toolkit_registry.call(
   2013                 'compute_partial_charges_am1bcc',
-> 2014                 self
   2015             )
   2016         elif isinstance(toolkit_registry, ToolkitWrapper):

~/anaconda3/envs/myenv/lib/python3.7/site-packages/openforcefield/utils/toolkits.py in call(self, method_name, *args, **kwargs)
   3023                 method = getattr(toolkit, method_name)
   3024                 try:
-> 3025                     return method(*args, **kwargs)
   3026                 except NotImplementedError:
   3027                     pass

~/anaconda3/envs/myenv/lib/python3.7/site-packages/openforcefield/utils/toolkits.py in compute_partial_charges_am1bcc(self, molecule)
   1271
   1272         if quacpac_status is False:
-> 1273             raise Exception('Unable to assign charges')
   1274
   1275         # Extract and return charges

Exception: Unable to assign charges

**Computing environment:**

_anaconda_depends 2019.03 py37_0 _libgcc_mutex 0.1 conda_forge conda-forge _openmp_mutex 4.5 1_llvm 
conda-forge alabaster 0.7.12 py_0 conda-forge amberlite 16.0 pypi_0 pypi ambertools 17.0 pypi_0 pypi anaconda custom py37_1 anaconda-client 1.7.2 py_0 conda-forge anaconda-project 0.8.4 py_0 asn1crypto 1.3.0 py37_0 conda-forge astroid 2.3.3 py37_1 conda-forge astropy 4.0 py37h516909a_1 conda-forge atomicwrites 1.3.0 py_0 conda-forge attrs 19.3.0 py_0 conda-forge babel 2.8.0 py_0 conda-forge backcall 0.1.0 py_0 conda-forge backports 1.0 py_2 conda-forge backports.os 0.1.1 py37_1001 conda-forge backports.shutil_get_terminal_size 1.0.0 py_3 conda-forge beautifulsoup4 4.8.2 py37_0 conda-forge bitarray 1.2.1 py37h516909a_0 conda-forge bkcharts 0.2 py37_0 blas 2.14 openblas conda-forge bleach 3.1.1 py_0 conda-forge blosc 1.17.1 he1b5a44_0 conda-forge bokeh 1.4.0 py37_0 conda-forge boost 1.70.0 py37h9de70de_1 conda-forge boost-cpp 1.70.0 h8e57a91_2 conda-forge boto 2.49.0 py_0 conda-forge bottleneck 1.3.1 py37hc1659b7_1 conda-forge bson 0.5.9 py_0 conda-forge bzip2 1.0.8 h516909a_2 conda-forge ca-certificates 2020.1.1 0 main cairo 1.16.0 hfb77d84_1002 conda-forge certifi 2019.11.28 py37_0 conda-forge cffi 1.14.0 py37h2e261b9_0 chardet 3.0.4 py37_1003 conda-forge click 7.0 py_0 conda-forge cloudpickle 1.3.0 py_0 conda-forge clyent 1.2.2 py_1 conda-forge codecov 2.0.15 py_1 conda-forge colorama 0.4.3 py_0 conda-forge conda 4.6.14 py37_0 contextlib2 0.6.0.post1 py_0 conda-forge coverage 5.0.3 py37h516909a_0 conda-forge cryptography 2.8 py37h72c5cf5_1 conda-forge curl 7.68.0 hbc83047_0 cycler 0.10.0 py_2 conda-forge cython 0.29.15 py37he1b5a44_0 conda-forge cytoolz 0.10.1 py37h516909a_0 conda-forge dask 2.11.0 py_0 conda-forge dask-core 2.11.0 py_0 conda-forge dbus 1.13.12 h746ee38_0 decorator 4.4.1 py_0 conda-forge defusedxml 0.6.0 py_0 conda-forge dill 0.3.1.1 pypi_0 pypi distributed 2.11.0 py37_0 conda-forge docutils 0.16 py37_0 conda-forge entrypoints 0.3 py37_1000 conda-forge et_xmlfile 1.0.1 py_1001 conda-forge expat 2.2.9 he1b5a44_2 conda-forge fastcache 1.1.0 
py37h516909a_0 conda-forge fftw 3.3.8 nompi_h7f3a6c3_1110 conda-forge fftw3f 3.3.4 2 omnia filelock 3.0.12 py_0 flask 1.1.1 py_1 conda-forge fontconfig 2.13.1 h86ecdb6_1001 conda-forge freetype 2.10.0 he983fc9_1 conda-forge fribidi 1.0.5 h516909a_1002 conda-forge fsspec 0.6.2 py_0 conda-forge get_terminal_size 1.0.0 haa9412d_0 gevent 1.4.0 py37h516909a_0 conda-forge git 2.23.0 pl526hacde149_0 glib 2.63.1 h5a9c865_0 glob2 0.7 py_0 conda-forge gmp 6.2.0 he1b5a44_2 conda-forge gmpy2 2.1.0b1 py37h04dde30_0 conda-forge graphite2 1.3.13 hf484d3e_1000 conda-forge greenlet 0.4.15 py37h516909a_0 conda-forge gst-plugins-base 1.14.5 h0935bb2_2 conda-forge gstreamer 1.14.5 h36ae1b5_2 conda-forge h5py 2.10.0 nompi_py37h513d04c_102 conda-forge harfbuzz 2.4.0 h9f30f68_3 conda-forge hdf4 4.2.13 hf30be14_1003 conda-forge hdf5 1.10.5 nompi_h3c11f04_1104 conda-forge heapdict 1.0.1 py_0 conda-forge html5lib 1.0.1 py_0 conda-forge hypothesis 5.5.4 py_0 conda-forge icu 64.2 he1b5a44_1 conda-forge idna 2.9 py_1 conda-forge imageio 2.8.0 py_0 conda-forge imagesize 1.2.0 py_0 conda-forge importlib_metadata 1.5.0 py37_0 conda-forge intel-openmp 2019.4 243 ipykernel 5.1.4 py37h5ca1d4c_0 conda-forge ipython 7.12.0 py37h5ca1d4c_0 conda-forge ipython_genutils 0.2.0 py_1 conda-forge ipywidgets 7.5.1 py_0 conda-forge isort 4.3.21 py37_0 conda-forge itsdangerous 1.1.0 py_0 conda-forge jbig 2.1 h14c3975_2001 conda-forge jdcal 1.4.1 py_0 conda-forge jedi 0.16.0 py37_0 conda-forge jeepney 0.4.2 py_0 conda-forge jinja2 2.11.1 py_0 conda-forge joblib 0.14.1 py_0 conda-forge jpeg 9c h14c3975_1001 conda-forge json5 0.9.1 py_0 jsonschema 3.2.0 py37_0 conda-forge jupyter 1.0.0 py_2 conda-forge jupyter_client 5.3.4 py37_1 conda-forge jupyter_console 6.1.0 py_0 jupyter_core 4.6.3 py37_0 conda-forge jupyterlab 1.2.6 py_0 conda-forge jupyterlab_server 1.0.6 py_0 conda-forge keyring 21.1.0 py37_0 conda-forge kiwisolver 1.1.0 py37hc9558a2_0 conda-forge krb5 1.17.1 h2fd8d38_0 conda-forge lazy-object-proxy 1.4.3 
py37h516909a_1 conda-forge ld_impl_linux-64 2.33.1 h53a641e_8 conda-forge libarchive 3.3.3 hc47fbbf_1007 conda-forge libblas 3.8.0 14_openblas conda-forge libcblas 3.8.0 14_openblas conda-forge libclang 9.0.1 default_hde54327_0 conda-forge libcurl 7.68.0 h20c2e04_0 libedit 3.1.20181209 hc058e9b_0 libffi 3.2.1 he1b5a44_1006 conda-forge libgcc 7.2.0 h69d50b8_2 conda-forge libgcc-ng 9.2.0 h24d8f2e_2 conda-forge libgfortran 3.0.0 1 conda-forge libgfortran-ng 7.3.0 hdf63c60_5 conda-forge libgomp 9.2.0 h24d8f2e_2 conda-forge libiconv 1.15 h516909a_1005 conda-forge liblapack 3.8.0 14_openblas conda-forge liblapacke 3.8.0 14_openblas conda-forge liblief 0.9.0 hf8a498c_1 conda-forge libllvm8 8.0.1 hc9558a2_0 conda-forge libllvm9 9.0.1 hc9558a2_0 conda-forge libnetcdf 4.7.3 nompi_h9f9fd6a_101 conda-forge libopenblas 0.3.7 h5ec1e0e_7 conda-forge libpng 1.6.37 hed695b0_0 conda-forge libsodium 1.0.17 h516909a_0 conda-forge libssh2 1.8.2 h22169c7_2 conda-forge libstdcxx-ng 9.2.0 hdf63c60_2 conda-forge libtiff 4.1.0 hc3755c2_3 conda-forge libtool 2.4.6 h14c3975_1002 conda-forge libuuid 2.32.1 h14c3975_1000 conda-forge libxcb 1.13 h14c3975_1002 conda-forge libxkbcommon 0.10.0 he1b5a44_0 conda-forge libxml2 2.9.10 hee79883_0 conda-forge libxslt 1.1.33 h31b3aaa_0 conda-forge llvm-openmp 9.0.1 hc9558a2_2 conda-forge llvmlite 0.31.0 py37h8b12597_0 conda-forge locket 0.2.0 py_2 conda-forge lxml 4.5.0 py37h7ec2d77_0 conda-forge lz4-c 1.8.3 he1b5a44_1001 conda-forge lzo 2.10 h14c3975_1000 conda-forge markupsafe 1.1.1 py37h516909a_0 conda-forge matplotlib 3.1.3 py37_0 conda-forge matplotlib-base 3.1.3 py37h250f245_0 conda-forge mccabe 0.6.1 py_1 conda-forge mdtraj 1.9.3 py37h00575c5_0 conda-forge mistune 0.8.4 py37h516909a_1000 conda-forge mkl 2019.5 281 conda-forge mkl-service 2.3.0 py37h516909a_0 conda-forge mkl_fft 1.1.0 py37hc1659b7_1 conda-forge mkl_random 1.1.0 py37hb3f55d8_0 conda-forge mmpbsa-py 16.0 pypi_0 pypi mock 4.0.1 py_0 more-itertools 8.2.0 py_0 conda-forge mpc 1.1.0 
h04dde30_1006 conda-forge mpfr 4.0.2 he80fd80_0 conda-forge mpmath 1.1.0 py_0 conda-forge msgpack-python 1.0.0 py37hc9558a2_0 conda-forge multipledispatch 0.6.0 py_0 conda-forge multiprocess 0.70.9 pypi_0 pypi nbconvert 5.6.1 py37_0 conda-forge nbformat 5.0.4 py_0 conda-forge ncurses 6.1 hf484d3e_1002 conda-forge netcdf-fortran 4.5.2 nompi_h09cde99_103 conda-forge networkx 2.4 py_0 conda-forge nglview 2.7.1 pyh5ca1d4c_0 conda-forge nltk 3.4.5 py37_0 nose 1.3.7 py37_1003 conda-forge notebook 6.0.3 py37_0 conda-forge nspr 4.25 he1b5a44_0 conda-forge nss 3.47 he751ad9_0 conda-forge numba 0.48.0 py37hb3f55d8_0 conda-forge numexpr 2.7.1 py37hb3f55d8_0 conda-forge numpy 1.18.1 py37h95a1406_0 conda-forge numpy-base 1.18.1 py37h2f8d375_1 numpydoc 0.9.2 py_0 conda-forge olefile 0.46 py_0 conda-forge openbabel 3.0.0 py37hdef5451_1 conda-forge openeye-toolkits 2019.10.2 py37_0 openeye openforcefield 0.6.0 py37_1 omnia openforcefields 1.0.0 py37_0 omnia openmm 7.4.1 py37_cuda101_rc_1 omnia openmmforcefields 0.7.1 py37_2 omnia openmoltools 0.0.0.dev0 pypi_0 pypi openpyxl 3.0.3 py_0 conda-forge openssl 1.1.1d h516909a_0 conda-forge packaging 20.1 py_0 conda-forge packmol 1!18.013 0 omnia pandas 1.0.1 py37hb3f55d8_0 conda-forge pandoc 2.9.2 0 conda-forge pandocfilters 1.4.2 py_1 conda-forge pango 1.42.4 ha030887_1 conda-forge parmed 3.2.0 pypi_0 pypi parso 0.6.1 py_0 conda-forge partd 1.1.0 py_0 conda-forge patchelf 0.10 he1b5a44_0 conda-forge path 13.1.0 py37_0 conda-forge path.py 12.4.0 0 conda-forge pathlib2 2.3.5 py37_0 conda-forge pathos 0.2.5 pypi_0 pypi patsy 0.5.1 py_0 conda-forge pcre 8.44 he1b5a44_0 conda-forge pdb4amber 1.7.dev0 pypi_0 pypi pep8 1.7.1 py_0 conda-forge perl 5.26.2 h516909a_1006 conda-forge pexpect 4.8.0 py37_0 conda-forge pickleshare 0.7.5 py37_1000 conda-forge pillow 7.0.0 py37hefe7db6_0 conda-forge pip 20.0.2 py_2 conda-forge pixman 0.38.0 h516909a_1003 conda-forge pkginfo 1.5.0.1 py_0 conda-forge pluggy 0.13.1 py37_0 ply 3.11 py_1 conda-forge pox 
0.2.7 pypi_0 pypi ppft 1.6.6.1 pypi_0 pypi prometheus_client 0.7.1 py_0 conda-forge prompt_toolkit 3.0.3 py_0 conda-forge psutil 5.7.0 py37h516909a_0 conda-forge pthread-stubs 0.4 h14c3975_1001 conda-forge ptyprocess 0.6.0 py_1001 conda-forge py 1.8.1 py_0 conda-forge py-lief 0.9.0 py37he1b5a44_1 conda-forge pycairo 1.19.1 py37h438ddbb_0 conda-forge pycodestyle 2.5.0 py_0 conda-forge pycosat 0.6.3 py37h516909a_1002 conda-forge pycparser 2.19 py_2 conda-forge pycrypto 2.6.1 py37h516909a_1003 conda-forge pycurl 7.43.0.5 py37h16ce93b_0 conda-forge pyflakes 2.1.1 py_0 conda-forge pygments 2.5.2 py_0 conda-forge pylint 2.4.4 py37_0 conda-forge pyodbc 4.0.30 py37he6710b0_0 pyopenssl 19.1.0 py_1 conda-forge pyparsing 2.4.6 py_0 conda-forge pyqt 5.12.3 py37hcca6a23_1 conda-forge pyqt5-sip 4.19.18 pypi_0 pypi pyqtwebengine 5.12.1 pypi_0 pypi pyrsistent 0.15.7 py37h516909a_0 conda-forge pysocks 1.7.1 py37_0 conda-forge pytables 3.6.1 py37h9f153d1_1 conda-forge pytest 5.3.5 py37_1 conda-forge pytest-arraydiff 0.3 py_0 conda-forge pytest-astropy 0.8.0 py_0 pytest-astropy-header 0.1.2 py_0 conda-forge pytest-cov 2.8.1 py_0 conda-forge pytest-doctestplus 0.5.0 py_0 pytest-openfiles 0.4.0 py_0 conda-forge pytest-remotedata 0.3.2 py37_0 python 3.7.6 h357f687_2 conda-forge python-dateutil 2.8.1 py_0 conda-forge python-libarchive-c 2.9 py37_0 conda-forge pytraj 2.0.5 pypi_0 pypi pytz 2019.3 py_0 conda-forge pywavelets 1.1.1 py37hc1659b7_0 conda-forge pyyaml 5.3 py37h516909a_0 conda-forge pyzmq 18.1.1 py37h1768529_0 conda-forge qt 5.12.5 hd8c4c69_1 conda-forge qtawesome 0.7.0 py_0 conda-forge qtconsole 4.6.0 py_0 conda-forge qtpy 1.9.0 py_0 conda-forge rdkit 2019.09.3 py37hb31dc5d_0 conda-forge readline 8.0 hf8c457e_0 conda-forge requests 2.23.0 py37_0 conda-forge rope 0.16.0 py_0 conda-forge ruamel_yaml 0.15.80 py37h516909a_1000 conda-forge sander 16.0 pypi_0 pypi scikit-image 0.16.2 py37hb3f55d8_0 conda-forge scikit-learn 0.22.1 py37hcdab131_1 conda-forge scipy 1.4.1 py37h921218d_0 
conda-forge seaborn 0.10.0 py_1 conda-forge secretstorage 3.1.2 py37_0 conda-forge send2trash 1.5.0 py_0 conda-forge setuptools 45.2.0 py37_0 conda-forge simplegeneric 0.8.1 py_1 conda-forge singledispatch 3.4.0.3 py37_1000 conda-forge sip 4.19.20 py37he1b5a44_0 conda-forge six 1.14.0 py37_0 conda-forge smirnoff99frosst 1.1.0 py37_1 omnia snappy 1.1.8 he1b5a44_1 conda-forge snowballstemmer 2.0.0 py_0 conda-forge solvationtoolkit 0.4.4.dev0 pypi_0 pypi sortedcollections 1.1.2 py_0 conda-forge sortedcontainers 2.1.0 py_0 conda-forge soupsieve 1.9.5 py37_0 sphinx 2.4.2 py_0 conda-forge sphinxcontrib 1.0 py37_1 sphinxcontrib-applehelp 1.0.1 py_0 conda-forge sphinxcontrib-devhelp 1.0.1 py_0 conda-forge sphinxcontrib-htmlhelp 1.0.2 py_0 conda-forge sphinxcontrib-jsmath 1.0.1 py_0 conda-forge sphinxcontrib-qthelp 1.0.2 py_0 conda-forge sphinxcontrib-serializinghtml 1.1.3 py_0 conda-forge sphinxcontrib-websupport 1.2.0 py_0 spyder 3.3.6 py37_1 conda-forge spyder-kernels 0.5.2 py37_0 conda-forge sqlalchemy 1.3.13 py37h516909a_0 conda-forge sqlite 3.31.1 h7b6447c_0 statsmodels 0.11.0 py37h516909a_0 conda-forge stk 2019.10.21.0 pypi_0 pypi sympy 1.5.1 py37_1 conda-forge tblib 1.6.0 py_0 conda-forge terminado 0.8.3 py37_0 conda-forge testpath 0.4.4 py_0 conda-forge tinydb 3.15.2 py_0 conda-forge tk 8.6.10 hed695b0_0 conda-forge toml 0.10.0 py_0 conda-forge toolz 0.10.0 py_0 conda-forge tornado 6.0.3 py37h516909a_4 conda-forge tqdm 4.43.0 py_0 conda-forge traitlets 4.3.3 py37_0 conda-forge typed-ast 1.4.1 py37h516909a_0 conda-forge unicodecsv 0.14.1 py_1 conda-forge unixodbc 2.3.7 h227dcee_1000 conda-forge urllib3 1.25.8 py37_0 wcwidth 0.1.8 py_0 conda-forge webencodings 0.5.1 py_1 conda-forge werkzeug 1.0.0 py_0 conda-forge wheel 0.34.2 py_1 conda-forge widgetsnbextension 3.5.1 py37_0 conda-forge wrapt 1.12.0 py37h516909a_0 conda-forge wurlitzer 2.0.0 py37_0 conda-forge xlrd 1.2.0 py_0 conda-forge xlsxwriter 1.2.7 py_0 conda-forge xlwt 1.3.0 py_1 conda-forge xmltodict 0.12.0 
py_0 conda-forge xorg-kbproto 1.0.7 h14c3975_1002 conda-forge xorg-libice 1.0.10 h516909a_0 conda-forge xorg-libsm 1.2.3 h84519dc_1000 conda-forge xorg-libx11 1.6.9 h516909a_0 conda-forge xorg-libxau 1.0.9 h14c3975_0 conda-forge xorg-libxdmcp 1.1.3 h516909a_0 conda-forge xorg-libxext 1.3.4 h516909a_0 conda-forge xorg-libxrender 0.9.10 h516909a_1002 conda-forge xorg-libxt 1.2.0 h516909a_0 conda-forge xorg-renderproto 0.11.1 h14c3975_1002 conda-forge xorg-xextproto 7.3.0 h14c3975_1002 conda-forge xorg-xproto 7.0.31 h14c3975_1007 conda-forge xz 5.2.4 h14c3975_1001 conda-forge yaml 0.2.2 h516909a_1 conda-forge zeromq 4.3.2 he1b5a44_2 conda-forge zict 1.0.0 py_0 conda-forge zipp 2.2.1 py_0 conda-forge zlib 1.2.11 h516909a_1006 conda-forge zstd 1.4.4 h3b9ef0a_1 conda-forge

**Additional context**
We tried to circumvent this problem by pre-calculating the charges using openeye and changing the Maxatoms=… within the am1bcccharges method (not possible for the 10ELF variant). We were unable to finish this calculation, as it does not seem to scale well. We have not found any other solution so far. Could you perhaps help us out with the following questions?

- Can we define a new library residue so this can be recognized within our molecule?
- Are there alternative ways to fragment the polymer so that charge calculations do not have to be performed for the molecule itself but for the units it consists of?

At the moment we are thinking of solving this by breaking the polymer down into its three different parts: the initiator, the terminator and the monomer units. We would then perform calculations with 10 monomer units, which the software can handle, and extract the computed charges with compute_partial_charges(self, molecule[, …]) per part. Could we then add the user-defined partial charges upon system creation as a list for the three species, which we provide to openforcefield as library species?

Alternatively use: create_openmm_system(topology, charge_from_molecules=molecule_list)?

Thank you in advance for the help! Looking forward to hearing back from you.

j-wags commented 4 years ago

Hi @kcreemer,

Thanks for the thoroughly-written issue.

> Are there alternative ways to fragment the polymer so that charge calculations do not have to be performed for the molecule itself but for the units it consists of?

We're planning to develop a workflow to automatically fragment and charge polymers like this. This workflow would allow users to define a protein force field, a residue separator SMARTS, and a method for treating nonstandard amino acids and capping groups. Unfortunately, we have several more pressing features on our roadmap, so I can't say whether this will be ready within a year. In the short-term, you're probably best off defining LibraryCharges for your polymer, using the 10-mer approach that you mentioned above.

> Can we define a new library residue so this can be recognized within our molecule?

Yes, and this is the approach I would suggest. Usage examples for the LibraryCharges tag can be found in the SMIRNOFF spec documentation (https://open-forcefield-toolkit.readthedocs.io/en/0.6.0/smirnoff.html#librarycharges-library-charges-for-polymeric-residues-and-special-solvent-models). This will require writing three SMARTS strings by hand (which can be a bit painful), but once it works, it should suffice for polymers of arbitrary length. As you mentioned, the three LibraryCharge elements that would need to be defined are the initiator, terminator, and monomer unit. If writing the SMARTS is prohibitively difficult, we provide functionality in Chemper (https://github.com/mobleylab/chemper) for machine-generating SMARTS patterns; however, it is not as geared toward external use as the OFF Toolkit, so the installation and documentation may not be as mature.
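A minimal sketch of the 10-mer idea, assuming the per-atom charges of the oligomer have already been grouped by repeat unit with the atoms in the same order inside every unit. The function name `average_repeat_unit`, the `skip_ends` cutoff and the toy charge values are all hypothetical; the snippet only averages the interior units and shifts the result so one repeat unit carries the target net charge (assumed to be zero here).

```python
from statistics import mean


def average_repeat_unit(unit_charges, skip_ends=2, target_net_charge=0.0):
    """Average per-atom charges over the interior repeat units of an oligomer.

    unit_charges: one list of charges per repeat unit, atoms in the same order
    in every unit (an assumption the caller has to guarantee).
    skip_ends: number of units dropped at each chain end, because they sit
    next to the initiator or terminator.
    """
    interior = unit_charges[skip_ends:len(unit_charges) - skip_ends]
    if not interior:
        raise ValueError("no interior units left after skipping the chain ends")
    n_atoms = len(interior[0])
    if any(len(unit) != n_atoms for unit in interior):
        raise ValueError("all repeat units must list the same number of atoms")

    # Average each atom position across the interior units.
    averaged = [mean(column) for column in zip(*interior)]

    # Spread the residual evenly so the unit sums to the target net charge.
    correction = (target_net_charge - sum(averaged)) / n_atoms
    return [charge + correction for charge in averaged]


# Toy numbers only; they are not computed charges for any real polymer.
toy_units = [
    [-0.30, 0.12, 0.20],
    [-0.28, 0.10, 0.17],
    [-0.27, 0.10, 0.17],
    [-0.27, 0.11, 0.17],
    [-0.29, 0.10, 0.18],
    [-0.35, 0.15, 0.22],
]
monomer_charges = average_repeat_unit(toy_units, skip_ends=1)
print([round(charge, 4) for charge in monomer_charges])
```

The same kind of correction would be needed for the initiator and terminator so that the whole chain ends up with an integer net charge.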

> Alternatively use: create_openmm_system(topology, charge_from_molecules=molecule_list)?

This would also work, though you'd need to manually update offmol.partial_charges with your desired charges ahead of time. I'd be cautious about this, just because the workflow would need to assign a large number of charges in exactly the right order to match up with the atom indexing. If, for any reason, the molecule atom indexing changed or was misinterpreted, the partial charges assigned to the resulting system would silently be scrambled. For this reason, I'd recommend using LibraryCharges, since it ensures that charges are assigned based on the connectivity of the model.
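To illustrate the scrambling risk, here is a toy sketch in plain Python. It does not use the toolkit; the atom labels are hypothetical stand-ins for chemical environments, and the charge values are the test numbers from the ethanol example. Assigning by position raises no error when the atom order changes, and the total charge stays the same, so a net-charge check would not catch the mistake.

```python
# Charges as computed, in the order the atoms had at computation time.
computed_order = ["C_methyl", "C_methylene", "O_hydroxyl"]
computed_charges = [-0.2, -0.1, 0.3]

# The same molecule read back with a different atom order.
topology_order = ["O_hydroxyl", "C_methylene", "C_methyl"]

# Index-based assignment: pairs charges with atoms purely by position.
by_index = dict(zip(topology_order, computed_charges))

# Environment-based assignment: looks each atom up by what it is.
lookup = dict(zip(computed_order, computed_charges))
by_environment = {atom: lookup[atom] for atom in topology_order}

print("by index:      ", by_index)
print("by environment:", by_environment)
```

Here the hydroxyl oxygen ends up with the methyl carbon's charge under index-based assignment, while the lookup by environment is unaffected by the reordering.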

Let's leave this issue open until we get a solution working for you. Please let me know if there's anything else I can do to help.

kcreemer commented 4 years ago

Dear Mr. Wagner,

Thank you for your fast reply and clear explanation.

This raises a few more questions. Where can the LibraryCharges be defined? Does this require SMIRKS notation for each part of the polymer, as in the example you provided, or does SMARTS suffice, as you suggest in your mail?

Thanks in advance.

Kind regards, Karolien Creemers

j-wags commented 4 years ago

> Where can the LibraryCharges be defined?

LibraryCharges can be defined in a force field file (OFFXML format). Here's an example of what one looks like.

https://github.com/openforcefield/openforcefield/blob/master/openforcefield/tests/test_forcefield.py#L180-L191

To use this in practice, you'll want to load a complete force field (with bonds and angles and such), and then ALSO load the LibraryCharges OFFXML file (or string) that provides charges for the molecule of interest.

Here is an example of how to use LibraryCharges. A ForceField object can load multiple data sources, so the example below first loads our "Parsley" force field ("openff-1.0.0.offxml"), and then the LibraryCharges FF that I linked above.

from openforcefield.topology import Molecule
from openforcefield.typing.engines.smirnoff import ForceField

xml_ethanol_library_charges_by_atom_ff = '''
<SMIRNOFF version="0.3" aromaticity_model="OEAroModel_MDL">
    <LibraryCharges version="0.3">
       <LibraryCharge smirks="[#1:1]-[#6]" charge1="-0.02*elementary_charge" />
       <LibraryCharge smirks="[#6X4:1]" charge1="-0.2*elementary_charge" />
       <LibraryCharge smirks="[#1:1]-[#6]-[#8]" charge1="-0.01*elementary_charge" />
       <LibraryCharge smirks="[#6X4:1]-[#8]" charge1="-0.1*elementary_charge" />
       <LibraryCharge smirks="[#8X2:1]" charge1="0.3*elementary_charge" />
       <LibraryCharge smirks="[#1:1]-[#8]" charge1="0.08*elementary_charge" />
    </LibraryCharges>
</SMIRNOFF>
'''
ethanol = Molecule.from_smiles('CCO')
topology = ethanol.to_topology()
ff = ForceField('openff-1.0.0.offxml', xml_ethanol_library_charges_by_atom_ff)
system = ff.create_openmm_system(topology)

Note that the charge values here are just used for testing -- They don't make any physical sense. But we can check over the system this creates and verify that the charges were assigned to the different atoms correctly:

for index in range(ethanol.n_atoms):
    print('mass:', system.getParticleMass(index), 'charge:', system.getForce(2).getParticleParameters(index)[0])

mass: 12.01078 Da charge: -0.2 e
mass: 12.01078 Da charge: -0.1 e
mass: 15.99943 Da charge: 0.3 e
mass: 1.007947 Da charge: -0.02 e
mass: 1.007947 Da charge: -0.02 e
mass: 1.007947 Da charge: -0.02 e
mass: 1.007947 Da charge: -0.01 e
mass: 1.007947 Da charge: -0.01 e
mass: 1.007947 Da charge: 0.08 e
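For longer patterns, such as a whole monomer unit, writing the chargeN attributes by hand gets error-prone. The following is a sketch of generating the LibraryCharges string with the standard library, assuming the same SMIRNOFF version and aromaticity model as the example above. The helper name `build_library_charges` is hypothetical, and the two-atom pattern in the demo is made up to show the charge1/charge2 numbering.

```python
import re
import xml.etree.ElementTree as ET


def build_library_charges(entries):
    """Return an OFFXML string with one LibraryCharge per (smirks, charges) pair."""
    root = ET.Element(
        "SMIRNOFF", {"version": "0.3", "aromaticity_model": "OEAroModel_MDL"}
    )
    section = ET.SubElement(root, "LibraryCharges", {"version": "0.3"})
    for smirks, charges in entries:
        # Count tagged atoms such as ":1]" and require one charge for each.
        n_tagged = len(re.findall(r":\d+\]", smirks))
        if n_tagged != len(charges):
            raise ValueError(
                f"{smirks!r} tags {n_tagged} atoms but {len(charges)} charges were given"
            )
        attributes = {"smirks": smirks}
        for position, charge in enumerate(charges, start=1):
            attributes[f"charge{position}"] = f"{charge}*elementary_charge"
        ET.SubElement(section, "LibraryCharge", attributes)
    return ET.tostring(root, encoding="unicode")


# Test values only, as in the ethanol example; the last pattern is hypothetical.
offxml_text = build_library_charges(
    [
        ("[#6X4:1]", [-0.2]),
        ("[#8X2:1]", [0.3]),
        ("[#1:1]-[#8X2:2]", [0.08, 0.3]),
    ]
)
print(offxml_text)
```

The resulting string could then be passed to ForceField alongside the main force field file, in the same way as xml_ethanol_library_charges_by_atom_ff above.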

> Does this require smirks notation of each part of the polymer as in the example you provided? Or does SMARTS suffice, as you suggest in your mail.

We use SMARTS. Somehow we made a typo early on and said "SMIRKS" when we meant to say "SMARTS", and then the whole joke with "SMIRNOFF" got started, and "SMARNOFF" wouldn't be very funny, so some of our older documentation says "SMIRKS" when we really mean "SMARTS". We're trying to correct this everywhere we can. To the best of my knowledge, our software only ever uses SMARTS.

kcreemer commented 4 years ago

Dear Mr. Wagner,

We managed to compute the charges for the species (monomer, initiator and terminator) and create an offxml file with the LibraryCharges. It consists of three lines, one each for the initiator, the monomer and the terminator, each followed by its charges, as can be seen below.

xml_pethoxtot_library_charges_ff = '''

'''

For our application, the initiator and terminator have the same structure but different partial charges, so we gave each line in the offxml file a name. In the molecule, we assigned a residue name to the atoms (column 'resName' in the topology). We could successfully create a system with 10 monomers and check the partial charges afterwards, as you described above. Unfortunately, system creation fails for 50 monomers: once the number of atoms exceeds a certain limit (which I believe is 300 for both antechamber and OpenEye), the 'Unable to assign charges' error appears again. This is strange, since the charges should be taken from the LibraryCharges offxml file. With OpenEye it is possible to raise the maximum allowed number of atoms (oequacpac.OEAssignCharges(oemol, oequacpac.OEAM1BCCCharges(maxAtoms=1400))), but as this method should not be called in the first place, we wonder what is causing the error.

As a suggestion, maybe an extra argument can be added to the ff.create_openmm_system function, which indicates the maximum number of atoms?

Furthermore, we are unsure how the partial charges are assigned when the system is created with ff.create_openmm_system(topology): the assignment does not seem to be linked to the residue names we assigned, but to the SMIRKS patterns instead. Is it possible to link the LibraryCharges to the molecule through the 'resName' in the topology?

Thank you for your help.

Kind regards, Karolien Creemers

CODE:

from simtk.openmm import app
from openforcefield.topology import Molecule, Topology
from openforcefield.typing.engines.smirnoff import ForceField

# polymer_smiles and number_of_units are defined earlier in the notebook
off_forcefield = ForceField('openff-1.0.0.offxml', 'xml_pethoxtot_library_chargesff.offxml')
polymer = app.PDBFile(f"polymer{number_of_units}units_res.pdb")
uniq_molecules = [Molecule.from_smiles(polymer_smiles)]
off_polymer_topology = Topology.from_openmm(polymer.topology,unique_molecules=uniq_molecules)
off_polymer_system = off_forcefield.create_openmm_system(off_polymer_topology)

ERROR:

Exception                                 Traceback (most recent call last)

     10 off_polymer_topology = Topology.from_openmm(polymer.topology,unique_molecules=uniq_molecules)
     11 print(off_polymer_topology)
---> 12 off_polymer_system = off_forcefield.create_openmm_system(off_polymer_topology)

~/anaconda3/envs/myenv/lib/python3.7/site-packages/openforcefield/typing/engines/smirnoff/forcefield.py in create_openmm_system(self, topology, **kwargs)
   1136         # Add forces and parameters to the System
   1137         for parameter_handler in parameter_handlers:
-> 1138             parameter_handler.create_force(system, topology, **kwargs)
   1139
   1140         # Let force Handlers do postprocessing

~/anaconda3/envs/myenv/lib/python3.7/site-packages/openforcefield/typing/engines/smirnoff/parameters.py in create_force(self, system, topology, **kwargs)
   2937                 toolkit_registry = kwargs.get('toolkit_registry', GLOBAL_TOOLKIT_REGISTRY)
   2938                 temp_mol.generate_conformers(n_conformers=10, toolkit_registry=toolkit_registry)
-> 2939                 temp_mol.compute_partial_charges_am1bcc(toolkit_registry=toolkit_registry)
   2940
   2941                 # Assign charges to relevant atoms

~/anaconda3/envs/myenv/lib/python3.7/site-packages/openforcefield/topology/molecule.py in compute_partial_charges_am1bcc(self, toolkit_registry)
   2012             charges = toolkit_registry.call(
   2013                 'compute_partial_charges_am1bcc',
-> 2014                 self
   2015             )
   2016         elif isinstance(toolkit_registry, ToolkitWrapper):

~/anaconda3/envs/myenv/lib/python3.7/site-packages/openforcefield/utils/toolkits.py in call(self, method_name, *args, **kwargs)
   3023                 method = getattr(toolkit, method_name)
   3024                 try:
-> 3025                     return method(*args, **kwargs)
   3026                 except NotImplementedError:
   3027                     pass

~/anaconda3/envs/myenv/lib/python3.7/site-packages/openforcefield/utils/toolkits.py in compute_partial_charges_am1bcc(self, molecule)
   1271
   1272         if quacpac_status is False:
-> 1273             raise Exception('Unable to assign charges')
   1274
   1275         # Extract and return charges

Exception: Unable to assign charges

j-wags commented 4 years ago

Thanks for reporting back!

One note: Your initiator and terminator SMARTS expressions are the same, so only the second one will ever match (so "INE" is never applied, "CTR" applies to both ends). You can fix this by expanding the initiator and terminator SMARTS to include some non-tagged atoms, but which include enough chemical environment to distinguish the beginning from the end.
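A toy model of this "last match wins" behaviour, written in plain Python rather than with real SMARTS matching. The predicates stand in for SMARTS patterns, and the `bonded_to` field is a hypothetical stand-in for the extra untagged atoms that would distinguish the two chain ends.

```python
def assign_labels(atoms, rules):
    """Apply rules in file order; a later match overwrites an earlier one."""
    labels = {}
    for name, matches in rules:
        for atom in atoms:
            if matches(atom):
                labels[atom["index"]] = name
    return labels


# Toy chain: two chemically identical end groups around one monomer atom.
atoms = [
    {"index": 0, "group": "end", "bonded_to": "monomer_head"},
    {"index": 1, "group": "monomer", "bonded_to": None},
    {"index": 2, "group": "end", "bonded_to": "monomer_tail"},
]

# Identical patterns: both match both ends, so the later one (CTR) wins.
ambiguous_rules = [
    ("INE", lambda a: a["group"] == "end"),
    ("CTR", lambda a: a["group"] == "end"),
]

# Patterns extended with neighbouring context: each matches one end only.
specific_rules = [
    ("INE", lambda a: a["group"] == "end" and a["bonded_to"] == "monomer_head"),
    ("CTR", lambda a: a["group"] == "end" and a["bonded_to"] == "monomer_tail"),
]

ambiguous = assign_labels(atoms, ambiguous_rules)
specific = assign_labels(atoms, specific_rules)
print("identical patterns:", ambiguous)
print("with context:      ", specific)
```

With identical patterns both end atoms come out labelled "CTR"; once each pattern carries enough context, index 0 is labelled "INE" and index 2 "CTR".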

As for the charge assignment problem: I'm optimistic that #509 will have fixed the max-size issue. This fix will come out in our 0.7.0 release, scheduled for early April. Basically, both the RDKit and OpenEye toolkits have a maximum number of SMARTS matches they'll return, which we've found we need to increase when dealing with larger molecules. You can apply a manual patch right now to see if that fixes the issue for you: in your ~/anaconda3/envs/myenv/lib/python3.7/site-packages/openforcefield/utils/toolkits.py, apply the following change:

https://github.com/openforcefield/openforcefield/pull/509/files#diff-c854e3299a958176be6dac43c8f3213eR2444-R2448
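The effect of a match cap can be sketched with a toy matcher. This is not the toolkit code or the patch itself; the function name `capped_matches` and the cap of 1000 are chosen for illustration. It also rests on an inference from this thread rather than a confirmed fact: that atoms left without a library charge are what sends the toolkit back to the AM1-BCC path seen in the traceback.

```python
def capped_matches(atom_kinds, wanted, max_matches):
    """Return indices of matching atoms, stopping once the cap is reached."""
    matches = []
    for index, kind in enumerate(atom_kinds):
        if len(matches) == max_matches:
            break
        if kind == wanted:
            matches.append(index)
    return matches


# A long toy chain in which every atom matches the monomer pattern.
chain = ["backbone"] * 1200

capped = capped_matches(chain, "backbone", max_matches=1000)
raised = capped_matches(chain, "backbone", max_matches=len(chain))

uncharged_with_cap = len(chain) - len(capped)
uncharged_after_raise = len(chain) - len(raised)
print("atoms left without a match (cap 1000):", uncharged_with_cap)
print("atoms left without a match (cap raised):", uncharged_after_raise)
```

A short chain never reaches the cap, which would be consistent with the 10-monomer system working while the 50-monomer one fails.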

Please let me know if that helps, otherwise we can try to figure out what else might be happening.

kcreemer commented 4 years ago

Dear Mr. Wagner,

I created a unique SMARTS pattern for each of the three residues and added them, together with the corresponding charges, to the LibraryCharges offxml file. After adapting the toolkits.py file, the error mentioned above disappeared, so we were able to parameterize the polymer correctly! Many thanks for your help and patience. We are currently setting up a few test systems. Hopefully everything will run smoothly from now on.

Kind regards, Karolien Creemers

openforcefield / openff-toolkit

Assigning partial charges for polymers/parameterizing polymers #528