Optimizations failing on large macromolecule

Yoshanuikabundi commented 1 year ago

Description

During the BespokeFit follow-up workshop, @rvkrishnan30 pointed out that the S1A ligand takes more than 24 hours in the ForceBalance stage. I haven't been able to reproduce this yet because my XTB calculations are slow for some reason, but I will soon.

Reproduction

I'm attempting to reproduce in this gist

Output

@rvkrishnan30 any details you could provide on this would be great :)

Software versions

Which operating system and version are you using?
How did you install BespokeFit?
What is the output of running conda list?

rvkrishnan30 commented 1 year ago

Hi @Yoshanuikabundi , as suggested during the follow-up workshop, I loosened the convergence criteria for ForceBalance to check if I get any speed improvement without loss of accuracy.

Changing both convergence_objective and convergence_step from 0.01 to 1.00 cut the runtime from ~40 hrs to ~19 hrs without much loss in accuracy, judging from the k values and the resulting torsion plots.

I am sure this can be sped up further without issues, and I suspect that something is taking much longer than it should and slowing down the entire FB fitting process. I see this kind of behaviour with a few other unrelated ligand molecules too.

Could you please suggest a better way to debug this and narrow down what's causing the issue, and possibly which other parameters can be tuned to avoid this behaviour?

Below is my example input file passed to ForceBalance.

$options
ffdir forcefield
penalty_type L1
jobtype optimize
forcefield force-field.offxml

maxstep 10
convergence_step 1.00
convergence_objective 1.00
convergence_gradient 0.01
criteria 2
eig_lowerbound 0.01
finite_difference_h 0.01
penalty_additive 1.0

trust0 -0.25
mintrust 0.05
error_tolerance 1.0
adaptive_factor 0.2
adaptive_damping 1.0
normalize_weights False
constrain_charge false

priors
   ProperTorsions/Proper/k : 6.0
/priors

$end

$target
name ligand_fragment_around_22_24
weight 1.0

type TorsionProfile_SMIRNOFF
mol2 ligand_fragment_around_22_24.sdf
pdb ligand_fragment_around_22_24.pdb
coords scan.xyz
writelevel 2
attenuate 1

energy_denom 1.0
energy_upper 10.0
$end
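One generic way to narrow down where the time goes is to run the slow step under Python's built-in profiler and look at the functions with the largest cumulative time. The sketch below only illustrates that idea: `profile_call` and `slow_step` are hypothetical names, and `slow_step` is a stand-in for whatever Python entry point drives the fit. It is not part of the ForceBalance or BespokeFit API.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top=10, **kwargs):
    """Run func under cProfile; return (result, text report of the top entries)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time so the most expensive call chains come first.
    stats.sort_stats("cumulative").print_stats(top)
    return result, stream.getvalue()


def slow_step(n):
    # Hypothetical stand-in for an expensive stage (e.g. one optimizer iteration).
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    result, report = profile_call(slow_step, 100_000)
    print(report)
```

If a single function (for example an MM minimization call) dominates the cumulative time, that is the place to look at first.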

j-wags commented 1 year ago

@rvkrishnan30 I'm getting to this today, but I'm unable to reproduce the slow runtime on this molecule - it just ran using an openeye-less env in a little over 1 hour (20 minutes fragmentation, 20 minutes QC generation, 20 minutes parameter optimization) on my macbook air.

Could you upload the complete forcebalance inputs (sdf, pdb, scan.xyz, force-field.offxml) so I can see if that reproduces?

My whole environment is:

Full env

```
(bespokefit) jw@mba$ conda list
# packages in environment at /Users/jeffreywagner/conda/envs/bespokefit:
#
# Name Version Build Channel
amberlite 16.0 pypi_0 pypi
ambertools 21.11 py39hf80593e_0 conda-forge
amberutils 21.0 pypi_0 pypi
amqp 5.1.1 pyhd8ed1ab_0 conda-forge
anyio 3.6.2 pyhd8ed1ab_0 conda-forge
appnope 0.1.3 pyhd8ed1ab_0 conda-forge
argcomplete 1.12.3 pyhd8ed1ab_0 conda-forge
argon2-cffi 21.3.0 pyhd8ed1ab_0 conda-forge
argon2-cffi-bindings 21.2.0 py39ha30fb19_3 conda-forge
arpack 3.7.0 hefb7bc6_2 conda-forge
arrow-cpp 9.0.0 py39h7d8c460_10_cpu conda-forge
asttokens 2.1.0 pyhd8ed1ab_0 conda-forge
astunparse 1.6.3 pyhd8ed1ab_0 conda-forge
async-timeout 4.0.2 pyhd8ed1ab_0 conda-forge
attrs 22.1.0 pyh71513ae_1 conda-forge
aws-c-cal 0.5.11 hd2e2f4b_0 conda-forge
aws-c-common 0.6.2 h0d85af4_0 conda-forge
aws-c-event-stream 0.2.7 hb9330a7_13 conda-forge
aws-c-io 0.10.5 h35aa462_0 conda-forge
aws-checksums 0.1.11 h0010a65_7 conda-forge
aws-sdk-cpp 1.8.186 h7c85d8e_4 conda-forge
backcall 0.2.0 pyh9f0ad1d_0 conda-forge
backports 1.0 py_2 conda-forge
backports.functools_lru_cache 1.6.4 pyhd8ed1ab_0 conda-forge
basis_set_exchange 0.9 pyhd8ed1ab_0 conda-forge
beautifulsoup4 4.11.1 pyha770c72_0 conda-forge
billiard 3.6.4.0 py39ha30fb19_3 conda-forge
bleach 5.0.1 pyhd8ed1ab_0 conda-forge
blosc 1.21.1 h97e831e_3 conda-forge
boost 1.74.0 py39ha1f3e3e_5 conda-forge
boost-cpp 1.74.0 h8b082ac_8 conda-forge
brotli 1.0.9 hb7f2c08_8 conda-forge
brotli-bin 1.0.9 hb7f2c08_8 conda-forge
brotlipy 0.7.0 py39ha30fb19_1005 conda-forge
bzip2 1.0.8 h0d85af4_4 conda-forge
c-ares 1.18.1 h0d85af4_0 conda-forge
ca-certificates 2022.9.24 h033912b_0 conda-forge
cached-property 1.5.2 hd8ed1ab_1 conda-forge
cached_property 1.5.2 pyha770c72_1 conda-forge
cachetools 5.2.0 pyhd8ed1ab_0 conda-forge
cairo 1.16.0 h904041c_1014 conda-forge
celery 5.2.7 pyhd8ed1ab_0 conda-forge
certifi 2022.9.24 pyhd8ed1ab_0 conda-forge
cffi 1.15.1 py39h131948b_2 conda-forge
charset-normalizer 2.1.1 pyhd8ed1ab_0 conda-forge
chemper 1.0.1 pyhd8ed1ab_0 conda-forge
click 8.1.3 unix_pyhd8ed1ab_2 conda-forge
click-didyoumean 0.3.0 pyhd8ed1ab_0 conda-forge
click-option-group 0.5.3 pyhd8ed1ab_1 conda-forge
click-plugins 1.1.1 py_0 conda-forge
click-repl 0.2.0 pyhd8ed1ab_0 conda-forge
colorama 0.4.6 pyhd8ed1ab_0 conda-forge
commonmark 0.9.1 py_0 conda-forge
conda 22.9.0 py39h6e9494a_2 conda-forge
conda-package-handling 1.9.0 py39ha30fb19_1 conda-forge
contourpy 1.0.6 py39h92daf61_0 conda-forge
cryptography 38.0.3 py39h7eb6a14_0 conda-forge
curl 7.86.0 h57eb407_1 conda-forge
cycler 0.11.0 pyhd8ed1ab_0 conda-forge
cython 0.29.32 py39h7a8716b_1 conda-forge
dataclasses 0.8 pyhc8e2a94_3 conda-forge
debugpy 1.6.3 py39h7a8716b_1 conda-forge
decorator 5.1.1 pyhd8ed1ab_0 conda-forge
defusedxml 0.7.1 pyhd8ed1ab_0 conda-forge
deprecated 1.2.13 pyh6c4a22f_0 conda-forge
entrypoints 0.4 pyhd8ed1ab_0 conda-forge
executing 1.2.0 pyhd8ed1ab_0 conda-forge
expat 2.5.0 hf0c8a7f_0 conda-forge
fastapi 0.87.0 pyhd8ed1ab_0 conda-forge
fftw 3.3.10 nompi_h4fa670e_105 conda-forge
flit-core 3.8.0 pyhd8ed1ab_0 conda-forge
fmt 9.1.0 hb8565cd_0 conda-forge
font-ttf-dejavu-sans-mono 2.37 hab24e00_0 conda-forge
font-ttf-inconsolata 3.000 h77eed37_0 conda-forge
font-ttf-source-code-pro 2.038 h77eed37_0 conda-forge
font-ttf-ubuntu 0.83 hab24e00_0 conda-forge
fontconfig 2.14.1 h5bb23bf_0 conda-forge
fonts-conda-ecosystem 1 0 conda-forge
fonts-conda-forge 1 0 conda-forge
fonttools 4.38.0 py39ha30fb19_1 conda-forge
forcebalance 1.9.3 py39h6dc771e_2 conda-forge
freetype 2.12.1 h3f81eb7_0 conda-forge
future 0.18.2 pyhd8ed1ab_6 conda-forge
geometric 1.0 pyhd8ed1ab_0 conda-forge
gettext 0.21.1 h8a4c099_0 conda-forge
gflags 2.2.2 hb1e8313_1004 conda-forge
glog 0.6.0 h8ac2a54_0 conda-forge
greenlet 2.0.1 py39h7a8716b_0 conda-forge
h11 0.12.0 pyhd8ed1ab_0 conda-forge
h2 4.1.0 pyhd8ed1ab_0 conda-forge
h5py 3.1.0 nompi_py39hdc2b67d_100 conda-forge
hdf4 4.2.15 h7aa5921_5 conda-forge
hdf5 1.10.6 nompi_hc5d9132_1114 conda-forge
hpack 4.0.0 pyh9f0ad1d_0 conda-forge
httpcore 0.15.0 pyhd8ed1ab_0 conda-forge
httpx 0.23.0 py39h6e9494a_2 conda-forge
hyperframe 6.0.1 pyhd8ed1ab_0 conda-forge
icu 70.1 h96cf925_0 conda-forge
idna 3.4 pyhd8ed1ab_0 conda-forge
importlib-metadata 5.0.0 pyha770c72_1 conda-forge
importlib_resources 5.10.0 pyhd8ed1ab_0 conda-forge
ipykernel 6.17.1 pyh736e0ef_0 conda-forge
ipython 8.6.0 pyhd1c38e8_1 conda-forge
ipython_genutils 0.2.0 py_1 conda-forge
ipywidgets 8.0.2 pyhd8ed1ab_1 conda-forge
jedi 0.18.1 pyhd8ed1ab_2 conda-forge
jinja2 3.1.2 pyhd8ed1ab_1 conda-forge
jpeg 9e hac89ed1_2 conda-forge
jsonschema 4.17.0 pyhd8ed1ab_0 conda-forge
jupyter_client 7.4.6 pyhd8ed1ab_0 conda-forge
jupyter_core 5.0.0 py39h6e9494a_0 conda-forge
jupyter_server 1.23.2 pyhd8ed1ab_0 conda-forge
jupyterlab_pygments 0.2.2 pyhd8ed1ab_0 conda-forge
jupyterlab_widgets 3.0.3 pyhd8ed1ab_0 conda-forge
khronos-opencl-icd-loader 2022.09.30 hb7f2c08_2 conda-forge
kiwisolver 1.4.4 py39h92daf61_1 conda-forge
kombu 5.2.4 py39h6e9494a_2 conda-forge
krb5 1.19.3 hb49756b_0 conda-forge
lcms2 2.14 h90f4b2a_0 conda-forge
lerc 4.0.0 hb486fe8_0 conda-forge
libabseil 20220623.0 cxx17_h844d122_5 conda-forge
libarchive 3.5.2 hde4784d_3 conda-forge
libblas 3.9.0 16_osx64_openblas conda-forge
libbrotlicommon 1.0.9 hb7f2c08_8 conda-forge
libbrotlidec 1.0.9 hb7f2c08_8 conda-forge
libbrotlienc 1.0.9 hb7f2c08_8 conda-forge
libcblas 3.9.0 16_osx64_openblas conda-forge
libcrc32c 1.1.2 he49afe7_0 conda-forge
libcurl 7.86.0 h57eb407_1 conda-forge
libcxx 14.0.6 hccf4f1f_0 conda-forge
libdeflate 1.14 hb7f2c08_0 conda-forge
libedit 3.1.20191231 h0678c8f_2 conda-forge
libev 4.33 haf1e3a3_1 conda-forge
libevent 2.1.10 h815e4d9_4 conda-forge
libffi 3.4.2 h0d85af4_5 conda-forge
libgfortran 5.0.0 9_5_0_h97931a8_26 conda-forge
libgfortran5 11.3.0 h082f757_26 conda-forge
libglib 2.74.1 h4c723e1_1 conda-forge
libgoogle-cloud 2.3.0 hb6a50ef_1 conda-forge
libgrpc 1.49.1 h834a566_1 conda-forge
libiconv 1.17 hac89ed1_0 conda-forge
liblapack 3.9.0 16_osx64_openblas conda-forge
libmamba 1.0.0 h2bf831e_2 conda-forge
libmambapy 1.0.0 py39he069e75_2 conda-forge
libnetcdf 4.8.1 nompi_hb4d10b0_100 conda-forge
libnghttp2 1.47.0 h7cbc4dc_1 conda-forge
libopenblas 0.3.21 openmp_h429af6e_3 conda-forge
libpng 1.6.38 ha978bb4_0 conda-forge
libprotobuf 3.21.9 hbc0c0cd_0 conda-forge
libsodium 1.0.18 hbcb3906_1 conda-forge
libsolv 0.7.22 hd9580d2_0 conda-forge
libsqlite 3.39.4 ha978bb4_0 conda-forge
libssh2 1.10.0 h7535e13_3 conda-forge
libthrift 0.16.0 h08c06f4_2 conda-forge
libtiff 4.4.0 hdb44e8a_4 conda-forge
libutf8proc 2.8.0 hb7f2c08_0 conda-forge
libwebp-base 1.2.4 h775f41a_0 conda-forge
libxcb 1.13 h0d85af4_1004 conda-forge
libxml2 2.10.3 hb9e07b5_0 conda-forge
libxslt 1.1.37 h5d22bc9_0 conda-forge
libzip 1.9.2 h3ad4413_1 conda-forge
libzlib 1.2.13 hfd90126_4 conda-forge
llvm-openmp 15.0.4 h61d9ccf_0 conda-forge
lxml 4.9.1 py39hfbce9ca_1 conda-forge
lz4-c 1.9.3 he49afe7_1 conda-forge
lzo 2.10 haf1e3a3_1000 conda-forge
mamba 1.0.0 py39ha435c47_2 conda-forge
markupsafe 2.1.1 py39ha30fb19_2 conda-forge
matplotlib-base 3.6.2 py39hb2f573b_0 conda-forge
matplotlib-inline 0.1.6 pyhd8ed1ab_0 conda-forge
mctc-lib 0.3.1 h8dc6bf7_0 conda-forge
mdtraj 1.9.7 py39h5456c6e_4 conda-forge
mistune 2.0.4 pyhd8ed1ab_0 conda-forge
mmpbsa-py 16.0 pypi_0 pypi
mock 4.0.3 pyhd8ed1ab_4 conda-forge
msgpack-python 1.0.4 py39h92daf61_1 conda-forge
munkres 1.1.4 pyh9f0ad1d_0 conda-forge
nbclassic 0.4.8 pyhd8ed1ab_0 conda-forge
nbclient 0.7.0 pyhd8ed1ab_0 conda-forge
nbconvert 7.2.5 pyhd8ed1ab_0 conda-forge
nbconvert-core 7.2.5 pyhd8ed1ab_0 conda-forge
nbconvert-pandoc 7.2.5 pyhd8ed1ab_0 conda-forge
nbformat 5.7.0 pyhd8ed1ab_0 conda-forge
ncurses 6.3 h96cf925_1 conda-forge
nest-asyncio 1.5.6 pyhd8ed1ab_0 conda-forge
netcdf-fortran 4.5.3 nompi_hdf192a6_105 conda-forge
networkx 2.8.8 pyhd8ed1ab_0 conda-forge
nglview 3.0.3 pyh8a188c0_0 conda-forge
notebook 6.5.2 pyha770c72_1 conda-forge
notebook-shim 0.2.2 pyhd8ed1ab_0 conda-forge
numexpr 2.8.3 py39hecff1ad_1 conda-forge
numpy 1.23.4 py39hdfa1d0c_1 conda-forge
ocl_icd_wrapper_apple 1.0.0 hbcb3906_0 conda-forge
openff-bespokefit 0.1.2 pyhd8ed1ab_0 conda-forge
openff-forcefields 2.0.0 pyh6c4a22f_0 conda-forge
openff-fragmenter-base 0.1.2 pyhd8ed1ab_1 conda-forge
openff-qcsubmit 0.3.2 pyhd8ed1ab_0 conda-forge
openff-toolkit-base 0.10.6 pyhd8ed1ab_0 conda-forge
openff-utilities 0.1.7 pyh1a96a4e_0 conda-forge
openjpeg 2.5.0 h5d0d7b0_1 conda-forge
openmm 7.7.0 py39h8d72adf_0_khronos conda-forge
openssl 1.1.1s hfd90126_0 conda-forge
orc 1.8.0 ha9d861c_0 conda-forge
packaging 21.3 pyhd8ed1ab_0 conda-forge
packmol 20.010 h508aa58_0 conda-forge
packmol-memgen 1.2.1rc0 pypi_0 pypi
pandas 1.5.1 py39hecff1ad_1 conda-forge
pandoc 2.19.2 h694c41f_1 conda-forge
pandocfilters 1.5.0 pyhd8ed1ab_0 conda-forge
parmed 3.4.3 py39h7a8716b_3 conda-forge
parquet-cpp 1.5.1 2 conda-forge
parso 0.8.3 pyhd8ed1ab_0 conda-forge
pcre2 10.40 h1c4e4bc_0 conda-forge
pdb4amber 20.1 pypi_0 pypi
perl 5.32.1 2_h0d85af4_perl5 conda-forge
pexpect 4.8.0 pyh1a96a4e_2 conda-forge
pickleshare 0.7.5 py_1003 conda-forge
pillow 9.2.0 py39h35d4919_3 conda-forge
pint 0.20.1 pyhd8ed1ab_0 conda-forge
pip 22.3.1 pyhd8ed1ab_0 conda-forge
pixman 0.40.0 hbcb3906_0 conda-forge
pkgutil-resolve-name 1.3.10 pyhd8ed1ab_0 conda-forge
platformdirs 2.5.2 pyhd8ed1ab_1 conda-forge
plotly 5.11.0 pyhd8ed1ab_0 conda-forge
prometheus_client 0.15.0 pyhd8ed1ab_0 conda-forge
prompt-toolkit 3.0.32 pyha770c72_0 conda-forge
prompt_toolkit 3.0.32 hd8ed1ab_0 conda-forge
psutil 5.9.4 py39ha30fb19_0 conda-forge
pthread-stubs 0.4 hc929b4f_1001 conda-forge
ptyprocess 0.7.0 pyhd3deb0d_0 conda-forge
pure_eval 0.2.2 pyhd8ed1ab_0 conda-forge
py-cpuinfo 9.0.0 pyhd8ed1ab_0 conda-forge
pyarrow 9.0.0 py39hb941c77_10_cpu conda-forge
pybind11-abi 4 hd8ed1ab_3 conda-forge
pycairo 1.21.0 py39h41776c8_2 conda-forge
pycosat 0.6.4 py39ha30fb19_1 conda-forge
pycparser 2.21 pyhd8ed1ab_0 conda-forge
pydantic 1.10.2 py39ha30fb19_1 conda-forge
pygments 2.13.0 pyhd8ed1ab_0 conda-forge
pymbar 3.1.0 py39h7cc1f47_1 conda-forge
pyopenssl 22.1.0 pyhd8ed1ab_0 conda-forge
pyparsing 3.0.9 pyhd8ed1ab_0 conda-forge
pyrsistent 0.19.2 py39ha30fb19_0 conda-forge
pysocks 1.7.1 pyha2e5f31_6 conda-forge
pytables 3.6.1 py39hd07922a_3 conda-forge
python 3.9.13 h57e37ff_0_cpython conda-forge
python-constraint 1.4.0 py_0 conda-forge
python-dateutil 2.8.2 pyhd8ed1ab_0 conda-forge
python-fastjsonschema 2.16.2 pyhd8ed1ab_0 conda-forge
python_abi 3.9 2_cp39 conda-forge
pytraj 2.0.6 pypi_0 pypi
pytz 2022.6 pyhd8ed1ab_0 conda-forge
pyyaml 6.0 py39ha30fb19_5 conda-forge
pyzmq 24.0.1 py39hed8f129_1 conda-forge
qcelemental 0.25.1 pyhd8ed1ab_1 conda-forge
qcengine 0.25.0 pyhd8ed1ab_0 conda-forge
qcportal 0.15.8 pyhd8ed1ab_0 conda-forge
rdkit 2022.09.1 py39h67dd817_0 conda-forge
re2 2022.06.01 hb486fe8_0 conda-forge
readline 8.1.2 h3899abd_0 conda-forge
redis-py 4.3.4 pyhd8ed1ab_0 conda-forge
redis-server 7.0.0 h25ffcba_0 conda-forge
regex 2022.10.31 py39ha30fb19_0 conda-forge
reportlab 3.5.68 py39hf37cc50_1 conda-forge
reproc 14.2.3 h0d85af4_0 conda-forge
reproc-cpp 14.2.3 he49afe7_0 conda-forge
requests 2.28.1 pyhd8ed1ab_1 conda-forge
rfc3986 1.5.0 pyhd8ed1ab_0 conda-forge
rich 12.6.0 pyhd8ed1ab_0 conda-forge
ruamel_yaml 0.15.80 py39ha30fb19_1008 conda-forge
sander 16.0 pypi_0 pypi
scipy 1.9.3 py39h8a15683_2 conda-forge
send2trash 1.8.0 pyhd8ed1ab_0 conda-forge
setuptools 65.5.1 pyhd8ed1ab_0 conda-forge
six 1.16.0 pyh6c4a22f_0 conda-forge
smirnoff99frosst 1.1.0 pyh44b312d_0 conda-forge
snappy 1.1.9 h225ccf5_2 conda-forge
sniffio 1.3.0 pyhd8ed1ab_0 conda-forge
soupsieve 2.3.2.post1 pyhd8ed1ab_0 conda-forge
sqlalchemy 1.4.44 py39ha30fb19_0 conda-forge
sqlite 3.39.4 h9ae0607_0 conda-forge
stack_data 0.6.1 pyhd8ed1ab_0 conda-forge
starlette 0.21.0 pyhd8ed1ab_0 conda-forge
tenacity 8.1.0 pyhd8ed1ab_0 conda-forge
terminado 0.17.0 pyhd1c38e8_0 conda-forge
tinycss2 1.2.1 pyhd8ed1ab_0 conda-forge
tk 8.6.12 h5dbffcc_0 conda-forge
toolz 0.12.0 pyhd8ed1ab_0 conda-forge
tornado 6.2 py39ha30fb19_1 conda-forge
torsiondrive 1.1.0 pyhd8ed1ab_0 conda-forge
tqdm 4.64.1 pyhd8ed1ab_0 conda-forge
traitlets 5.5.0 pyhd8ed1ab_0 conda-forge
typing-extensions 4.4.0 hd8ed1ab_0 conda-forge
typing_extensions 4.4.0 pyha770c72_0 conda-forge
tzdata 2022f h191b570_0 conda-forge
unicodedata2 15.0.0 py39ha30fb19_0 conda-forge
urllib3 1.26.11 pyhd8ed1ab_0 conda-forge
uvicorn 0.19.0 py39h6e9494a_0 conda-forge
vine 5.0.0 pyhd8ed1ab_1 conda-forge
wcwidth 0.2.5 pyh9f0ad1d_2 conda-forge
webencodings 0.5.1 py_1 conda-forge
websocket-client 1.4.2 pyhd8ed1ab_0 conda-forge
wheel 0.38.4 pyhd8ed1ab_0 conda-forge
widgetsnbextension 4.0.3 pyhd8ed1ab_0 conda-forge
wrapt 1.14.1 py39ha30fb19_1 conda-forge
xmltodict 0.13.0 pyhd8ed1ab_0 conda-forge
xorg-kbproto 1.0.7 h35c211d_1002 conda-forge
xorg-libice 1.0.10 h0d85af4_0 conda-forge
xorg-libsm 1.2.3 h0d85af4_1000 conda-forge
xorg-libx11 1.7.2 h0d85af4_0 conda-forge
xorg-libxau 1.0.9 h35c211d_0 conda-forge
xorg-libxdmcp 1.1.3 h35c211d_0 conda-forge
xorg-libxext 1.3.4 h0d85af4_1 conda-forge
xorg-libxt 1.2.1 h0d85af4_2 conda-forge
xorg-xextproto 7.3.0 h35c211d_1002 conda-forge
xorg-xproto 7.0.31 h35c211d_1007 conda-forge
xtb 6.5.1 h877ed2f_0 conda-forge
xtb-python 20.2 py39h701faf5_5 conda-forge
xz 5.2.6 h775f41a_0 conda-forge
yaml 0.2.5 h0d85af4_2 conda-forge
yaml-cpp 0.7.0 hf0c8a7f_2 conda-forge
zeromq 4.3.4 he49afe7_1 conda-forge
zipp 3.10.0 pyhd8ed1ab_0 conda-forge
zlib 1.2.13 hfd90126_4 conda-forge
zstd 1.5.2 hfa58983_4 conda-forge
```

j-wags commented 1 year ago

Here are my inputs; for me this finishes in about 500 seconds using ForceBalance 1.9.2 and 1.9.3.

2022_11_18_jw_repro.tar.gz

rvkrishnan30 commented 1 year ago

> @rvkrishnan30 I'm getting to this today, but I'm unable to reproduce the slow runtime on this molecule - it just ran using an openeye-less env in a little over 1 hour (20 minutes fragmentation, 20 minutes QC generation, 20 minutes parameter optimization) on my macbook air.
>
> Could you upload the complete forcebalance inputs (sdf, pdb, scan.xyz, force-field.offxml) so I can see if that reproduces?
>
> My whole environment is:
>
> Full env

@j-wags Thank you for testing this and sharing your inputs. This looks like a more reasonable timescale for ForceBalance runs. With your inputs, for me it took around 700 seconds and I am happy with it.

However, I wonder why it is taking 20 hrs in my actual case. Some differences I see between our inputs are:

penalty L1 vs. L2
convergence objective 0.01 vs. 0.1
qdata with vs. without GRADIENTS in it
More importantly, the fragment molecule being processed: my fragment retains all the terminal -OH, -CH3 and -OCH3 groups while yours doesn't. I wonder if that's affecting the "referencing all energies to snapshot " step at the beginning of FB, and also the MM optimizations during the run, thereby slowing the whole process.

Attached are my inputs (sdf, pdb, scan.xyz, qdata.txt, and forcefield) and I hope this will help reproduce my case.
S1A_venkat_Nov2022.tar.gz
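Differences like the ones listed above can be found mechanically by parsing the `$options` block of each input file and comparing the key/value pairs. The sketch below is a simplified reader written for illustration, not ForceBalance's own parser: it assumes one `key value` pair per line, `#` as a comment marker, and a `priors ... /priors` sub-block. The function names and the two inline snippets are hypothetical and are not the attached files.

```python
def parse_options(text):
    """Return the key/value pairs of the $options block as a dict of strings."""
    options = {}
    in_options = False
    in_priors = False
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        if line == "$options":
            in_options = True
        elif line == "$end":
            in_options = False
        elif in_options and line == "priors":
            in_priors = True
        elif in_options and line == "/priors":
            in_priors = False
        elif in_options and in_priors:
            name, _, value = line.partition(":")
            options["prior " + name.strip()] = value.strip()
        elif in_options:
            key, _, value = line.partition(" ")
            options[key.lower()] = value.strip()
    return options


def diff_options(mine, theirs):
    """Map each differing key to (value in mine, value in theirs); None if absent."""
    keys = sorted(set(mine) | set(theirs))
    return {k: (mine.get(k), theirs.get(k)) for k in keys if mine.get(k) != theirs.get(k)}


# Illustrative snippets only (assumed values, not the real attachments).
MINE = "$options\npenalty_type L1\nconvergence_objective 0.01\nmaxstep 10\n$end\n"
THEIRS = "$options\npenalty_type L2\nconvergence_objective 0.1\nmaxstep 10\n$end\n"

if __name__ == "__main__":
    for key, (a, b) in diff_options(parse_options(MINE), parse_options(THEIRS)).items():
        print(f"{key}: {a} vs. {b}")
```

Values are compared as strings, so `1.0` and `1.00` would be reported as different; converting numeric values before comparing would avoid that.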

j-wags commented 1 year ago

Ahh, yeah. It looks like the structure at targets/3gid_D_S1A/input.sdf in your inputs has 2 hydroxyls, 3 methyls, and 4 methoxys that mine doesn't have. Especially with the hydrogens on the methyl/methoxy groups, that would add a lot more particles to optimize in the minimizations during the torsion fitting.

So fundamentally, we will eventually find macrocycles large enough to ruin things as long as we keep requiring that the entire ring gets included in the fragment we use for fitting. But the macrocycle here seems to be manageable, it's just that the substituents are somehow hanging on in your input, even after fragmentation.

Could you share your bespokefit run command, the output of running conda list, and input SDF/SMILES for the entire run? It looks like we're getting different results from the fragmentation step, so I can start debugging from there.

rvkrishnan30 commented 1 year ago

> Could you share your bespokefit run command, the output of running conda list, and input SDF/SMILES for the entire run? It looks like we're getting different results from the fragmentation step, so I can start debugging from there.

I have a slightly different implementation of bespokefit that does not use qcsubmit, as all data are generated locally. But the workflow and settings should predominantly be the same. Having said that, I am not sure why we end up with different fragments. With AmberTools WBO-based fragmentation, I end up with a fragment slightly smaller than the parent that still has the -OH and -CH3 groups. (Attached are the original and fragment sdf files.)

I wonder why it picks up these groups, though, since they should not affect the WBO at the torsion we are talking about.

orig_frag_sdfs.tar.gz
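A quick way to compare the size of two fragments is to count total and heavy atoms directly from the SDF text. The sketch below is a minimal reader that assumes V2000 records (atom count in columns 1-3 of the counts line, element symbol in columns 32-34 of each atom line). `count_atoms` and `toy_record` are hypothetical helpers, and the toy methanol record uses dummy coordinates and no bond block.

```python
def count_atoms(sdf_text):
    """Return (total_atoms, heavy_atoms) for the first V2000 record in sdf_text."""
    lines = sdf_text.splitlines()
    counts_line = lines[3]                 # line 4 of a MOL block is the counts line
    if "V2000" not in counts_line:
        raise ValueError("only V2000 records are handled in this sketch")
    n_atoms = int(counts_line[0:3])        # atom count sits in columns 1-3
    # Atom block: the element symbol occupies columns 32-34 of each atom line.
    symbols = [line[31:34].strip() for line in lines[4:4 + n_atoms]]
    heavy = sum(1 for symbol in symbols if symbol != "H")
    return n_atoms, heavy


def toy_record(symbols):
    """Build a toy V2000 record (dummy coordinates, no bonds) for the demo."""
    header = ["toy", "  sketch", ""]
    counts = f"{len(symbols):3d}  0  0  0  0  0  0  0  0  0999 V2000"
    atoms = [
        f"{0.0:10.4f}{0.0:10.4f}{0.0:10.4f} {s:<3} 0  0  0  0  0  0  0  0  0  0  0  0"
        for s in symbols
    ]
    return "\n".join(header + [counts] + atoms + ["M  END", "$$$$"])


if __name__ == "__main__":
    methanol = toy_record(["C", "O", "H", "H", "H", "H"])
    total, heavy = count_atoms(methanol)
    print(f"total atoms: {total}, heavy atoms: {heavy}")
```

Running the same function on the text of the original and fragment SDF files gives a direct measure of how much the fragmentation step actually trimmed.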

mark-mackey-cresset commented 1 year ago

> Ahh, yeah. It looks like the structure at targets/3gid_D_S1A/input.sdf in your inputs has 2 hydroxyls, 3 methyls, and 4 methoxys that mine doesn't have. Especially with the hydrogens on the methyl/methoxy groups, that would add a lot more particles to optimize in the minimizations during the torsion fitting.

I'm not convinced that just having an extra 13 heavy atoms in the system should be enough to make the runtime 60x longer: your optimisation stage took 20 mins, ours takes ~20 hours.
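A rough back-of-the-envelope check of this argument: if the cost of the fit grew as a power p of the atom count N, a 60x slowdown would need the atom count to grow by a factor of 60^(1/p). The sketch below tallies the atoms in the extra groups and tabulates that factor for a few exponents. The power-law cost model and the exponents are assumptions for illustration, not measured ForceBalance scaling, and the function names are hypothetical.

```python
# (number of groups, heavy atoms per group, hydrogens per group), from the thread:
# 2 hydroxyls, 3 methyls, 4 methoxys.
EXTRA_GROUPS = {
    "hydroxyl": (2, 1, 1),
    "methyl": (3, 1, 3),
    "methoxy": (4, 2, 3),
}


def extra_atoms(groups):
    """Return (heavy atoms, hydrogens) contained in the listed groups."""
    heavy = sum(n * h for n, h, _ in groups.values())
    hydrogens = sum(n * hyd for n, _, hyd in groups.values())
    return heavy, hydrogens


def required_size_ratio(slowdown, exponent):
    """Size ratio N2/N1 needed for a given slowdown if cost ~ N**exponent."""
    return slowdown ** (1.0 / exponent)


if __name__ == "__main__":
    heavy, hydrogens = extra_atoms(EXTRA_GROUPS)
    print(f"extra heavy atoms: {heavy}, extra hydrogens: {hydrogens}")
    slowdown = (20 * 60) / 20  # ~20 hours vs. ~20 minutes, both in minutes
    for p in (1, 2, 3):
        print(f"cost ~ N^{p}: atom count must grow {required_size_ratio(slowdown, p):.2f}x")
```

Even under cubic scaling the system would have to be roughly four times larger to explain the gap, which supports the suspicion that something other than the extra substituents is responsible.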

openforcefield / openff-bespokefit

Optimizations failing on large macromolecule #200