openforcefield / openff-bespokefit

Automated tools for the generation of bespoke SMIRNOFF format parameters for individual molecules.
https://docs.openforcefield.org/bespokefit
MIT License

Reproducing workflow from BespokeFit paper #356

Open aqemia-jasmin-guven opened 1 month ago

aqemia-jasmin-guven commented 1 month ago

Description

Hi!

First of all, I wanted to say thanks to @j-wags for having a chat with us about OpenFF tools! We had a very productive conversation and he encouraged me to raise an issue here about some of the questions I had about BespokeFit.

I'm trying to recreate the results from the BespokeFit paper to help me understand the tool before using it in new projects. My main point of confusion is how to run the workflow for a congeneric series of ligands, such as the TYK-2 set.

From the paper, I understood the workflow as follows:

  1. Run the openff-fragmenter on the whole series and save to JSON: Is this how the bespokefit_fragment_inputs.json file from the SI of the BespokeFit paper was generated? The main question I have here is, how do we generate the target torsion SMARTS strings with just the atoms of the central bond of the torsion labelled, instead of all of them?
  2. Run openff-bespokefit on just one ligand, e.g. EJM31.
  3. Remove duplicates from fragment list
  4. Turn off fragmentation in bespokefit
  5. Run bespokefit on custom fragments
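On the question in step 1 about labelling only the central bond atoms: one possible approach is to post-process a fully mapped pattern as a plain string. The sketch below is an assumption for illustration, not the method used to build the paper's SI file. The helper name `keep_central_bond_labels` and the example pattern are hypothetical, and the code relies only on the SMILES/SMARTS convention that an atom-map label is written as `:<number>` just before the closing bracket.

```python
import re

# Matches an atom-map label inside a bracket atom, e.g. the ":3]" in "[C:3]".
_MAP_LABEL = re.compile(r":(\d+)\]")


def keep_central_bond_labels(mapped_pattern: str, central: tuple[int, int]) -> str:
    """Drop every atom-map label except the two on the central bond.

    The two kept labels are renumbered to 1 and 2, in the order given
    by `central`. This helper is hypothetical and works on text only;
    it does not validate the chemistry.
    """
    renumber = {central[0]: 1, central[1]: 2}

    def _rewrite(match: re.Match) -> str:
        index = int(match.group(1))
        if index in renumber:
            return f":{renumber[index]}]"
        return "]"  # strip the label but keep the closing bracket

    return _MAP_LABEL.sub(_rewrite, mapped_pattern)


# Hypothetical fully mapped torsion; atoms 2 and 3 form the central bond.
full_pattern = "[C:1][C:2][C:3][C:4]"
print(keep_central_bond_labels(full_pattern, central=(2, 3)))
# prints: [C][C:1][C:2][C]
```

A cheminformatics toolkit would be a safer choice for real inputs, since it can check that the edited pattern still parses.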

Is the cache updated here at some point along the workflow as well?

Essentially, I am struggling to understand the workflow given in the python scripts from the paper Zenodo.

Additionally, Jeff mentioned that bespokefit should internally deduplicate the fragments, however I don't think I'm seeing this behaviour. For this, I launched the executor once and then submitted a single SDF containing all the ligands.

Thanks a lot in advance for your help!

Context

Software versions

Installed with environment.yml:

name: bsfit
channels:
  - conda-forge
  - defaults
dependencies:
  - python=3.11
  - ambertools=23
  - openff-bespokefit
  - openff-fragmenter
  - openff-forcefields
  - openff-qcsubmit
  - openff-toolkit
  - openmmforcefields
  - psi4=1.9
  - qcportal=0.15
  - awswrangler

Output of conda list

# packages in environment at /home/ubuntu/miniconda3/envs/bsfit:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                  2_kmp_llvm    conda-forge
ambertools                23.6            cuda_None_nompi_py311h4a53416_105    conda-forge
amberutils                21.0                     pypi_0    pypi
amqp                      5.2.0              pyhd8ed1ab_1    conda-forge
anyio                     4.4.0              pyhd8ed1ab_0    conda-forge
argcomplete               3.4.0              pyhd8ed1ab_0    conda-forge
argon2-cffi               23.1.0             pyhd8ed1ab_0    conda-forge
argon2-cffi-bindings      21.2.0          py311h459d7ec_4    conda-forge
arpack                    3.9.1           nompi_h77f6705_101    conda-forge
arrow                     1.3.0              pyhd8ed1ab_0    conda-forge
asttokens                 2.4.1              pyhd8ed1ab_0    conda-forge
astunparse                1.6.3              pyhd8ed1ab_0    conda-forge
async-lru                 2.0.4              pyhd8ed1ab_0    conda-forge
async-timeout             4.0.3              pyhd8ed1ab_0    conda-forge
attrs                     23.2.0             pyh71513ae_0    conda-forge
aws-c-auth                0.7.22              hbd3ac97_10    conda-forge
aws-c-cal                 0.7.1                h87b94db_1    conda-forge
aws-c-common              0.9.23               h4ab18f5_0    conda-forge
aws-c-compression         0.2.18               he027950_7    conda-forge
aws-c-event-stream        0.4.2               h7671281_15    conda-forge
aws-c-http                0.8.2                he17ee6b_6    conda-forge
aws-c-io                  0.14.10              h826b7d6_1    conda-forge
aws-c-mqtt                0.10.4               hcd6a914_8    conda-forge
aws-c-s3                  0.6.0                h365ddd8_2    conda-forge
aws-c-sdkutils            0.1.16               he027950_3    conda-forge
aws-checksums             0.1.18               he027950_7    conda-forge
aws-crt-cpp               0.27.3               hda66527_2    conda-forge
aws-sdk-cpp               1.11.329             h46c3b66_9    conda-forge
awswrangler               3.9.0              pyhd8ed1ab_0    conda-forge
azure-core-cpp            1.13.0               h935415a_0    conda-forge
azure-identity-cpp        1.8.0                hd126650_2    conda-forge
azure-storage-blobs-cpp   12.11.0              hd2e3451_2    conda-forge
azure-storage-common-cpp  12.7.0               h10ac4d7_1    conda-forge
azure-storage-files-datalake-cpp 12.10.0              haa04155_2    conda-forge
babel                     2.14.0             pyhd8ed1ab_0    conda-forge
backports.zoneinfo        0.2.1           py311h38be061_8    conda-forge
basis_set_exchange        0.10               pyhd8ed1ab_1    conda-forge
beautifulsoup4            4.12.3             pyha770c72_0    conda-forge
billiard                  4.2.0           py311h459d7ec_0    conda-forge
bleach                    6.1.0              pyhd8ed1ab_0    conda-forge
blosc                     1.21.6               hef167b5_0    conda-forge
boto3                     1.34.148           pyhd8ed1ab_0    conda-forge
botocore                  1.34.148        pyge310_1234567_0    conda-forge
brotli                    1.1.0                hd590300_1    conda-forge
brotli-bin                1.1.0                hd590300_1    conda-forge
brotli-python             1.1.0           py311hb755f60_1    conda-forge
bson                      0.5.9                      py_0    conda-forge
bzip2                     1.0.8                h4bc722e_7    conda-forge
c-ares                    1.32.3               h4bc722e_0    conda-forge
c-blosc2                  2.15.0               h6d6b9e4_1    conda-forge
ca-certificates           2024.7.4             hbcca054_0    conda-forge
cached-property           1.5.2                hd8ed1ab_1    conda-forge
cached_property           1.5.2              pyha770c72_1    conda-forge
cachetools                5.4.0              pyhd8ed1ab_0    conda-forge
cairo                     1.18.0               hebfffa5_3    conda-forge
celery                    5.3.6              pyhd8ed1ab_0    conda-forge
certifi                   2024.7.4           pyhd8ed1ab_0    conda-forge
cffi                      1.16.0          py311hb3a22ac_0    conda-forge
chardet                   5.2.0           py311h38be061_1    conda-forge
charset-normalizer        3.3.2              pyhd8ed1ab_0    conda-forge
chemper                   1.0.1              pyhd8ed1ab_0    conda-forge
click                     8.1.7           unix_pyh707e725_0    conda-forge
click-didyoumean          0.3.1              pyhd8ed1ab_0    conda-forge
click-option-group        0.5.6              pyhd8ed1ab_0    conda-forge
click-plugins             1.1.1                      py_0    conda-forge
click-repl                0.3.0              pyhd8ed1ab_0    conda-forge
colorama                  0.4.6              pyhd8ed1ab_0    conda-forge
comm                      0.2.2              pyhd8ed1ab_0    conda-forge
contourpy                 1.2.1           py311h9547e67_0    conda-forge
cudatoolkit               11.8.0              h4ba93d1_13    conda-forge
cycler                    0.12.1             pyhd8ed1ab_0    conda-forge
debugpy                   1.8.2           py311h4332511_0    conda-forge
decorator                 5.1.1              pyhd8ed1ab_0    conda-forge
defusedxml                0.7.1              pyhd8ed1ab_0    conda-forge
dkh                       1.2                  hd59d2e7_0    conda-forge
edgembar                  0.2                      pypi_0    pypi
entrypoints               0.4                pyhd8ed1ab_0    conda-forge
exceptiongroup            1.2.2              pyhd8ed1ab_0    conda-forge
executing                 2.0.1              pyhd8ed1ab_0    conda-forge
expat                     2.6.2                h59595ed_0    conda-forge
fastapi                   0.86.0             pyhd8ed1ab_0    conda-forge
fftw                      3.3.10          nompi_hf1063bd_110    conda-forge
font-ttf-dejavu-sans-mono 2.37                 hab24e00_0    conda-forge
font-ttf-inconsolata      3.000                h77eed37_0    conda-forge
font-ttf-source-code-pro  2.038                h77eed37_0    conda-forge
font-ttf-ubuntu           0.83                 h77eed37_2    conda-forge
fontconfig                2.14.2               h14ed4e7_0    conda-forge
fonts-conda-ecosystem     1                             0    conda-forge
fonts-conda-forge         1                             0    conda-forge
fonttools                 4.53.1          py311h61187de_0    conda-forge
forcebalance              1.9.6           py311h2b7392c_2    conda-forge
fqdn                      1.5.1              pyhd8ed1ab_0    conda-forge
freetype                  2.12.1               h267a509_2    conda-forge
freetype-py               2.3.0              pyhd8ed1ab_0    conda-forge
future                    1.0.0              pyhd8ed1ab_0    conda-forge
gau2grid                  2.0.7                h4ab18f5_3    conda-forge
geometric                 1.0.2              pyhd8ed1ab_0    conda-forge
gflags                    2.2.2             he1b5a44_1004    conda-forge
glog                      0.7.1                hbabe93e_0    conda-forge
greenlet                  3.0.3           py311hb755f60_0    conda-forge
gtest                     1.14.0               h434a139_2    conda-forge
h11                       0.14.0             pyhd8ed1ab_0    conda-forge
h2                        4.1.0              pyhd8ed1ab_0    conda-forge
h5py                      3.11.0          nompi_py311h439e445_102    conda-forge
hdf4                      4.2.15               h2a13503_7    conda-forge
hdf5                      1.14.3          nompi_hdf9ad27_105    conda-forge
hpack                     4.0.0              pyh9f0ad1d_0    conda-forge
httpcore                  1.0.5              pyhd8ed1ab_0    conda-forge
httpx                     0.27.0             pyhd8ed1ab_0    conda-forge
hyperframe                6.0.1              pyhd8ed1ab_0    conda-forge
icu                       75.1                 he02047a_0    conda-forge
idna                      3.7                pyhd8ed1ab_0    conda-forge
importlib-metadata        8.2.0              pyha770c72_0    conda-forge
importlib_metadata        8.2.0                hd8ed1ab_0    conda-forge
importlib_resources       6.4.0              pyhd8ed1ab_0    conda-forge
ipykernel                 6.29.5             pyh3099207_0    conda-forge
ipython                   8.26.0             pyh707e725_0    conda-forge
ipywidgets                8.1.3              pyhd8ed1ab_0    conda-forge
isoduration               20.11.0            pyhd8ed1ab_0    conda-forge
jedi                      0.19.1             pyhd8ed1ab_0    conda-forge
jinja2                    3.1.4              pyhd8ed1ab_0    conda-forge
jmespath                  1.0.1              pyhd8ed1ab_0    conda-forge
joblib                    1.4.2              pyhd8ed1ab_0    conda-forge
json5                     0.9.25             pyhd8ed1ab_0    conda-forge
jsonpointer               3.0.0           py311h38be061_0    conda-forge
jsonschema                4.23.0             pyhd8ed1ab_0    conda-forge
jsonschema-specifications 2023.12.1          pyhd8ed1ab_0    conda-forge
jsonschema-with-format-nongpl 4.23.0               hd8ed1ab_0    conda-forge
jupyter-lsp               2.2.5              pyhd8ed1ab_0    conda-forge
jupyter_client            8.6.2              pyhd8ed1ab_0    conda-forge
jupyter_core              5.7.2           py311h38be061_0    conda-forge
jupyter_events            0.10.0             pyhd8ed1ab_0    conda-forge
jupyter_server            2.14.2             pyhd8ed1ab_0    conda-forge
jupyter_server_terminals  0.5.3              pyhd8ed1ab_0    conda-forge
jupyterlab                4.2.4              pyhd8ed1ab_0    conda-forge
jupyterlab_pygments       0.3.0              pyhd8ed1ab_1    conda-forge
jupyterlab_server         2.27.3             pyhd8ed1ab_0    conda-forge
jupyterlab_widgets        3.0.11             pyhd8ed1ab_0    conda-forge
keyutils                  1.6.1                h166bdaf_0    conda-forge
kiwisolver                1.4.5           py311h9547e67_1    conda-forge
kombu                     5.3.7           py311h38be061_0    conda-forge
krb5                      1.21.3               h659f571_0    conda-forge
lcms2                     2.16                 hb7c19ff_0    conda-forge
ld_impl_linux-64          2.40                 hf3520f5_7    conda-forge
lerc                      4.0.0                h27087fc_0    conda-forge
libabseil                 20240116.2      cxx17_he02047a_1    conda-forge
libaec                    1.1.3                h59595ed_0    conda-forge
libarrow                  17.0.0           h0a637a3_1_cpu    conda-forge
libarrow-acero            17.0.0           he02047a_1_cpu    conda-forge
libarrow-dataset          17.0.0           he02047a_1_cpu    conda-forge
libarrow-substrait        17.0.0           hc9a23c6_1_cpu    conda-forge
libblas                   3.9.0            20_linux64_mkl    conda-forge
libboost                  1.84.0               h0ccab89_4    conda-forge
libboost-python           1.84.0          py311h06317a3_4    conda-forge
libbrotlicommon           1.1.0                hd590300_1    conda-forge
libbrotlidec              1.1.0                hd590300_1    conda-forge
libbrotlienc              1.1.0                hd590300_1    conda-forge
libcblas                  3.9.0            20_linux64_mkl    conda-forge
libcrc32c                 1.1.2                h9c3ff4c_0    conda-forge
libcurl                   8.9.0                hdb1bdb2_0    conda-forge
libdeflate                1.20                 hd590300_0    conda-forge
libecpint                 1.0.7               h3ecfda7_10    conda-forge
libedit                   3.1.20191231         he28a2e2_2    conda-forge
libev                     4.33                 hd590300_2    conda-forge
libevent                  2.1.12               hf998b51_1    conda-forge
libexpat                  2.6.2                h59595ed_0    conda-forge
libffi                    3.4.2                h7f98852_5    conda-forge
libgcc-ng                 14.1.0               h77fa898_0    conda-forge
libgfortran-ng            14.1.0               h69a702a_0    conda-forge
libgfortran5              14.1.0               hc5f4f2c_0    conda-forge
libglib                   2.80.3               h8a4344b_1    conda-forge
libgomp                   14.1.0               h77fa898_0    conda-forge
libgoogle-cloud           2.26.0               h26d7fe4_0    conda-forge
libgoogle-cloud-storage   2.26.0               ha262f82_0    conda-forge
libgrpc                   1.62.2               h15f2491_0    conda-forge
libhwloc                  2.11.1          default_hecaa2ac_1000    conda-forge
libiconv                  1.17                 hd590300_2    conda-forge
libint                    2.9.0                h9bbc0ff_0    conda-forge
libjpeg-turbo             3.0.0                hd590300_1    conda-forge
liblapack                 3.9.0            20_linux64_mkl    conda-forge
libnetcdf                 4.9.2           nompi_h135f659_114    conda-forge
libnghttp2                1.58.0               h47da74e_1    conda-forge
libnsl                    2.0.1                hd590300_0    conda-forge
libparquet                17.0.0           h9e5060d_1_cpu    conda-forge
libpcm                    1.2.3                h4175798_8    conda-forge
libpng                    1.6.43               h2797004_0    conda-forge
libpq                     16.3                 ha72fbe1_0    conda-forge
libprotobuf               4.25.3               h08a7969_0    conda-forge
librdkit                  2024.03.5            h79cfef2_1    conda-forge
libre2-11                 2023.09.01           h5a48ba9_2    conda-forge
libsodium                 1.0.18               h36c2ea0_1    conda-forge
libsqlite                 3.46.0               hde9e2c9_0    conda-forge
libssh2                   1.11.0               h0841786_0    conda-forge
libstdcxx-ng              14.1.0               hc0a3c3a_0    conda-forge
libthrift                 0.19.0               hb90f79a_1    conda-forge
libtiff                   4.6.0                h1dd3fc0_3    conda-forge
libutf8proc               2.8.0                h166bdaf_0    conda-forge
libuuid                   2.38.1               h0b41bf4_0    conda-forge
libwebp-base              1.4.0                hd590300_0    conda-forge
libxc-c                   6.2.2            cpu_h1b64f48_4    conda-forge
libxcb                    1.16                 hd590300_0    conda-forge
libxcrypt                 4.4.36               hd590300_1    conda-forge
libxml2                   2.12.7               he7c6b58_4    conda-forge
libxslt                   1.1.39               h76b75d6_0    conda-forge
libzip                    1.10.1               h2629f0a_3    conda-forge
libzlib                   1.3.1                h4ab18f5_1    conda-forge
llvm-openmp               18.1.8               hf5423f3_0    conda-forge
lxml                      5.2.2           py311hc0a218f_0    conda-forge
lz4-c                     1.9.4                hcb278e6_0    conda-forge
lzo                       2.10              hd590300_1001    conda-forge
markdown-it-py            3.0.0              pyhd8ed1ab_0    conda-forge
markupsafe                2.1.5           py311h459d7ec_0    conda-forge
matplotlib-base           3.9.1           py311hffb96ce_0    conda-forge
matplotlib-inline         0.1.7              pyhd8ed1ab_0    conda-forge
mda-xdrlib                0.2.0              pyhd8ed1ab_0    conda-forge
mdtraj                    1.10.0          py311h3f233a9_0    conda-forge
mdurl                     0.1.2              pyhd8ed1ab_0    conda-forge
mistune                   3.0.2              pyhd8ed1ab_0    conda-forge
mkl                       2023.2.0         h84fe81f_50496    conda-forge
mmpbsa-py                 16.0                     pypi_0    pypi
msgpack-python            1.0.8           py311h52f7536_0    conda-forge
munkres                   1.1.4              pyh9f0ad1d_0    conda-forge
nbclient                  0.10.0             pyhd8ed1ab_0    conda-forge
nbconvert-core            7.16.4             pyhd8ed1ab_1    conda-forge
nbformat                  5.10.4             pyhd8ed1ab_0    conda-forge
ncurses                   6.5                  h59595ed_0    conda-forge
nest-asyncio              1.6.0              pyhd8ed1ab_0    conda-forge
netcdf-fortran            4.6.1           nompi_h228c76a_104    conda-forge
networkx                  3.3                pyhd8ed1ab_1    conda-forge
nglview                   3.1.2              pyhceb8b5e_1    conda-forge
notebook                  7.2.1              pyhd8ed1ab_0    conda-forge
notebook-shim             0.2.4              pyhd8ed1ab_0    conda-forge
numexpr                   2.10.0          mkl_py311haeb1ab9_0    conda-forge
numpy                     1.26.4          py311h64a7726_0    conda-forge
ocl-icd                   2.3.2                hd590300_1    conda-forge
ocl-icd-system            1.0.0                         1    conda-forge
openff-amber-ff-ports     0.0.4              pyhca7485f_0    conda-forge
openff-bespokefit         0.2.3              pyhd8ed1ab_1    conda-forge
openff-forcefields        2024.07.0          pyhff2d567_0    conda-forge
openff-fragmenter         0.2.2              pyhd8ed1ab_0    conda-forge
openff-fragmenter-base    0.2.2              pyhd8ed1ab_0    conda-forge
openff-interchange        0.3.18             pyhd8ed1ab_0    conda-forge
openff-interchange-base   0.3.18             pyhd8ed1ab_0    conda-forge
openff-models             0.1.2              pyhca7485f_0    conda-forge
openff-qcsubmit           0.5.0              pyhd8ed1ab_0    conda-forge
openff-toolkit            0.14.5             pyhd8ed1ab_1    conda-forge
openff-toolkit-base       0.14.5             pyhd8ed1ab_1    conda-forge
openff-units              0.2.2              pyhca7485f_0    conda-forge
openff-utilities          0.1.12             pyhd8ed1ab_0    conda-forge
openjpeg                  2.5.2                h488ebb8_0    conda-forge
openmm                    8.1.2           py311he040c58_2    conda-forge
openmmforcefields         0.14.1             pyhd8ed1ab_0    conda-forge
openssl                   3.3.1                h4bc722e_2    conda-forge
optking                   0.2.1              pyhd8ed1ab_0    conda-forge
orc                       2.0.1                h17fec99_1    conda-forge
overrides                 7.7.0              pyhd8ed1ab_0    conda-forge
packaging                 23.2               pyhd8ed1ab_0    conda-forge
packmol-memgen            2024.2.9                 pypi_0    pypi
pandas                    2.2.2           py311h14de704_1    conda-forge
pandocfilters             1.5.0              pyhd8ed1ab_0    conda-forge
panedr                    0.8.0              pyhd8ed1ab_0    conda-forge
parmed                    4.2.2           py311hb755f60_1    conda-forge
parso                     0.8.4              pyhd8ed1ab_0    conda-forge
pcmsolver                 1.2.3                      py_9    conda-forge
pcre2                     10.44                h0f59acf_0    conda-forge
pdb4amber                 22.0                     pypi_0    pypi
perl                      5.32.1          7_hd590300_perl5    conda-forge
pexpect                   4.9.0              pyhd8ed1ab_0    conda-forge
pickleshare               0.7.5                   py_1003    conda-forge
pillow                    10.4.0          py311h82a398c_0    conda-forge
pint                      0.23               pyhd8ed1ab_1    conda-forge
pip                       24.0               pyhd8ed1ab_0    conda-forge
pixman                    0.43.2               h59595ed_0    conda-forge
pkgutil-resolve-name      1.3.10             pyhd8ed1ab_1    conda-forge
platformdirs              4.2.2              pyhd8ed1ab_0    conda-forge
plotly                    5.23.0             pyhd8ed1ab_0    conda-forge
prometheus_client         0.20.0             pyhd8ed1ab_0    conda-forge
prompt-toolkit            3.0.47             pyha770c72_0    conda-forge
prompt_toolkit            3.0.47               hd8ed1ab_0    conda-forge
psi4                      1.9.1           py311he3e7f2e_3    conda-forge
psutil                    6.0.0           py311h331c9d8_0    conda-forge
pthread-stubs             0.4               h36c2ea0_1001    conda-forge
ptyprocess                0.7.0              pyhd3deb0d_0    conda-forge
pugixml                   1.14                 h59595ed_0    conda-forge
pure_eval                 0.2.3              pyhd8ed1ab_0    conda-forge
py-cpuinfo                9.0.0              pyhd8ed1ab_0    conda-forge
pyarrow                   17.0.0          py311hbd00459_0    conda-forge
pyarrow-core              17.0.0          py311h9460f28_0_cpu    conda-forge
pybind11-abi              4                    hd8ed1ab_3    conda-forge
pycairo                   1.26.1          py311h64ab44a_0    conda-forge
pycparser                 2.22               pyhd8ed1ab_0    conda-forge
pydantic                  1.10.16         py311h331c9d8_0    conda-forge
pyedr                     0.8.0              pyhd8ed1ab_0    conda-forge
pygments                  2.18.0             pyhd8ed1ab_0    conda-forge
pymbar                    3.1.1           py311h7c22f60_3    conda-forge
pymsmt                    22.0                     pypi_0    pypi
pyparsing                 3.1.2              pyhd8ed1ab_0    conda-forge
pysocks                   1.7.1              pyha2e5f31_6    conda-forge
pytables                  3.9.2           py311ha8f287f_3    conda-forge
python                    3.11.9          hb806964_0_cpython    conda-forge
python-constraint         1.4.0                      py_0    conda-forge
python-dateutil           2.9.0              pyhd8ed1ab_0    conda-forge
python-fastjsonschema     2.20.0             pyhd8ed1ab_0    conda-forge
python-json-logger        2.0.7              pyhd8ed1ab_0    conda-forge
python-tzdata             2024.1             pyhd8ed1ab_0    conda-forge
python_abi                3.11                    4_cp311    conda-forge
pytraj                    2.0.6                    pypi_0    pypi
pytz                      2024.1             pyhd8ed1ab_0    conda-forge
pyyaml                    6.0.1           py311h459d7ec_1    conda-forge
pyzmq                     26.0.3          py311h08a0b41_0    conda-forge
qcelemental               0.28.0             pyhd8ed1ab_0    conda-forge
qcengine                  0.30.0             pyhd8ed1ab_0    conda-forge
qcportal                  0.15.8             pyhd8ed1ab_0    conda-forge
qhull                     2020.2               h434a139_5    conda-forge
rdkit                     2024.03.5       py311h845bd92_1    conda-forge
re2                       2023.09.01           h7f4b329_2    conda-forge
readline                  8.2                  h8228510_1    conda-forge
redis-py                  5.0.7              pyhd8ed1ab_0    conda-forge
redis-server              7.2.5                he19d79f_0    conda-forge
referencing               0.35.1             pyhd8ed1ab_0    conda-forge
regex                     2024.7.24       py311h61187de_0    conda-forge
reportlab                 4.2.2           py311h331c9d8_0    conda-forge
requests                  2.32.3             pyhd8ed1ab_0    conda-forge
rfc3339-validator         0.1.4              pyhd8ed1ab_0    conda-forge
rfc3986-validator         0.1.1              pyh9f0ad1d_0    conda-forge
rich                      13.7.1             pyhd8ed1ab_0    conda-forge
rlpycairo                 0.2.0              pyhd8ed1ab_0    conda-forge
rpds-py                   0.19.1          py311hb3a8bbb_0    conda-forge
s2n                       1.4.17               he19d79f_0    conda-forge
s3transfer                0.10.2             pyhd8ed1ab_0    conda-forge
sander                    22.0                     pypi_0    pypi
scipy                     1.14.0          py311h517d4fd_1    conda-forge
send2trash                1.8.3              pyh0d859eb_0    conda-forge
setuptools                71.0.4             pyhd8ed1ab_0    conda-forge
six                       1.16.0             pyh6c4a22f_0    conda-forge
smirnoff99frosst          1.1.0              pyh44b312d_0    conda-forge
snappy                    1.2.1                ha2e4443_0    conda-forge
sniffio                   1.3.1              pyhd8ed1ab_0    conda-forge
soupsieve                 2.5                pyhd8ed1ab_1    conda-forge
sqlalchemy                2.0.31          py311h331c9d8_0    conda-forge
stack_data                0.6.2              pyhd8ed1ab_0    conda-forge
starlette                 0.20.4             pyhd8ed1ab_1    conda-forge
tbb                       2021.12.0            h434a139_3    conda-forge
tenacity                  8.5.0              pyhd8ed1ab_0    conda-forge
terminado                 0.18.1             pyh0d859eb_0    conda-forge
tinycss2                  1.3.0              pyhd8ed1ab_0    conda-forge
tinydb                    4.8.0              pyhd8ed1ab_0    conda-forge
tk                        8.6.13          noxft_h4845f30_101    conda-forge
tomli                     2.0.1              pyhd8ed1ab_0    conda-forge
tornado                   6.4.1           py311h331c9d8_0    conda-forge
torsiondrive              1.1.0              pyhd8ed1ab_0    conda-forge
tqdm                      4.66.4             pyhd8ed1ab_0    conda-forge
traitlets                 5.14.3             pyhd8ed1ab_0    conda-forge
types-python-dateutil     2.9.0.20240316     pyhd8ed1ab_0    conda-forge
typing-extensions         4.12.2               hd8ed1ab_0    conda-forge
typing_extensions         4.12.2             pyha770c72_0    conda-forge
typing_utils              0.1.0              pyhd8ed1ab_0    conda-forge
tzdata                    2024a                h0c530f3_0    conda-forge
unidecode                 1.3.8              pyhd8ed1ab_0    conda-forge
uri-template              1.3.0              pyhd8ed1ab_0    conda-forge
urllib3                   2.2.2              pyhd8ed1ab_1    conda-forge
uvicorn                   0.30.3          py311h38be061_0    conda-forge
validators                0.33.0             pyhd8ed1ab_0    conda-forge
vine                      5.1.0              pyhd8ed1ab_0    conda-forge
wcwidth                   0.2.13             pyhd8ed1ab_0    conda-forge
webcolors                 24.6.0             pyhd8ed1ab_0    conda-forge
webencodings              0.5.1              pyhd8ed1ab_2    conda-forge
websocket-client          1.8.0              pyhd8ed1ab_0    conda-forge
wheel                     0.43.0             pyhd8ed1ab_1    conda-forge
widgetsnbextension        4.0.11             pyhd8ed1ab_0    conda-forge
xmltodict                 0.13.0             pyhd8ed1ab_0    conda-forge
xorg-kbproto              1.0.7             h7f98852_1002    conda-forge
xorg-libice               1.1.1                hd590300_0    conda-forge
xorg-libsm                1.2.4                h7391055_0    conda-forge
xorg-libx11               1.8.9                hb711507_1    conda-forge
xorg-libxau               1.0.11               hd590300_0    conda-forge
xorg-libxdmcp             1.1.3                h7f98852_0    conda-forge
xorg-libxext              1.3.4                h0b41bf4_2    conda-forge
xorg-libxrender           0.9.11               hd590300_0    conda-forge
xorg-libxt                1.3.0                hd590300_1    conda-forge
xorg-renderproto          0.11.1            h7f98852_1002    conda-forge
xorg-xextproto            7.3.0             h0b41bf4_1003    conda-forge
xorg-xproto               7.0.31            h7f98852_1007    conda-forge
xz                        5.2.6                h166bdaf_0    conda-forge
yaml                      0.2.5                h7f98852_2    conda-forge
zeromq                    4.3.5                h75354e8_4    conda-forge
zipp                      3.19.2             pyhd8ed1ab_0    conda-forge
zlib                      1.3.1                h4ab18f5_1    conda-forge
zlib-ng                   2.2.1                he02047a_0    conda-forge
zstandard                 0.23.0          py311h5cd10c7_0    conda-forge
zstd                      1.5.6                ha6fb4c9_0    conda-forge

j-wags commented 1 month ago

Thanks for following up, @aqemia-jasmin-guven. @jthorton is currently on vacation, but he should be able to provide more useful answers than I gave when he returns.

jthorton commented 1 month ago

Hi @aqemia-jasmin-guven thanks for trying out bespokefit!

From the paper, I understood the workflow as follows:

That's not quite the production workflow; you might be confusing it with some of the examples we did in the paper, which were slightly more complicated. In practice it's as simple as submitting a ligand to a running server, which will handle everything for you following the automated workflow defined here. You won't need to worry about deduplicating the fragments or making the SMIRKS patterns; this will all be done for you. I recommend starting with the quick start guide to ensure things are running as expected and then moving on to the TYK2 set.

Is the cache updated here at some point along the workflow as well?

The automated workflow updates the cache after every stage, allowing parameters and QC data to be reused. The cache is stored in the redis.db file inside the directory provided to the CLI.
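As a rough illustration of the caching idea (a minimal sketch, not BespokeFit's actual implementation), the toy class below keeps results in memory, keyed on a fragment identifier and a QC method, and counts hits and misses. The class name, the key layout, and the `fake_torsion_scan` placeholder are all assumptions made for this example; the real cache lives in redis.db and uses its own schema.

```python
from typing import Callable, Dict, Tuple

# Hypothetical cache key: (fragment identifier, QC method).
# This is an assumption for illustration, not BespokeFit's real schema.
CacheKey = Tuple[str, str]


class ToyQCCache:
    """In-memory stand-in for a persistent results cache."""

    def __init__(self) -> None:
        self._store: Dict[CacheKey, float] = {}
        self.hits = 0
        self.misses = 0

    def get_or_compute(
        self, fragment: str, method: str, compute: Callable[[str], float]
    ) -> float:
        key = (fragment, method)
        if key in self._store:
            # Result already cached: reuse it instead of recomputing.
            self.hits += 1
            return self._store[key]
        # Not cached yet: run the calculation once and store the result.
        self.misses += 1
        result = compute(fragment)
        self._store[key] = result
        return result


def fake_torsion_scan(fragment: str) -> float:
    # Placeholder for an expensive QC calculation.
    return float(len(fragment))


cache = ToyQCCache()
# The first and third entries are the same fragment, so the third is a cache hit.
for fragment in ["c1ccccc1C(=O)N", "CC(=O)N", "c1ccccc1C(=O)N"]:
    cache.get_or_compute(fragment, "gfn2xtb", fake_torsion_scan)

print(f"hits={cache.hits} misses={cache.misses}")
```

Tracking hits and misses separately is one simple way to tell whether a result was computed from scratch or reused.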

Additionally, Jeff mentioned that bespokefit should internally deduplicate the fragments, however I don't think I'm seeing this behaviour. For this, I launched the executor once and then submitted a single SDF containing all the ligands.

That is correct, and this is the recommended way of running. In this mode each molecule will be fragmented, and for any overlapping fragments (in TYK2 there are a lot) only a single set of QC calculations should be performed on each unique fragment. Is there something indicating this is not the case?
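To make the difference between fragment occurrences and unique fragments concrete, here is a small sketch under stated assumptions: the ligand names and fragment identifiers are hypothetical placeholders, and in practice the identifiers would need to be canonical (for example canonical SMILES) for the comparison to be meaningful. It does not call BespokeFit or openff-fragmenter.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical ligands mapped to hypothetical canonical fragment identifiers.
ligand_fragments: Dict[str, List[str]] = {
    "ligand_a": ["frag_amide", "frag_pyridine"],
    "ligand_b": ["frag_amide", "frag_cyclopropyl"],
    "ligand_c": ["frag_amide", "frag_pyridine"],
}


def group_by_fragment(fragments_by_ligand: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Map each unique fragment to the list of ligands it came from."""
    parents: Dict[str, List[str]] = defaultdict(list)
    for ligand, fragments in fragments_by_ligand.items():
        for fragment in fragments:
            parents[fragment].append(ligand)
    return dict(parents)


parents = group_by_fragment(ligand_fragments)
total = sum(len(fragments) for fragments in ligand_fragments.values())
unique = len(parents)

print(f"{total} fragment occurrences, {unique} unique fragments")
for fragment, ligands in parents.items():
    print(f"{fragment}: shared by {len(ligands)} ligand(s)")
```

If every ligand reports its own copy of a shared fragment, the per-ligand outputs can contain duplicates even when the underlying QC calculation ran only once, so the count of unique identifiers is the number to compare against.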

I hope this helps, let me know if you have any other issues!

aqemia-jasmin-guven commented 1 month ago

Hi @jthorton, thanks so much for getting back so quickly!

I have some follow-up questions to your reply:

That's not quite the production workflow; you might be confusing it with some of the examples we did in the paper, which were slightly more complicated. In practice it's as simple as submitting a ligand to a running server, which will handle everything for you following the automated workflow defined here. You won't need to worry about deduplicating the fragments or making the SMIRKS patterns; this will all be done for you.

Just to clarify, for production with multiple ligands, is it correct to input a single sdf containing all the ligands (which is what I have done for the TYK-2 ligands), or is it better to submit the individual ligands using separate submit commands, presumably in the same directory with the same executor?

If we're using separate commands, would it be possible to submit ligands on separate machines, e.g. with the distributed workers option from bespokefit?

I recommend starting with the quick start guide to ensure things are running as expected and then moving on to the TYK2 set.

So I actually already ran the acetaminophen example with the semi-empirical method, and didn't have problems there.

The automated workflow updates the cache after every stage, allowing parameters and QC data to be reused. The cache is stored in the redis.db file inside the directory provided to the CLI.

Is it possible to include local files in this? For example, if we run the workflow for a series of ligands and afterwards want to run new molecules that share a common scaffold with the previous series, is it possible to update the local cache with the results of our own runs? Is this what the --file option in the update cache command is for?

That is correct, and this is the recommended way of running. In this mode each molecule will be fragmented, and for any overlapping fragments (in TYK2 there are a lot) only a single set of QC calculations should be performed on each unique fragment. Is there something indicating this is not the case?

I ran the workflow with the TYK-2 ligands from a single sdf (attached input.sdf.zip) and ended up with 98 fragments in total. Is that expected? Is there a way to actually tell if QM data for a fragment was computed from scratch or if it was taken from the database? I think the reason I got confused was that I have the outputs and QM scans for all fragments and there are some duplicates across ligands, so I just assumed that these were all computed from scratch.

Thanks again for your help so far! Please let me know if I need to clarify any of the above.