HBioquant / DiffBindFR

Diffusion model based protein-ligand flexible docking method
BSD 3-Clause Clear License

error about lmdb #7

Open Nireus-lgx opened 7 months ago

Nireus-lgx commented 7 months ago

Hi, when running the reverse docking example (python predict.py -l ./reverse/ligand_1.sdf ./reverse/ligand_2.sdf -p ./reverse/receptors -o ./test -np 40 -gpu 0 -cpu 16 -bs 16 -n reverse), I got this error:

  File "/DiffBindFR/druglib/datasets/lmdbdataset.py", line 67, in __getitem__
    raise ValueError(f'query index {idx.decode()} not in lmdb.')
ValueError: query index 2src_protein not in lmdb.

HBioquant commented 7 months ago

Hi, @Nireus-lgx

The error you're encountering is most likely due to the code not properly handling proteins on your computer system. There could be several reasons for this:

The version of Biopython you're using might not meet the requirements. I recommend version 1.80 or higher.

Additionally, please check if msms and dssp are functioning correctly on your system.

If you're certain that these aspects are not causing the issue, please provide more information or refer to the previous issue to try and resolve the problem. Looking forward to your feedback.
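A minimal sketch of how these two checks could be scripted. It is not part of DiffBindFR. The executable names `msms`, `dssp`, and `mkdssp` are assumptions about how the tools are installed on a given system, and the helper names are hypothetical.

```python
import re
import shutil
from importlib import metadata


def version_tuple(version):
    """Turn a version string such as '1.81' into a comparable tuple (1, 81)."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component
        parts.append(int(match.group()))
    return tuple(parts)


def biopython_ok(minimum="1.80"):
    """Return True if an installed biopython is at least `minimum`."""
    try:
        installed = metadata.version("biopython")
    except metadata.PackageNotFoundError:
        return False
    return version_tuple(installed) >= version_tuple(minimum)


def missing_tools(names):
    """Return the executables from `names` that are not found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    print("biopython >= 1.80:", biopython_ok())
    # Assumed executable names; adjust them to match your installation.
    print("not on PATH:", missing_tools(("msms", "dssp", "mkdssp")))
```

Finding an executable on PATH only shows that it is installed; it does not prove that msms or dssp runs correctly on a given protein file.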

Best

stan1233 commented 7 months ago

Hi @HBioquant, I have the same issue. I followed the previous issues and your recommended requirements_reference, and installed the appropriate Biopython and numpy versions, but I still get this error. Would it be possible to package this project as a Docker container? That could solve all the environment setup issues.

$ python predict.py -i test_forward.csv -o ./test -np 40 -gpu 0 -cpu 16 -bs 16 -n forward
/home/XXX/mambaforge/envs/diffbindfr/lib/python3.9/site-packages/Bio/pairwise2.py:278: BiopythonDeprecationWarning: Bio.pairwise2 has been deprecated, and we intend to remove it in a future release of Biopython. As an alternative, please consider using Bio.Align.PairwiseAligner as a replacement, and contact the Biopython developers if you still need the Bio.pairwise2 module.
  warnings.warn(
 ____  _  __  __ ____  _           _ _____ ____
|  _ \(_)/ _|/ _| __ )(_)_ __   __| |  ___|  _ \
| | | | | |_| |_|  _ \| | '_ \ / _` | |_  | |_) |
| |_| | |  _|  _| |_) | | | | | (_| |  _| |  _ <
|____/|_|_| |_| |____/|_|_| |_|\__,_|_|   |_| \_\
2024-04-15 21:47:40,865 - DiffBindFR - INFO - Total loaded jobs: 15.
2024-04-15 21:47:40,865 - DiffBindFR - INFO - Job Slice Info: (0, 15).
2024-04-15 21:47:40,866 - DiffBindFR - INFO - Running jobs: 15.
2024-04-15 21:47:40,870 - DiffBindFR - INFO - Start to prepare job (experiment name: forward).
Use Background Generator supported dataloader.
2024-04-15 21:47:40,873 - DiffBindFR - INFO - dock Status: Prep task is Done!
Initializing diffusion model...
2024-04-15 21:47:44,063 - DiffBindFR - INFO - load checkpoint from local path: /home/XXX/DiffBindFR/DiffBindFR/weights/diffbindfr_paper.pth
2024-04-15 21:47:44,949 - DiffBindFR - INFO - Running model inference...
[                                                  ] 0/560, elapsed: 0s, ETATraceback (most recent call last):
  File "/home/XXX/DiffBindFR/DiffBindFR/app/predict.py", line 265, in <module>
    sys.exit(main())
  File "/home/XXX/DiffBindFR/DiffBindFR/app/predict.py", line 260, in main
    runner(df, args)
  File "/home/XXX/DiffBindFR/DiffBindFR/app/predict.py", line 131, in runner
    pairs_results = common.inferencer(
  File "/home/XXX/DiffBindFR/DiffBindFR/common/engines.py", line 204, in inferencer
    model_out = model_run(dl, model, show_traj)
  File "/home/XXX/DiffBindFR/DiffBindFR/common/engines.py", line 174, in model_run
    for idx, data in enumerate(dl):
  File "/home/XXX/mambaforge/envs/diffbindfr/lib/python3.9/site-packages/prefetch_generator/__init__.py", line 116, in next
    raise next_item
  File "/home/XXX/mambaforge/envs/diffbindfr/lib/python3.9/site-packages/prefetch_generator/__init__.py", line 98, in run
    for item in self.generator: self.queue.put((True , item))
  File "/home/XXX/mambaforge/envs/diffbindfr/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 628, in __next__
    data = self._next_data()
  File "/home/XXX/mambaforge/envs/diffbindfr/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1333, in _next_data
    return self._process_data(data)
  File "/home/XXX/mambaforge/envs/diffbindfr/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1359, in _process_data
    data.reraise()
  File "/home/XXX/mambaforge/envs/diffbindfr/lib/python3.9/site-packages/torch/_utils.py", line 543, in reraise
    raise exception
ValueError: Caught ValueError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/XXX/mambaforge/envs/diffbindfr/lib/python3.9/site-packages/torch/utils/data/_utils/worker.py", line 302, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/XXX/mambaforge/envs/diffbindfr/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 58, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/XXX/mambaforge/envs/diffbindfr/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 58, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/XXX/DiffBindFR/druglib/datasets/custom_dataset.py", line 263, in __getitem__
    data = self.get(idx)
  File "/home/XXX/DiffBindFR/druglib/datasets/custom_dataset.py", line 326, in get
    return self._prepare_test_sample(self.indices[idx])
  File "/home/XXX/DiffBindFR/DiffBindFR/common/inference_dataset.py", line 583, in _prepare_test_sample
    mdl_inp = getattr(self.PairData, obj + 's')[row[name]].model_input
  File "/home/XXX/DiffBindFR/druglib/datasets/lmdbdataset.py", line 67, in __getitem__
    raise ValueError(f'query index {idx.decode()} not in lmdb.')
ValueError: query index 3dbs_protein not in lmdb.

Yang-Wang-2020 commented 5 months ago

I got the same issue. I tried both lmdb 1.4.0 and lmdb 1.4.1, and the error is the same with each.

$ python predict.py -i test_reverse.csv -o ./test -np 40 -gpu 0 -cpu 16 -bs 16 -n reverse --debug
/home/ubuntu/miniconda3/envs/diffbindfr/lib/python3.9/site-packages/Bio/pairwise2.py:278: BiopythonDeprecationWarning: Bio.pairwise2 has been deprecated, and we intend to remove it in a future release of Biopython. As an alternative, please consider using Bio.Align.PairwiseAligner as a replacement, and contact the Biopython developers if you still need the Bio.pairwise2 module.
  warnings.warn(
 ____  _  __  __ ____  _           _ _____ ____
|  _ \(_)/ _|/ _| __ )(_)_ __   __| |  ___|  _ \
| | | | | |_| |_|  _ \| | '_ \ / _` | |_  | |_) |
| |_| | |  _|  _| |_) | | | | | (_| |  _| |  _ <
|____/|_|_| |_| |____/|_|_| |_|\__,_|_|   |_| \_\
+--------------------------------------------------------------------------------------------+
|                                   Command Line Parameter                                   |
+--------------------------+-----------------------------------------------------------------+
| KeyWord                  | Value                                                           |
+--------------------------+-----------------------------------------------------------------+
| input_csv                | /home/ubuntu/DiffBindFR/DiffBindFR/app/test_reverse.csv         |
| ligand                   |                                                                 |
| receptor                 |                                                                 |
| export_dir               | /home/ubuntu/DiffBindFR/DiffBindFR/app/test                     |
| config                   | /home/ubuntu/DiffBindFR/DiffBindFR/configs/diffbindfr_ts.py     |
| checkpoint               | /home/ubuntu/DiffBindFR/DiffBindFR/weights/diffbindfr_paper.pth |
| job                      | dock                                                            |
| num_poses                | 40                                                              |
| diffbindfr_pocket_radius | None                                                            |
| mdn_pocket_radius        | 12.0                                                            |
| start                    | None                                                            |
| end                      | None                                                            |
| interval                 | None                                                            |
| export_pocket            | False                                                           |
| no_error_correction      | False                                                           |
| no_mdn_scoring           | False                                                           |
| experiment_name          | reverse                                                         |
| show_traj                | False                                                           |
| evaluation               | False                                                           |
| report_performance       | False                                                           |
| cleanup                  | False                                                           |
| gpu_id                   | 0                                                               |
| num_workers              | 16                                                              |
| batch_size               | 16                                                              |
| verbose                  | False                                                           |
| override                 | False                                                           |
| seed                     | None                                                            |
| debug                    | True                                                            |
| cfg_options              | None                                                            |
+--------------------------+-----------------------------------------------------------------+
2024-06-04 19:53:34,017 - DiffBindFR - INFO - Total loaded jobs: 6.
2024-06-04 19:53:34,017 - DiffBindFR - INFO - Job Slice Info: (0, 6).
2024-06-04 19:53:34,017 - DiffBindFR - INFO - Running jobs: 6.
2024-06-04 19:53:34,022 - DiffBindFR - INFO - Start to prepare job (experiment name: reverse).
Use Background Generator supported dataloader.
2024-06-04 19:53:34,024 - DiffBindFR - INFO - dock Status: Prep task is Done!
Initializing diffusion model...
2024-06-04 19:53:37,298 - DiffBindFR - INFO - load checkpoint from local path: /home/ubuntu/DiffBindFR/DiffBindFR/weights/diffbindfr_paper.pth
2024-06-04 19:53:38,470 - DiffBindFR - INFO - Running model inference...
[                                                  ] 0/120, elapsed: 0s, ETATraceback (most recent call last):
  File "/home/ubuntu/DiffBindFR/DiffBindFR/app/predict.py", line 265, in <module>
    sys.exit(main())
  File "/home/ubuntu/DiffBindFR/DiffBindFR/app/predict.py", line 260, in main
    runner(df, args)
  File "/home/ubuntu/DiffBindFR/DiffBindFR/app/predict.py", line 131, in runner
    pairs_results = common.inferencer(
  File "/home/ubuntu/DiffBindFR/DiffBindFR/common/engines.py", line 204, in inferencer
    model_out = model_run(dl, model, show_traj)
  File "/home/ubuntu/DiffBindFR/DiffBindFR/common/engines.py", line 174, in model_run
    for idx, data in enumerate(dl):
  File "/home/ubuntu/miniconda3/envs/diffbindfr/lib/python3.9/site-packages/prefetch_generator/__init__.py", line 116, in next
    raise next_item
  File "/home/ubuntu/miniconda3/envs/diffbindfr/lib/python3.9/site-packages/prefetch_generator/__init__.py", line 98, in run
    for item in self.generator: self.queue.put((True , item))
  File "/home/ubuntu/miniconda3/envs/diffbindfr/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 628, in __next__
    data = self._next_data()
  File "/home/ubuntu/miniconda3/envs/diffbindfr/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1333, in _next_data
    return self._process_data(data)
  File "/home/ubuntu/miniconda3/envs/diffbindfr/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1359, in _process_data
    data.reraise()
  File "/home/ubuntu/miniconda3/envs/diffbindfr/lib/python3.9/site-packages/torch/_utils.py", line 543, in reraise
    raise exception
ValueError: Caught ValueError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/ubuntu/miniconda3/envs/diffbindfr/lib/python3.9/site-packages/torch/utils/data/_utils/worker.py", line 302, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/ubuntu/miniconda3/envs/diffbindfr/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 58, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/ubuntu/miniconda3/envs/diffbindfr/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 58, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/ubuntu/DiffBindFR/druglib/datasets/custom_dataset.py", line 263, in __getitem__
    data = self.get(idx)
  File "/home/ubuntu/DiffBindFR/druglib/datasets/custom_dataset.py", line 326, in get
    return self._prepare_test_sample(self.indices[idx])
  File "/home/ubuntu/DiffBindFR/DiffBindFR/common/inference_dataset.py", line 583, in _prepare_test_sample
    mdl_inp = getattr(self.PairData, obj + 's')[row[name]].model_input
  File "/home/ubuntu/DiffBindFR/druglib/datasets/lmdbdataset.py", line 67, in __getitem__
    raise ValueError(f'query index {idx.decode()} not in lmdb.')
ValueError: query index 2src_protein not in lmdb.

Error in atexit._run_exitfuncs:
Traceback (most recent call last):
  File "/home/ubuntu/miniconda3/envs/diffbindfr/lib/python3.9/multiprocessing/popen_fork.py", line 27, in poll
    pid, sts = os.waitpid(self.pid, flag)
  File "/home/ubuntu/miniconda3/envs/diffbindfr/lib/python3.9/site-packages/torch/utils/data/_utils/signal_handling.py", line 66, in handler
    _error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 13400) is killed by signal: Terminated.

Below is my pip list:

$ pip list
Package                   Version                      Editable project location
------------------------- ---------------------------- -------------------------
absl-py                   2.1.0
addict                    2.4.0
aiohttp                   3.9.5
aiosignal                 1.3.1
AmberLite                 22.0
AmberUtils                21.0
annotated-types           0.7.0
antlr4-python3-runtime    4.9.3
anyio                     4.3.0
argon2-cffi               23.1.0
argon2-cffi-bindings      21.2.0
arrow                     1.3.0
asttokens                 2.4.1
astunparse                1.6.3
async-lru                 2.0.4
async-timeout             4.0.3
attrs                     23.2.0
Babel                     2.14.0
backcall                  0.2.0
beautifulsoup4            4.12.3
biopython                 1.81
bleach                    6.1.0
blessed                   1.20.0
Brotli                    1.1.0
bson                      0.5.9
cached-property           1.5.2
cachetools                5.3.3
certifi                   2024.2.2
cffi                      1.16.0
cftime                    1.6.3
chardet                   5.2.0
charset-normalizer        3.3.2
click                     8.1.7
cmake                     3.29.3
colorama                  0.4.6
comm                      0.2.2
contextlib2               21.6.0
contourpy                 1.2.1
cycler                    0.12.1
Cython                    3.0.10
debugpy                   1.8.1
decorator                 5.1.1
deepsmiles                1.0.1
defusedxml                0.7.1
DiffBindFR                0.2.0                        /home/ubuntu/DiffBindFR
dill                      0.3.8
dm-tree                   0.1.8
docker-pycreds            0.4.0
docutils                  0.17
e3nn                      0.5.1
easydict                  1.10
einops                    0.5.0
entrypoints               0.4
exceptiongroup            1.2.0
executing                 2.0.1
fasteners                 0.19
fastjsonschema            2.19.1
fcd-torch                 1.0.7
filelock                  3.9.0
fonttools                 4.53.0
fqdn                      1.5.1
freetype-py               2.3.0
frozenlist                1.4.1
fsspec                    2024.6.0
gitdb                     4.0.11
GitPython                 3.1.43
gpustat                   1.1.1
greenlet                  3.0.3
GridDataFormats           1.0.2
grpcio                    1.64.1
h11                       0.14.0
h2                        4.1.0
hpack                     4.0.0
httpcore                  1.0.5
httpx                     0.27.0
hyperframe                6.0.1
idna                      3.7
importlib_metadata        7.1.0
importlib_resources       6.4.0
ipykernel                 6.19.2
ipython                   8.12.0
ipywidgets                8.1.3
isoduration               20.11.0
jedi                      0.19.1
Jinja2                    3.1.4
joblib                    1.1.0
json5                     0.9.25
jsonpointer               2.4
jsonschema                4.22.0
jsonschema-specifications 2023.12.1
jupyter_client            8.1.0
jupyter_core              5.3.0
jupyter-events            0.10.0
jupyter-lsp               2.2.5
jupyter_server            2.14.1
jupyter_server_terminals  0.5.3
jupyterlab                4.2.1
jupyterlab_pygments       0.3.0
jupyterlab_server         2.27.2
jupyterlab_widgets        3.0.11
kiwisolver                1.4.5
lightning-utilities       0.11.2
lit                       18.1.6
llvmlite                  0.42.0
lmdb                      1.4.1
lxml                      5.2.2
Markdown                  3.6
MarkupSafe                2.1.5
matplotlib                3.7.1
matplotlib-inline         0.1.7
mda-xdrlib                0.2.0
MDAnalysis                2.7.0
mdtraj                    1.10.0
meeko                     0.5.0
mistune                   3.0.2
ml-collections            0.1.1
MMPBSA.py                 16.0
mmtf-python               1.1.3
mpi4py                    3.1.6
mpmath                    1.3.0
mrcfile                   1.5.0
msgpack                   1.0.8
multidict                 6.0.5
munkres                   1.1.4
nbclient                  0.10.0
nbconvert                 7.16.4
nbformat                  5.10.4
nest_asyncio              1.6.0
netCDF4                   1.6.5
networkx                  3.2.1
nglview                   3.0.3
notebook                  7.2.0
notebook_shim             0.2.4
numba                     0.59.1
numexpr                   2.8.4
numpy                     1.26.4
nvidia-ml-py              12.555.43
omegaconf                 2.3.0
opencv-python             4.10.0.82
openff-amber-ff-ports     0+untagged.32.g809f411.dirty
openff-interchange        0.3.24
openff-models             0.1.2
openff-toolkit            0.16.0
openff-units              0.2.0
openff-utilities          0.1.12
openforcefields           2023.11.0
OpenMM                    7.7.0
openmmforcefields         0.12.0
opt-einsum                3.3.0
opt-einsum-fx             0.1.4
overrides                 7.7.0
packaging                 24.0
packmol-memgen            1.2.3rc0
pandarallel               1.6.3
pandas                    2.2.2
pandocfilters             1.5.0
panedr                    0.8.0
ParmEd                    4.2.2
parso                     0.8.4
pathtools                 0.1.2
pdb4amber                 22.0
pdbfixer                  1.8.1
pexpect                   4.9.0
pickleshare               0.7.5
pillow                    10.3.0
Pint                      0.23
pip                       24.0
pkgutil_resolve_name      1.3.10
platformdirs              4.2.2
ply                       3.11
Pmw                       2.0.1
posebusters               0.2.13
prefetch-generator        1.0.3
prettytable               3.6.0
ProDy                     2.3.1
prometheus_client         0.20.0
promise                   2.3
prompt-toolkit            3.0.42
protobuf                  3.20.3
psutil                    5.9.8
ptyprocess                0.7.0
pure-eval                 0.2.2
py-cpuinfo                9.0.0
py3Dmol                   2.0.4
pycairo                   1.26.0
pycparser                 2.22
pydantic                  2.7.3
pydantic_core             2.18.4
pyedr                     0.8.0
Pygments                  2.18.0
pymol                     3.0.0
pyMSMT                    22.0
pyparsing                 3.1.2
PyQt5                     5.15.10
PyQt5-sip                 12.13.0
PySocks                   1.7.1
python-constraint         1.4.0
python-dateutil           2.9.0
python-json-logger        2.0.7
pytraj                    2.0.6
pytz                      2024.1
PyYAML                    6.0.1
pyzmq                     26.0.3
rdkit                     2023.3.2
referencing               0.35.1
reportlab                 4.1.0
requests                  2.32.3
rfc3339-validator         0.1.4
rfc3986-validator         0.1.1
rlPyCairo                 0.2.0
rpds-py                   0.18.1
sander                    22.0
scikit-learn              1.1.3
scipy                     1.13.1
seaborn                   0.13.2
selfies                   2.1.1
Send2Trash                1.8.3
sentry-sdk                2.4.0
setproctitle              1.3.3
setuptools                59.5.0
shortuuid                 1.0.13
sip                       6.7.12
six                       1.16.0
smirnoff99frosst          0+unknown
smmap                     5.0.1
sniffio                   1.3.1
soupsieve                 2.5
spyrmsd                   0.5.2
SQLAlchemy                2.0.30
stack-data                0.6.2
sympy                     1.12.1
tables                    3.9.2
tensorboard               2.16.2
tensorboard-data-server   0.7.2
tensorboardX              2.6.2.2
terminado                 0.18.1
threadpoolctl             3.5.0
tinycss2                  1.3.0
tinydb                    4.8.0
toml                      0.10.2
tomli                     2.0.1
torch                     1.13.1+cu117
torch-cluster             1.6.0+pt113cu117
torch_geometric           2.5.3
torch-scatter             2.1.0+pt113cu117
torch-sparse              0.6.16+pt113cu117
torch-spline-conv         1.2.1+pt113cu117
torchaudio                0.13.1
torchmetrics              1.4.0.post0
torchtyping               0.1.4
torchvision               0.14.1+cu117
tornado                   6.4
tqdm                      4.66.4
traitlets                 5.14.3
triton                    2.0.0
typeguard                 4.0.0
types-python-dateutil     2.9.0.20240316
typing_extensions         4.11.0
typing-utils              0.1.0
tzdata                    2024.1
unicodedata2              15.1.0
uri-template              1.3.0
urllib3                   2.2.1
validators                0.28.3
wandb                     0.13.3
wcwidth                   0.2.13
webcolors                 1.13
webencodings              0.5.1
websocket-client          1.8.0
Werkzeug                  3.0.3
wheel                     0.43.0
widgetsnbextension        4.0.11
xmltodict                 0.13.0
yapf                      0.32.0
yarl                      1.9.4
zipp                      3.17.0

Thank you for your help!

yuanyuan-ma commented 5 months ago

Hi, I get the same error as above. Has this problem been solved? If so, how?

HBioquant commented 2 months ago

Hi, guys,

I'm really sorry for the trouble. I have been busy with other projects recently and could not follow up on the problems you reported. I have now reproduced the problem and provided a solution; see the link.

If you still encounter this problem, please send me the log from the first run in which it occurred. The logs posted here appear to come from re-runs, in which the program automatically loads the lmdb created by the earlier run.
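To check whether a re-run is picking up an lmdb left over from an earlier run, a sketch like the following may help. It is not DiffBindFR code. The assumptions are that the cached store lives somewhere under the output directory and uses a `.mdb` or `.lmdb` suffix, and the helper names `find_lmdb_files` and `list_keys` are hypothetical. `list_keys` needs the third-party `lmdb` package.

```python
from pathlib import Path


def find_lmdb_files(root):
    """Return files under `root` whose suffix suggests an LMDB store."""
    root = Path(root)
    hits = [
        path
        for path in root.rglob("*")
        if path.is_file() and path.suffix in {".mdb", ".lmdb"}
    ]
    return sorted(hits)


def list_keys(lmdb_path):
    """Return all keys in an LMDB store (a directory or a single file)."""
    import lmdb  # third-party; imported lazily so the search above needs only the stdlib

    path = Path(lmdb_path)
    # subdir=True when the store is a directory holding data.mdb and lock.mdb.
    env = lmdb.open(str(path), readonly=True, lock=False, subdir=path.is_dir())
    try:
        with env.begin() as txn:
            return [key.decode() for key in txn.cursor().iternext(values=False)]
    finally:
        env.close()


if __name__ == "__main__":
    # "./test" is the output directory used in the commands in this thread.
    for path in find_lmdb_files("./test"):
        print(path)
```

If a key such as `2src_protein` does not appear in the `list_keys` output for the store that was found, the store was written without that protein, and the log from the run that created it is the one that shows why.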

Best