Open Nireus-lgx opened 7 months ago
Hi, @Nireus-lgx
The error you're encountering most likely means the code failed to process the protein inputs on your system. There could be several reasons for this:
The version of Biopython you're using might not meet the requirements. I recommend v1.80 or higher.
Additionally, please check if msms and dssp are functioning correctly on your system.
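A quick way to check both points is sketched below. It is only a minimal sketch under some assumptions: the executable names `msms`, `dssp` and `mkdssp` are guesses at how the tools are installed on your system, and the helper names are made up for illustration. Finding a tool on PATH only shows that it is installed, not that it runs correctly.

```python
import re
import shutil


def parse_version(text):
    """Turn '1.81' into (1, 81); stop at the first non-numeric component."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if not match:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def version_at_least(found, minimum):
    """Return True if version string `found` is >= `minimum`."""
    return parse_version(found) >= parse_version(minimum)


def find_tools(candidates):
    """Map each executable name to its location on PATH, or None if absent."""
    return {name: shutil.which(name) for name in candidates}


if __name__ == "__main__":
    try:
        import Bio
    except ImportError:
        print("Biopython is not installed in this environment")
    else:
        ok = version_at_least(Bio.__version__, "1.80")
        print(f"Biopython {Bio.__version__}: {'OK' if ok else 'older than 1.80'}")

    # Executable names are assumptions; adjust them to your installation.
    for name, path in find_tools(["msms", "dssp", "mkdssp"]).items():
        print(f"{name}: {path or 'not found on PATH'}")
```

Run it inside the same conda environment you use for `predict.py`, so that the versions and PATH match what the program sees.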
If you're certain that these aspects are not causing the issue, please provide more information or refer to the previous issue to try and resolve the problem. Looking forward to your feedback.
Best
Hi @HBioquant, I have the same issue. I went through the previous issues and your recommended requirements_reference, and installed the appropriate Biopython and numpy versions, but I still hit this problem. Is it possible to package this project into a Docker container? That could solve all the environment setup issues.
$ python predict.py -i test_forward.csv -o ./test -np 40 -gpu 0 -cpu 16 -bs 16 -n forward
/home/XXX/mambaforge/envs/diffbindfr/lib/python3.9/site-packages/Bio/pairwise2.py:278: BiopythonDeprecationWarning: Bio.pairwise2 has been deprecated, and we intend to remove it in a future release of Biopython. As an alternative, please consider using Bio.Align.PairwiseAligner as a replacement, and contact the Biopython developers if you still need the Bio.pairwise2 module.
warnings.warn(
____ _ __ __ ____ _ _ _____ ____
| _ \(_)/ _|/ _| __ )(_)_ __ __| | ___| _ \
| | | | | |_| |_| _ \| | '_ \ / _` | |_ | |_) |
| |_| | | _| _| |_) | | | | | (_| | _| | _ <
|____/|_|_| |_| |____/|_|_| |_|\__,_|_| |_| \_\
2024-04-15 21:47:40,865 - DiffBindFR - INFO - Total loaded jobs: 15.
2024-04-15 21:47:40,865 - DiffBindFR - INFO - Job Slice Info: (0, 15).
2024-04-15 21:47:40,866 - DiffBindFR - INFO - Running jobs: 15.
2024-04-15 21:47:40,870 - DiffBindFR - INFO - Start to prepare job (experiment name: forward).
Use Background Generator supported dataloader.
2024-04-15 21:47:40,873 - DiffBindFR - INFO - dock Status: Prep task is Done!
Initializing diffusion model...
2024-04-15 21:47:44,063 - DiffBindFR - INFO - load checkpoint from local path: /home/XXX/DiffBindFR/DiffBindFR/weights/diffbindfr_paper.pth
2024-04-15 21:47:44,949 - DiffBindFR - INFO - Running model inference...
[ ] 0/560, elapsed: 0s, ETATraceback (most recent call last):
File "/home/XXX/DiffBindFR/DiffBindFR/app/predict.py", line 265, in <module>
sys.exit(main())
File "/home/XXX/DiffBindFR/DiffBindFR/app/predict.py", line 260, in main
runner(df, args)
File "/home/XXX/DiffBindFR/DiffBindFR/app/predict.py", line 131, in runner
pairs_results = common.inferencer(
File "/home/XXX/DiffBindFR/DiffBindFR/common/engines.py", line 204, in inferencer
model_out = model_run(dl, model, show_traj)
File "/home/XXX/DiffBindFR/DiffBindFR/common/engines.py", line 174, in model_run
for idx, data in enumerate(dl):
File "/home/XXX/mambaforge/envs/diffbindfr/lib/python3.9/site-packages/prefetch_generator/__init__.py", line 116, in next
raise next_item
File "/home/XXX/mambaforge/envs/diffbindfr/lib/python3.9/site-packages/prefetch_generator/__init__.py", line 98, in run
for item in self.generator: self.queue.put((True , item))
File "/home/XXX/mambaforge/envs/diffbindfr/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 628, in __next__
data = self._next_data()
File "/home/XXX/mambaforge/envs/diffbindfr/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1333, in _next_data
return self._process_data(data)
File "/home/XXX/mambaforge/envs/diffbindfr/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1359, in _process_data
data.reraise()
File "/home/XXX/mambaforge/envs/diffbindfr/lib/python3.9/site-packages/torch/_utils.py", line 543, in reraise
raise exception
ValueError: Caught ValueError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/home/XXX/mambaforge/envs/diffbindfr/lib/python3.9/site-packages/torch/utils/data/_utils/worker.py", line 302, in _worker_loop
data = fetcher.fetch(index)
File "/home/XXX/mambaforge/envs/diffbindfr/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 58, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/XXX/mambaforge/envs/diffbindfr/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 58, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/XXX/DiffBindFR/druglib/datasets/custom_dataset.py", line 263, in __getitem__
data = self.get(idx)
File "/home/XXX/DiffBindFR/druglib/datasets/custom_dataset.py", line 326, in get
return self._prepare_test_sample(self.indices[idx])
File "/home/XXX/DiffBindFR/DiffBindFR/common/inference_dataset.py", line 583, in _prepare_test_sample
mdl_inp = getattr(self.PairData, obj + 's')[row[name]].model_input
File "/home/XXX/DiffBindFR/druglib/datasets/lmdbdataset.py", line 67, in __getitem__
raise ValueError(f'query index {idx.decode()} not in lmdb.')
ValueError: query index 3dbs_protein not in lmdb.
I got the same issue. I tried both lmdb 1.4.0 and lmdb 1.4.1, and the error persists.
$ python predict.py -i test_reverse.csv -o ./test -np 40 -gpu 0 -cpu 16 -bs 16 -n reverse --debug
/home/ubuntu/miniconda3/envs/diffbindfr/lib/python3.9/site-packages/Bio/pairwise2.py:278: BiopythonDeprecationWarning: Bio.pairwise2 has been deprecated, and we intend to remove it in a future release of Biopython. As an alternative, please consider using Bio.Align.PairwiseAligner as a replacement, and contact the Biopython developers if you still need the Bio.pairwise2 module.
warnings.warn(
____ _ __ __ ____ _ _ _____ ____
| _ \(_)/ _|/ _| __ )(_)_ __ __| | ___| _ \
| | | | | |_| |_| _ \| | '_ \ / _` | |_ | |_) |
| |_| | | _| _| |_) | | | | | (_| | _| | _ <
|____/|_|_| |_| |____/|_|_| |_|\__,_|_| |_| \_\
+--------------------------------------------------------------------------------------------+
| Command Line Parameter |
+--------------------------+-----------------------------------------------------------------+
| KeyWord | Value |
+--------------------------+-----------------------------------------------------------------+
| input_csv | /home/ubuntu/DiffBindFR/DiffBindFR/app/test_reverse.csv |
| ligand | |
| receptor | |
| export_dir | /home/ubuntu/DiffBindFR/DiffBindFR/app/test |
| config | /home/ubuntu/DiffBindFR/DiffBindFR/configs/diffbindfr_ts.py |
| checkpoint | /home/ubuntu/DiffBindFR/DiffBindFR/weights/diffbindfr_paper.pth |
| job | dock |
| num_poses | 40 |
| diffbindfr_pocket_radius | None |
| mdn_pocket_radius | 12.0 |
| start | None |
| end | None |
| interval | None |
| export_pocket | False |
| no_error_correction | False |
| no_mdn_scoring | False |
| experiment_name | reverse |
| show_traj | False |
| evaluation | False |
| report_performance | False |
| cleanup | False |
| gpu_id | 0 |
| num_workers | 16 |
| batch_size | 16 |
| verbose | False |
| override | False |
| seed | None |
| debug | True |
| cfg_options | None |
+--------------------------+-----------------------------------------------------------------+
2024-06-04 19:53:34,017 - DiffBindFR - INFO - Total loaded jobs: 6.
2024-06-04 19:53:34,017 - DiffBindFR - INFO - Job Slice Info: (0, 6).
2024-06-04 19:53:34,017 - DiffBindFR - INFO - Running jobs: 6.
2024-06-04 19:53:34,022 - DiffBindFR - INFO - Start to prepare job (experiment name: reverse).
Use Background Generator supported dataloader.
2024-06-04 19:53:34,024 - DiffBindFR - INFO - dock Status: Prep task is Done!
Initializing diffusion model...
2024-06-04 19:53:37,298 - DiffBindFR - INFO - load checkpoint from local path: /home/ubuntu/DiffBindFR/DiffBindFR/weights/diffbindfr_paper.pth
2024-06-04 19:53:38,470 - DiffBindFR - INFO - Running model inference...
[ ] 0/120, elapsed: 0s, ETATraceback (most recent call last):
File "/home/ubuntu/DiffBindFR/DiffBindFR/app/predict.py", line 265, in <module>
sys.exit(main())
File "/home/ubuntu/DiffBindFR/DiffBindFR/app/predict.py", line 260, in main
runner(df, args)
File "/home/ubuntu/DiffBindFR/DiffBindFR/app/predict.py", line 131, in runner
pairs_results = common.inferencer(
File "/home/ubuntu/DiffBindFR/DiffBindFR/common/engines.py", line 204, in inferencer
model_out = model_run(dl, model, show_traj)
File "/home/ubuntu/DiffBindFR/DiffBindFR/common/engines.py", line 174, in model_run
for idx, data in enumerate(dl):
File "/home/ubuntu/miniconda3/envs/diffbindfr/lib/python3.9/site-packages/prefetch_generator/__init__.py", line 116, in next
raise next_item
File "/home/ubuntu/miniconda3/envs/diffbindfr/lib/python3.9/site-packages/prefetch_generator/__init__.py", line 98, in run
for item in self.generator: self.queue.put((True , item))
File "/home/ubuntu/miniconda3/envs/diffbindfr/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 628, in __next__
data = self._next_data()
File "/home/ubuntu/miniconda3/envs/diffbindfr/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1333, in _next_data
return self._process_data(data)
File "/home/ubuntu/miniconda3/envs/diffbindfr/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1359, in _process_data
data.reraise()
File "/home/ubuntu/miniconda3/envs/diffbindfr/lib/python3.9/site-packages/torch/_utils.py", line 543, in reraise
raise exception
ValueError: Caught ValueError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/home/ubuntu/miniconda3/envs/diffbindfr/lib/python3.9/site-packages/torch/utils/data/_utils/worker.py", line 302, in _worker_loop
data = fetcher.fetch(index)
File "/home/ubuntu/miniconda3/envs/diffbindfr/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 58, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/ubuntu/miniconda3/envs/diffbindfr/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 58, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/ubuntu/DiffBindFR/druglib/datasets/custom_dataset.py", line 263, in __getitem__
data = self.get(idx)
File "/home/ubuntu/DiffBindFR/druglib/datasets/custom_dataset.py", line 326, in get
return self._prepare_test_sample(self.indices[idx])
File "/home/ubuntu/DiffBindFR/DiffBindFR/common/inference_dataset.py", line 583, in _prepare_test_sample
mdl_inp = getattr(self.PairData, obj + 's')[row[name]].model_input
File "/home/ubuntu/DiffBindFR/druglib/datasets/lmdbdataset.py", line 67, in __getitem__
raise ValueError(f'query index {idx.decode()} not in lmdb.')
ValueError: query index 2src_protein not in lmdb.
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
File "/home/ubuntu/miniconda3/envs/diffbindfr/lib/python3.9/multiprocessing/popen_fork.py", line 27, in poll
pid, sts = os.waitpid(self.pid, flag)
File "/home/ubuntu/miniconda3/envs/diffbindfr/lib/python3.9/site-packages/torch/utils/data/_utils/signal_handling.py", line 66, in handler
_error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 13400) is killed by signal: Terminated.
Below is my pip list:
$ pip list
Package Version Editable project location
------------------------- ---------------------------- -------------------------
absl-py 2.1.0
addict 2.4.0
aiohttp 3.9.5
aiosignal 1.3.1
AmberLite 22.0
AmberUtils 21.0
annotated-types 0.7.0
antlr4-python3-runtime 4.9.3
anyio 4.3.0
argon2-cffi 23.1.0
argon2-cffi-bindings 21.2.0
arrow 1.3.0
asttokens 2.4.1
astunparse 1.6.3
async-lru 2.0.4
async-timeout 4.0.3
attrs 23.2.0
Babel 2.14.0
backcall 0.2.0
beautifulsoup4 4.12.3
biopython 1.81
bleach 6.1.0
blessed 1.20.0
Brotli 1.1.0
bson 0.5.9
cached-property 1.5.2
cachetools 5.3.3
certifi 2024.2.2
cffi 1.16.0
cftime 1.6.3
chardet 5.2.0
charset-normalizer 3.3.2
click 8.1.7
cmake 3.29.3
colorama 0.4.6
comm 0.2.2
contextlib2 21.6.0
contourpy 1.2.1
cycler 0.12.1
Cython 3.0.10
debugpy 1.8.1
decorator 5.1.1
deepsmiles 1.0.1
defusedxml 0.7.1
DiffBindFR 0.2.0 /home/ubuntu/DiffBindFR
dill 0.3.8
dm-tree 0.1.8
docker-pycreds 0.4.0
docutils 0.17
e3nn 0.5.1
easydict 1.10
einops 0.5.0
entrypoints 0.4
exceptiongroup 1.2.0
executing 2.0.1
fasteners 0.19
fastjsonschema 2.19.1
fcd-torch 1.0.7
filelock 3.9.0
fonttools 4.53.0
fqdn 1.5.1
freetype-py 2.3.0
frozenlist 1.4.1
fsspec 2024.6.0
gitdb 4.0.11
GitPython 3.1.43
gpustat 1.1.1
greenlet 3.0.3
GridDataFormats 1.0.2
grpcio 1.64.1
h11 0.14.0
h2 4.1.0
hpack 4.0.0
httpcore 1.0.5
httpx 0.27.0
hyperframe 6.0.1
idna 3.7
importlib_metadata 7.1.0
importlib_resources 6.4.0
ipykernel 6.19.2
ipython 8.12.0
ipywidgets 8.1.3
isoduration 20.11.0
jedi 0.19.1
Jinja2 3.1.4
joblib 1.1.0
json5 0.9.25
jsonpointer 2.4
jsonschema 4.22.0
jsonschema-specifications 2023.12.1
jupyter_client 8.1.0
jupyter_core 5.3.0
jupyter-events 0.10.0
jupyter-lsp 2.2.5
jupyter_server 2.14.1
jupyter_server_terminals 0.5.3
jupyterlab 4.2.1
jupyterlab_pygments 0.3.0
jupyterlab_server 2.27.2
jupyterlab_widgets 3.0.11
kiwisolver 1.4.5
lightning-utilities 0.11.2
lit 18.1.6
llvmlite 0.42.0
lmdb 1.4.1
lxml 5.2.2
Markdown 3.6
MarkupSafe 2.1.5
matplotlib 3.7.1
matplotlib-inline 0.1.7
mda-xdrlib 0.2.0
MDAnalysis 2.7.0
mdtraj 1.10.0
meeko 0.5.0
mistune 3.0.2
ml-collections 0.1.1
MMPBSA.py 16.0
mmtf-python 1.1.3
mpi4py 3.1.6
mpmath 1.3.0
mrcfile 1.5.0
msgpack 1.0.8
multidict 6.0.5
munkres 1.1.4
nbclient 0.10.0
nbconvert 7.16.4
nbformat 5.10.4
nest_asyncio 1.6.0
netCDF4 1.6.5
networkx 3.2.1
nglview 3.0.3
notebook 7.2.0
notebook_shim 0.2.4
numba 0.59.1
numexpr 2.8.4
numpy 1.26.4
nvidia-ml-py 12.555.43
omegaconf 2.3.0
opencv-python 4.10.0.82
openff-amber-ff-ports 0+untagged.32.g809f411.dirty
openff-interchange 0.3.24
openff-models 0.1.2
openff-toolkit 0.16.0
openff-units 0.2.0
openff-utilities 0.1.12
openforcefields 2023.11.0
OpenMM 7.7.0
openmmforcefields 0.12.0
opt-einsum 3.3.0
opt-einsum-fx 0.1.4
overrides 7.7.0
packaging 24.0
packmol-memgen 1.2.3rc0
pandarallel 1.6.3
pandas 2.2.2
pandocfilters 1.5.0
panedr 0.8.0
ParmEd 4.2.2
parso 0.8.4
pathtools 0.1.2
pdb4amber 22.0
pdbfixer 1.8.1
pexpect 4.9.0
pickleshare 0.7.5
pillow 10.3.0
Pint 0.23
pip 24.0
pkgutil_resolve_name 1.3.10
platformdirs 4.2.2
ply 3.11
Pmw 2.0.1
posebusters 0.2.13
prefetch-generator 1.0.3
prettytable 3.6.0
ProDy 2.3.1
prometheus_client 0.20.0
promise 2.3
prompt-toolkit 3.0.42
protobuf 3.20.3
psutil 5.9.8
ptyprocess 0.7.0
pure-eval 0.2.2
py-cpuinfo 9.0.0
py3Dmol 2.0.4
pycairo 1.26.0
pycparser 2.22
pydantic 2.7.3
pydantic_core 2.18.4
pyedr 0.8.0
Pygments 2.18.0
pymol 3.0.0
pyMSMT 22.0
pyparsing 3.1.2
PyQt5 5.15.10
PyQt5-sip 12.13.0
PySocks 1.7.1
python-constraint 1.4.0
python-dateutil 2.9.0
python-json-logger 2.0.7
pytraj 2.0.6
pytz 2024.1
PyYAML 6.0.1
pyzmq 26.0.3
rdkit 2023.3.2
referencing 0.35.1
reportlab 4.1.0
requests 2.32.3
rfc3339-validator 0.1.4
rfc3986-validator 0.1.1
rlPyCairo 0.2.0
rpds-py 0.18.1
sander 22.0
scikit-learn 1.1.3
scipy 1.13.1
seaborn 0.13.2
selfies 2.1.1
Send2Trash 1.8.3
sentry-sdk 2.4.0
setproctitle 1.3.3
setuptools 59.5.0
shortuuid 1.0.13
sip 6.7.12
six 1.16.0
smirnoff99frosst 0+unknown
smmap 5.0.1
sniffio 1.3.1
soupsieve 2.5
spyrmsd 0.5.2
SQLAlchemy 2.0.30
stack-data 0.6.2
sympy 1.12.1
tables 3.9.2
tensorboard 2.16.2
tensorboard-data-server 0.7.2
tensorboardX 2.6.2.2
terminado 0.18.1
threadpoolctl 3.5.0
tinycss2 1.3.0
tinydb 4.8.0
toml 0.10.2
tomli 2.0.1
torch 1.13.1+cu117
torch-cluster 1.6.0+pt113cu117
torch_geometric 2.5.3
torch-scatter 2.1.0+pt113cu117
torch-sparse 0.6.16+pt113cu117
torch-spline-conv 1.2.1+pt113cu117
torchaudio 0.13.1
torchmetrics 1.4.0.post0
torchtyping 0.1.4
torchvision 0.14.1+cu117
tornado 6.4
tqdm 4.66.4
traitlets 5.14.3
triton 2.0.0
typeguard 4.0.0
types-python-dateutil 2.9.0.20240316
typing_extensions 4.11.0
typing-utils 0.1.0
tzdata 2024.1
unicodedata2 15.1.0
uri-template 1.3.0
urllib3 2.2.1
validators 0.28.3
wandb 0.13.3
wcwidth 0.2.13
webcolors 1.13
webencodings 0.5.1
websocket-client 1.8.0
Werkzeug 3.0.3
wheel 0.43.0
widgetsnbextension 4.0.11
xmltodict 0.13.0
yapf 0.32.0
yarl 1.9.4
zipp 3.17.0
Thank you for your help!
Hi, I get the same error as above. Has this problem been solved? If so, how?
Hi, guys,
I'm really sorry for the trouble. I have been busy with other projects recently, so I could not keep up with the problems you have encountered. I have now reproduced the problem and provided a solution, see the link.
If you still encounter this problem, please send me the log from the first time it occurred. The logs posted so far appear to come from re-runs, and on a re-run the program automatically loads the lmdb created by the earlier run.
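To see which entries a previously generated lmdb actually contains, something like the following could help. It is a minimal sketch, not part of DiffBindFR: the function names are made up for illustration, you have to locate the lmdb path in your own export directory, and the expected keys (for example `3dbs_protein`) are taken from the error message rather than from any documented naming scheme.

```python
def read_lmdb_keys(path, subdir=True):
    """List all keys stored in an lmdb environment, opened read-only.

    Set subdir=False if `path` points at a single file instead of a directory.
    """
    import lmdb  # imported lazily so the helper below works without py-lmdb

    env = lmdb.open(path, readonly=True, lock=False, subdir=subdir)
    try:
        with env.begin() as txn:
            cursor = txn.cursor()
            # Iterate over keys only, so large stored values are not loaded.
            return [key.decode() for key in cursor.iternext(keys=True, values=False)]
    finally:
        env.close()


def missing_keys(expected, present):
    """Return the expected keys that are absent from `present`, in order."""
    present_set = set(present)
    return [key for key in expected if key not in present_set]
```

For example, `missing_keys(["3dbs_protein"], read_lmdb_keys(path_to_lmdb))` returns a non-empty list if the database from the earlier run lacks that entry, which would match the `not in lmdb` error in the logs.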
Best
Hi, when running the reverse docking example (
python predict.py -l ./reverse/ligand_1.sdf ./reverse/ligand_2.sdf -p ./reverse/receptors -o ./test -np 40 -gpu 0 -cpu 16 -bs 16 -n reverse
), I got an error:
File "/DiffBindFR/druglib/datasets/lmdbdataset.py", line 67, in __getitem__
raise ValueError(f'query index {idx.decode()} not in lmdb.')
ValueError: query index 2src_protein not in lmdb.