HannesStark / EquiBind

EquiBind: geometric deep learning for fast predictions of the 3D structure in which a small molecule binds to a protein
MIT License

Not able to run with python 2. DGL-cuda without CUDA GPU. #3

Closed GsGithub17 closed 2 years ago

GsGithub17 commented 2 years ago

python inference.py --config=configs_clean/inference.yml
  File "inference.py", line 119
    sys.stdout = Logger(logpath=os.path.join(os.path.dirname(args.checkpoint), f'inference.log'), syspart=sys.stdout)
                                                                                              ^
SyntaxError: invalid syntax

HannesStark commented 2 years ago

Hi, could you provide a bit more of the error message? I do not know why this is happening. Are you perhaps trying to run this with Python 2?

Thanks!
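For readers hitting the same SyntaxError: the line in the traceback uses an f-string, which only parses on Python 3.6 or newer. A minimal sketch of a version guard is below. The names `MIN_VERSION` and `is_supported` are hypothetical and not part of EquiBind. To be useful, such a guard would have to live in a small launcher file that avoids newer syntax, because a syntax error in a file is raised before any code in that file runs.

```python
import sys

# f-strings (as used on line 119 of inference.py in the traceback above)
# require Python 3.6 or newer.
MIN_VERSION = (3, 6)


def is_supported(version_info=None):
    """Return True if the given (or current) interpreter can parse f-strings."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= MIN_VERSION


def describe(version_info=None):
    """Build a short message; uses %-formatting so Python 2 can parse it too."""
    if version_info is None:
        version_info = sys.version_info
    found = tuple(version_info[:2])
    if is_supported(found):
        return "Python %d.%d is new enough" % found
    return "Python %d.%d+ is required, found %d.%d" % (MIN_VERSION + found)
```

Calling `describe()` in the launcher before importing the real script gives a readable message instead of a bare SyntaxError.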

GsGithub17 commented 2 years ago

Yes, the Python version was the issue. That is now resolved by creating a py37 env. However, I'm now getting a new error.

python inference.py --config=configs_clean/inference.yml
[10:55:12] /opt/dgl/src/runtime/tensordispatch.cc:43: TensorDispatcher: dlopen failed: libtorch_cuda.so: cannot open shared object file: No such file or directory
Using backend: pytorch
[2022-02-13 10:55:13.079302] [ Using Seed : 1 ]
Traceback (most recent call last):
  File "inference.py", line 506, in <module>
    inference_from_files(args)
  File "inference.py", line 333, in inference_from_files
    seed_all(args.seed)
  File "/home/gemechis/EquiBind/commons/utils.py", line 62, in seed_all
    dgl.random.seed(seed)
  File "/home/gemechis/anaconda3/envs/equibind/lib/python3.7/site-packages/dgl/random.py", line 18, in seed
    _CAPI_SetSeed(val)
  File "dgl/_ffi/_cython/./function.pxi", line 287, in dgl._ffi._cy3.core.FunctionBase.__call__
  File "dgl/_ffi/_cython/./function.pxi", line 222, in dgl._ffi._cy3.core.FuncCall
  File "dgl/_ffi/_cython/./function.pxi", line 211, in dgl._ffi._cy3.core.FuncCall3
  File "dgl/_ffi/_cython/./base.pxi", line 155, in dgl._ffi._cy3.core.CALL
dgl._ffi.base.DGLError: [10:55:13] /opt/dgl/src/random/random.cc:34: Check failed: e == CURAND_STATUS_SUCCESS: CURAND Error: CURAND_STATUS_INITIALIZATION_FAILED at /opt/dgl/src/random/random.cc:34
Stack trace:
  [bt] (0) /home/gemechis/anaconda3/envs/equibind/lib/python3.7/site-packages/dgl/libdgl.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x4f) [0x7ff249baa6af]
  [bt] (1) /home/gemechis/anaconda3/envs/equibind/lib/python3.7/site-packages/dgl/libdgl.so(+0xd8ceb6) [0x7ff24a382eb6]
  [bt] (2) /home/gemechis/anaconda3/envs/equibind/lib/python3.7/site-packages/dgl/libdgl.so(DGLFuncCall+0x48) [0x7ff24a390808]
  [bt] (3) /home/gemechis/anaconda3/envs/equibind/lib/python3.7/site-packages/dgl/_ffi/_cy3/core.cpython-37m-x86_64-linux-gnu.so(+0x17663) [0x7ff23747b663]
  [bt] (4) /home/gemechis/anaconda3/envs/equibind/lib/python3.7/site-packages/dgl/_ffi/_cy3/core.cpython-37m-x86_64-linux-gnu.so(+0x1798b) [0x7ff23747b98b]
  [bt] (5) python(_PyObject_FastCallKeywords+0x47b) [0x56552cc2472b]
  [bt] (6) python(+0x13b269) [0x56552cc25269]
  [bt] (7) python(_PyEval_EvalFrameDefault+0xaab) [0x56552cc9785b]
  [bt] (8) python(_PyFunction_FastCallKeywords+0x184) [0x56552cc221d4]

HannesStark commented 2 years ago

Could it be that you do not have a CUDA GPU?

HannesStark commented 2 years ago

I have now added an environment_cpuonly.yml, which you can use if you do not have a CUDA-enabled GPU. You can install the environment as described in the README:

conda env create -f environment_cpuonly.yml
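As a rough illustration of choosing between the two environments, the sketch below builds the `conda env create` command depending on whether an NVIDIA driver appears to be installed. This is only a heuristic and rests on assumptions: `environment.yml` is assumed to be the name of the GPU environment file (only `environment_cpuonly.yml` is named in this thread), and finding `nvidia-smi` on the PATH is taken as a proxy for having a CUDA GPU. The function names are hypothetical.

```python
import shutil

# Assumption: "environment.yml" is the GPU environment file name.
# Only environment_cpuonly.yml is confirmed by this thread.
GPU_ENV_FILE = "environment.yml"
CPU_ENV_FILE = "environment_cpuonly.yml"


def has_nvidia_driver():
    """Heuristic: nvidia-smi being on PATH suggests a CUDA-capable setup."""
    return shutil.which("nvidia-smi") is not None


def conda_create_command(gpu_available=None):
    """Return the conda command (as an argument list) for this machine."""
    if gpu_available is None:
        gpu_available = has_nvidia_driver()
    env_file = GPU_ENV_FILE if gpu_available else CPU_ENV_FILE
    return ["conda", "env", "create", "-f", env_file]
```

The returned list can be passed to `subprocess.run(...)` or simply printed and run by hand.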

GsGithub17 commented 2 years ago

Thank you! Yes, the current machine doesn't have CUDA. I'll try recreating a new conda env without GPU and let you know.

GsGithub17 commented 2 years ago

Successfully installed without CUDA. But when I run a prediction, it looks for ligand files in SDF format; I thought PDBQT files were also supported.

python inference.py --config=configs_clean/inference.yml
Using backend: pytorch
[2022-02-13 13:11:42.865877] [ Using Seed : 1 ]
0%| | 0/1 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "inference.py", line 506, in <module>
    inference_from_files(args)
  File "inference.py", line 357, in inference_from_files
    lig = read_molecule(os.path.join(args.inference_path, name, f'{name}_ligand.sdf'), sanitize=True)
  File "/home/gemechis/EquiBind/commons/process_mols.py", line 1176, in read_molecule
    supplier = Chem.SDMolSupplier(molecule_file, sanitize=False, removeHs=False)
OSError: File error: Bad input file /home/gemechis/EquiBind/inputdata/vo/vo_ligand.sdf

GsGithub17 commented 2 years ago

I converted the PDBQT ligand file to SDF using Open Babel (obabel -ipdbqt vo_ligand.pdbqt -osdf -Ovo_ligand.sdf) and reran the prediction. Now I'm getting the following error associated with the receptor; it sounds like it is looking for a file.

python inference.py --config=configs_clean/inference.yml
Using backend: pytorch
[2022-02-13 13:29:14.882930] [ Using Seed : 1 ]
0%| | 0/1 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "inference.py", line 506, in <module>
    inference_from_files(args)
  File "inference.py", line 361, in inference_from_files
    rec, rec_coords, c_alpha_coords, n_coords, c_coords = get_receptor(rec_path,lig, cutoff=dp['chain_radius'])
  File "/home/gemechis/EquiBind/commons/process_mols.py", line 295, in get_receptor
    structure = biopython_parser.get_structure('random_id', rec_path)
  File "/home/gemechis/anaconda3/envs/equibind/lib/python3.7/site-packages/Bio/PDB/PDBParser.py", line 96, in get_structure
    with as_handle(file) as handle:
  File "/home/gemechis/anaconda3/envs/equibind/lib/python3.7/contextlib.py", line 112, in __enter__
    return next(self.gen)
  File "/home/gemechis/anaconda3/envs/equibind/lib/python3.7/site-packages/Bio/File.py", line 72, in as_handle
    with open(handleish, mode, **kwargs) as fp:
FileNotFoundError: [Errno 2] No such file or directory: '/home/gemechis/EquiBind/inputdata/vo/rec.pdb'
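Both file errors in this thread come from paths that inference.py builds per complex: the ligand path `{name}_ligand.sdf` and the receptor path ending in `rec.pdb`, as shown in the tracebacks. A small pre-flight check along these lines could surface missing inputs before the model is loaded. This is a sketch only: `missing_inputs` is a hypothetical helper, and the two file names are taken from the tracebacks in this thread, so they may differ in other versions of the repository.

```python
import os


def missing_inputs(inference_path, name, receptor_file="rec.pdb"):
    """Return the expected input files that are absent for one complex.

    File names follow the tracebacks in this thread and are assumptions
    about the layout, not a documented interface.
    """
    folder = os.path.join(inference_path, name)
    expected = [
        os.path.join(folder, name + "_ligand.sdf"),
        os.path.join(folder, receptor_file),
    ]
    return [path for path in expected if not os.path.isfile(path)]
```

For the setup in this thread, `missing_inputs("/home/gemechis/EquiBind/inputdata", "vo")` would list whichever of the two files is not yet in place.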

GsGithub17 commented 2 years ago

I tried changing the receptor file and reran the prediction; however, I got the following response.

python inference.py --config=configs_clean/inference.yml
Using backend: pytorch
[2022-02-13 13:34:06.153129] [ Using Seed : 1 ]
0%| | 0/1 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "inference.py", line 506, in <module>
    inference_from_files(args)
  File "inference.py", line 389, in inference_from_files
    ligs_coords_pred_untuned, ligs_keypts, recs_keypts, rotations, translations, geom_reg_loss = model(lig_graph.to(device), rec_graph.to(device), geometry_graph.to(device),complex_names=[name], epoch=0)
  File "/home/gemechis/anaconda3/envs/equibind/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/gemechis/EquiBind/models/equibind.py", line 1020, in forward
    outputs = self.iegmn(lig_graph, rec_graph, geometry_graph, complex_names, epoch)
  File "/home/gemechis/anaconda3/envs/equibind/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/gemechis/EquiBind/models/equibind.py", line 824, in forward
    geometry_graph=geometry_graph
  File "/home/gemechis/anaconda3/envs/equibind/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/gemechis/EquiBind/models/equibind.py", line 471, in forward
    lig_graph.update_all(self.update_x_moment_lig, fn.mean('m', 'x_update'))
  File "/home/gemechis/anaconda3/envs/equibind/lib/python3.7/site-packages/dgl/heterograph.py", line 4849, in update_all
    ndata = core.message_passing(g, message_func, reduce_func, apply_node_func)
  File "/home/gemechis/anaconda3/envs/equibind/lib/python3.7/site-packages/dgl/core.py", line 334, in message_passing
    ndata = invoke_gspmm(g, fn.copy_e(msg, msg), rfunc, edata=msgdata)
  File "/home/gemechis/anaconda3/envs/equibind/lib/python3.7/site-packages/dgl/core.py", line 297, in invoke_gspmm
    z = op(graph, x)
  File "/home/gemechis/anaconda3/envs/equibind/lib/python3.7/site-packages/dgl/ops/spmm.py", line 193, in func
    return gspmm(g, 'copy_rhs', reduce_op, None, x)
  File "/home/gemechis/anaconda3/envs/equibind/lib/python3.7/site-packages/dgl/ops/spmm.py", line 77, in gspmm
    lhs_data, rhs_data)
  File "/home/gemechis/anaconda3/envs/equibind/lib/python3.7/site-packages/dgl/backend/pytorch/sparse.py", line 503, in gspmm
    return GSpMM.apply(gidx, op, reduce_op, lhs_data, rhs_data)
  File "/home/gemechis/anaconda3/envs/equibind/lib/python3.7/site-packages/torch/cuda/amp/autocast_mode.py", line 94, in decorate_fwd
    return fwd(*args, **kwargs)
  File "/home/gemechis/anaconda3/envs/equibind/lib/python3.7/site-packages/dgl/backend/pytorch/sparse.py", line 123, in forward
    out, (argX, argY) = _gspmm(gidx, op, reduce_op, X, Y)
  File "/home/gemechis/anaconda3/envs/equibind/lib/python3.7/site-packages/dgl/sparse.py", line 162, in _gspmm
    arg_e_nd)
  File "dgl/_ffi/_cython/./function.pxi", line 287, in dgl._ffi._cy3.core.FunctionBase.__call__
  File "dgl/_ffi/_cython/./function.pxi", line 232, in dgl._ffi._cy3.core.FuncCall
  File "dgl/_ffi/_cython/./base.pxi", line 155, in dgl._ffi._cy3.core.CALL
dgl._ffi.base.DGLError: [13:34:06] /opt/dgl/src/array/cpu/./spmm_blocking_libxsmm.h:267: Failed to generate libxsmm kernel for the SpMM operation!
Stack trace:
  [bt] (0) /home/gemechis/anaconda3/envs/equibind/lib/python3.7/site-packages/dgl/libdgl.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x4f) [0x7fcd494ae08f]
  [bt] (1) /home/gemechis/anaconda3/envs/equibind/lib/python3.7/site-packages/dgl/libdgl.so(void dgl::aten::cpu::SpMMRedopCsrOpt<int, float, dgl::aten::cpu::op::CopyRhs, dgl::aten::cpu::op::Add >(dgl::BcastOff const&, dgl::aten::CSRMatrix const&, dgl::runtime::NDArray, dgl::runtime::NDArray, dgl::runtime::NDArray, dgl::runtime::NDArray, dgl::runtime::NDArray)+0x3bc) [0x7fcd49670e7c]
  [bt] (2) /home/gemechis/anaconda3/envs/equibind/lib/python3.7/site-packages/dgl/libdgl.so(void dgl::aten::cpu::SpMMSumCsrLibxsmm<int, float, dgl::aten::cpu::op::CopyRhs >(dgl::BcastOff const&, dgl::aten::CSRMatrix const&, dgl::runtime::NDArray, dgl::runtime::NDArray, dgl::runtime::NDArray)+0x73) [0x7fcd49670f23]
  [bt] (3) /home/gemechis/anaconda3/envs/equibind/lib/python3.7/site-packages/dgl/libdgl.so(void dgl::aten::cpu::SpMMSumCsr<int, float, dgl::aten::cpu::op::CopyRhs >(dgl::BcastOff const&, dgl::aten::CSRMatrix const&, dgl::runtime::NDArray, dgl::runtime::NDArray, dgl::runtime::NDArray)+0x146) [0x7fcd496998b6]
  [bt] (4) /home/gemechis/anaconda3/envs/equibind/lib/python3.7/site-packages/dgl/libdgl.so(void dgl::aten::SpMMCsr<1, int, 32>(std::string const&, std::string const&, dgl::BcastOff const&, dgl::aten::CSRMatrix const&, dgl::runtime::NDArray, dgl::runtime::NDArray, dgl::runtime::NDArray, std::vector<dgl::runtime::NDArray, std::allocator >)+0xfeb) [0x7fcd496a339b]
  [bt] (5) /home/gemechis/anaconda3/envs/equibind/lib/python3.7/site-packages/dgl/libdgl.so(dgl::aten::SpMM(std::string const&, std::string const&, std::shared_ptr, dgl::runtime::NDArray, dgl::runtime::NDArray, dgl::runtime::NDArray, std::vector<dgl::runtime::NDArray, std::allocator >)+0x1004) [0x7fcd496d2284]
  [bt] (6) /home/gemechis/anaconda3/envs/equibind/lib/python3.7/site-packages/dgl/libdgl.so(+0x467e08) [0x7fcd496dce08]
  [bt] (7) /home/gemechis/anaconda3/envs/equibind/lib/python3.7/site-packages/dgl/libdgl.so(+0x4683a1) [0x7fcd496dd3a1]
  [bt] (8) /home/gemechis/anaconda3/envs/equibind/lib/python3.7/site-packages/dgl/libdgl.so(DGLFuncCall+0x48) [0x7fcd49c7f278]

HannesStark commented 2 years ago

Thank you for pointing out the issue with the different file formats. I fixed it and updated the description in the README.

I am not sure what exactly causes your error dgl._ffi.base.DGLError: /opt/dgl/src/array/cpu/./spmm_blocking_libxsmm.h:267: Failed to generate libxsmm kernel for the SpMM operation!

However, DGL has an issue for this error: #3459. Is that related to your situation?

GsGithub17 commented 2 years ago

That's right. I'll also try two other machines to check whether this is machine-dependent, and I'll let you know ASAP.
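When comparing runs across machines, it can help to record the same basic facts on each one. The sketch below uses only the Python standard library. `machine_report` is a hypothetical helper, and the chosen fields are a guess at what might matter for a machine-dependent kernel failure, not a confirmed diagnosis.

```python
import platform
import sys


def machine_report():
    """Collect basic facts that tend to differ between machines."""
    return {
        "machine": platform.machine(),        # e.g. CPU architecture string
        "system": platform.system(),          # e.g. operating system name
        "processor": platform.processor(),    # may be empty on some systems
        "python": platform.python_version(),
        "executable": sys.executable,         # shows which conda env is active
    }


def format_report(report):
    """Render the report as sorted 'key: value' lines for pasting into an issue."""
    return "\n".join("%s: %s" % (key, report[key]) for key in sorted(report))
```

Running `print(format_report(machine_report()))` on each machine gives output that can be compared line by line.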

HannesStark commented 2 years ago

Closing the issue since there is no follow-up.

NiklasTR commented 2 years ago

Would love to hear about a follow-up, @GsGithub17. I'm currently setting up a very simple Dockerfile for this project at labdao/lab-equibind and ran into this error when running CPU-only on my M1 Mac locally. I will ask another community member to run it on their infrastructure soon.