gcorso / DiffDock

Implementation of DiffDock: Diffusion Steps, Twists, and Turns for Molecular Docking
https://arxiv.org/abs/2210.01776
MIT License

Unable to get .sdf or .mol2 files to work under Windows #35

Closed rbjacob closed 1 year ago

rbjacob commented 1 year ago

I have been trying to get DiffDock installed on a Windows server so I can test it with our structures and ligands of interest.

I have been running into difficulties. After getting everything installed and all the prerequisite packages working, I get the following errors. They occur when I use either an .sdf or a .mol2 file for the ligand. Even when I supply a SMILES string instead and the program appears to complete, the results make no sense: the molecule basically blows apart, or ends up nowhere near the target PDB structure.

'''
python -m inference --protein_ligand_csv data/protein_ligand_trial3_csv.csv --out_dir results/user_predictions_small3 --inference_steps 20 --samples_per_complex 40 --batch_size 10 --actual_steps 18 --no_final_step_noise
loading data from memory: data/cache_torsion\limit0_INDEX_maxLigSizeNone_H0_recRad15.0_recMax24_esmEmbeddings3467677806\heterographs.pkl
Number of complexes: 1
radius protein: mean 25.799917221069336, std 0.0, max 25.799917221069336
radius molecule: mean 3.531266689300537, std 0.0, max 3.531266689300537
distance protein-mol: mean 11.676636695861816, std 0.0, max 11.676636695861816
rmsd matching: mean 0.0, std 0.0, max 0
HAPPENING | confidence model uses different type of graphs than the score model. Loading (or creating if not existing) the data for the confidence model now.
loading data from memory: data/cache_torsion_allatoms\limit0_INDEX_maxLigSizeNone_H0_recRad15.0_recMax24_atomRad5_atomMax8_esmEmbeddings3467677806\heterographs.pkl
Number of complexes: 1
radius protein: mean 25.799917221069336, std 0.0, max 25.799917221069336
radius molecule: mean 3.7641730308532715, std 0.0, max 3.7641730308532715
distance protein-mol: mean 11.22496223449707, std 0.0, max 11.22496223449707
rmsd matching: mean 0.0, std 0.0, max 0
common t schedule [1. 0.95 0.9 0.85 0.8 0.75 0.7 0.65 0.6 0.55 0.5 0.45 0.4 0.35 0.3 0.25 0.2 0.15 0.1 0.05]
Size of test dataset: 1
0it [00:00, ?it/s]C:\Users\XXXXXXX\Miniconda3\envs\diffdock4\lib\site-packages\e3nn\o3_spherical_harmonics.py:82: UserWarning: FALLBACK path has been taken inside: torch::jit::fuser::cuda::compileCudaFusionGroup. This is an indication that codegen Failed for some reason.
To debug try disable codegen fallback path via setting the env variable export PYTORCH_NVFUSER_DISABLE=fallback
To report the issue, try enable logging via setting the envvariable export PYTORCH_JIT_LOG_LEVEL=manager.cpp
(Triggered internally at C:\actions-runner_work\pytorch\pytorch\builder\windows\pytorch\torch\csrc\jit\codegen\cuda\manager.cpp:244.)
  sh = _spherical_harmonics(self._lmax, x[..., 0], x[..., 1], x[..., 2])
C:\Users\XXXXXXXX\DiffDock-main\utils\torsion.py:60: RuntimeWarning: invalid value encountered in true_divide
  rot_vec = rot_vec * torsion_updates[idx_edge] / np.linalg.norm(rot_vec) # idx_edge!
Failed on ['data/trial-3/Ap_GST_Phi2.pdb____data/trial-3/L_glufosinate.mol2'] linalg.svd: The algorithm failed to converge because the input matrix is ill-conditioned or has too many repeated singular values (error code: 2).
1it [00:07, 7.58s/it]
Failed for 1 complexes
Skipped 0 complexes
Results are in results/user_predictions_small3
'''
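
The RuntimeWarning at torsion.py:60 comes from dividing by np.linalg.norm(rot_vec). A division like that yields NaN when the norm is zero or not finite, and NaN coordinates could plausibly be what makes the later linalg.svd call fail. The following is only a minimal sketch of a guard for that kind of normalization, written in plain Python. The helper name safe_unit_vector and the eps threshold are hypothetical and are not part of DiffDock.

```python
import math

def safe_unit_vector(vec, eps=1e-8):
    """Return vec scaled to unit length, or None if it cannot be normalized.

    Hypothetical helper for illustration; not part of DiffDock.
    """
    norm = math.sqrt(sum(c * c for c in vec))
    # A zero, NaN, or infinite norm cannot produce a valid rotation axis.
    if not math.isfinite(norm) or norm < eps:
        return None
    return [c / norm for c in vec]

# A bond vector between two distinct atom positions normalizes cleanly.
print(safe_unit_vector([3.0, 0.0, 4.0]))
# A zero-length vector (two atoms at the same position) is rejected.
print(safe_unit_vector([0.0, 0.0, 0.0]))
# A vector that already contains NaN is rejected as well.
print(safe_unit_vector([float("nan"), 0.0, 1.0]))
```

A check like this does not explain where bad coordinates come from, but it turns a silent NaN into an explicit failure that is easier to trace back.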

So, OK, maybe I'll just try using SMILES representations instead. When I use the isomeric SMILES string, I get the same failure. I am showing only the warning and error portions.

"""
C:\Users\XXXXXXX\Miniconda3\envs\diffdock4\lib\site-packages\e3nn\o3_spherical_harmonics.py:82: UserWarning: FALLBACK path has been taken inside: torch::jit::fuser::cuda::compileCudaFusionGroup. This is an indication that codegen Failed for some reason.
To debug try disable codegen fallback path via setting the env variable export PYTORCH_NVFUSER_DISABLE=fallback
To report the issue, try enable logging via setting the envvariable export PYTORCH_JIT_LOG_LEVEL=manager.cpp
(Triggered internally at C:\actions-runner_work\pytorch\pytorch\builder\windows\pytorch\torch\csrc\jit\codegen\cuda\manager.cpp:244.)
  sh = _spherical_harmonics(self._lmax, x[..., 0], x[..., 1], x[..., 2])
C:\Users\XXXXXXX\DiffDock-main\utils\torsion.py:60: RuntimeWarning: invalid value encountered in true_divide
  rot_vec = rot_vec * torsion_updates[idx_edge] / np.linalg.norm(rot_vec) # idx_edge!
Failed on ['data/trial-1/Ap_GST_Phi2.pdb____C(CC(=O)NC@@HC(=O)NCC(=O)O)C@@HN'] linalg.svd: The algorithm failed to converge because the input matrix is ill-conditioned or has too many repeated singular values (error code: 2).
1it [00:07, 7.16s/it]
Failed for 1 complexes
Skipped 0 complexes
Results are in results/user_predictions_small1
"""

However, when I list a ligand as a canonical SMILES string, the run completes:

"""
python -m inference --protein_ligand_csv data/protein_ligand_trial3_csv.csv --out_dir results/user_predictions_small3 --inference_steps 20 --samples_per_complex 40 --batch_size 10 --actual_steps 18 --no_final_step_noise
Reading molecules and generating local structures with RDKit
1it [00:00, 22.30it/s]
Reading language model embeddings.
Generating graphs for ligands and proteins
loading complexes: 100%|█████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 2.97it/s]
loading data from memory: data/cache_torsion\limit0_INDEX_maxLigSizeNone_H0_recRad15.0_recMax24_esmEmbeddings3792232284\heterographs.pkl
Number of complexes: 1
radius protein: mean 25.799917221069336, std 0.0, max 25.799917221069336
radius molecule: mean 5.837835311889648, std 0.0, max 5.837835311889648
distance protein-mol: mean 11.18027114868164, std 0.0, max 11.18027114868164
rmsd matching: mean 0.0, std 0.0, max 0
HAPPENING | confidence model uses different type of graphs than the score model. Loading (or creating if not existing) the data for the confidence model now.
Reading molecules and generating local structures with RDKit
1it [00:00, 27.60it/s]
Reading language model embeddings.
Generating graphs for ligands and proteins
loading complexes: 100%|█████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 1.88it/s]
loading data from memory: data/cache_torsion_allatoms\limit0_INDEX_maxLigSizeNone_H0_recRad15.0_recMax24_atomRad5_atomMax8_esmEmbeddings3792232284\heterographs.pkl
Number of complexes: 1
radius protein: mean 25.799917221069336, std 0.0, max 25.799917221069336
radius molecule: mean 6.153195858001709, std 0.0, max 6.153195858001709
distance protein-mol: mean 11.254096031188965, std 0.0, max 11.254096031188965
rmsd matching: mean 0.0, std 0.0, max 0
common t schedule [1. 0.95 0.9 0.85 0.8 0.75 0.7 0.65 0.6 0.55 0.5 0.45 0.4 0.35 0.3 0.25 0.2 0.15 0.1 0.05]
Size of test dataset: 1
0it [00:00, ?it/s]C:\Users\XXXXXXX\Miniconda3\envs\diffdock4\lib\site-packages\e3nn\o3_spherical_harmonics.py:82: UserWarning: FALLBACK path has been taken inside: torch::jit::fuser::cuda::compileCudaFusionGroup. This is an indication that codegen Failed for some reason.
To debug try disable codegen fallback path via setting the env variable export PYTORCH_NVFUSER_DISABLE=fallback
To report the issue, try enable logging via setting the envvariable export PYTORCH_JIT_LOG_LEVEL=manager.cpp
(Triggered internally at C:\actions-runner_work\pytorch\pytorch\builder\windows\pytorch\torch\csrc\jit\codegen\cuda\manager.cpp:244.)
  sh = _spherical_harmonics(self._lmax, x[..., 0], x[..., 1], x[..., 2])
1it [00:57, 57.72s/it]
Failed for 0 complexes
Skipped 0 complexes
Results are in results/user_predictions_small3
"""

You'll notice that even when the run completes successfully, I still get the FALLBACK warning.

I don't know what's going on. Many of the ligands I am interested in are chiral compounds where one isomer is active and the other is not. I want to investigate the differences between the interactions.

Thanks for any assistance.

gcorso commented 1 year ago

Hi @rbjacob, this is very strange and we have never seen this behaviour before. Does the model work on the compounds from PDBBind (i.e. using these instructions)?

lonngxiang commented 1 year ago

How should I fill in the path of a single file?

python -m inference --protein_path data/1au3-target.pdb --ligand data/ligand-sele-h.mol2 --out_dir results/user_predictions_small --inference_steps 20 --samples_per_complex 40 --batch_size 10 --actual_steps 18 --no_final_step_noise


For a single complex: specify the protein with, e.g., --protein_path protein.pdb and the ligand with --ligand ligand.sdf or --ligand "COc(cc1)ccc1C#N"
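
For several complexes, the --protein_ligand_csv option used earlier in this thread takes a CSV file instead. Below is a minimal sketch of generating such a file with Python's csv module. The column names protein_path and ligand are an assumption based on the example CSV shipped with the repository at the time, so check them against your checkout. The helper name write_complex_csv and the output file name are hypothetical.

```python
import csv
import os
import tempfile

def write_complex_csv(csv_path, complexes):
    """Write (protein, ligand) pairs to a CSV file and return the row count.

    Hypothetical helper; the header names are assumed, not confirmed here.
    """
    rows = 0
    with open(csv_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["protein_path", "ligand"])  # assumed column names
        for protein, ligand in complexes:
            # The ligand may be a file path or a SMILES string; the csv
            # module quotes values that contain commas or quotes.
            writer.writerow([protein, ligand])
            rows += 1
    return rows

# Example: one ligand file and one SMILES string for the same protein.
out_path = os.path.join(tempfile.mkdtemp(), "my_complexes.csv")
count = write_complex_csv(out_path, [
    ("data/1au3-target.pdb", "data/ligand-sele-h.mol2"),
    ("data/1au3-target.pdb", "COc(cc1)ccc1C#N"),
])
print(count, "rows written to", out_path)
```

The resulting file would then be passed as --protein_ligand_csv in place of the --protein_path and --ligand pair.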

lonngxiang commented 1 year ago

> Hi @rbjacob, this is very strange and we have never seen this behaviour before. Does the model work on the compounds from PDBBind (i.e. using these instructions)?

I have a similar problem; see https://github.com/gcorso/DiffDock/issues/35#issuecomment-1313337444

rbjacob commented 1 year ago

As an update, I'm not sure this is a DiffDock problem so much as an environment problem. Everything seems to work when I install the CPU-only versions of PyTorch and PyG. I have run across a similar bug report for e3nn (https://github.com/e3nn/e3nn/issues/357) where this issue comes up. It seems to be something in nvfuser, though I'm not sure why I can't get around it.

The bug report offered no solution other than updating PyTorch. The workaround presented there, setting the environment variable "PYTORCH_JIT_USE_NNC_NOT_NVFUSER=1" to bypass nvfuser, doesn't work either.
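
The hints in the PyTorch warning use export, which is Unix shell syntax, so on Windows the variable may never reach the Python process. A minimal sketch of one way to rule that out is to build the environment in Python and launch inference from there. The helper names inference_env and run_inference are hypothetical. The hide_gpus switch sets CUDA_VISIBLE_DEVICES to an empty string so that CUDA sees no devices, and it rests on the assumption that DiffDock then falls back to the CPU, which would mirror the CPU-only install that worked.

```python
import os
import subprocess
import sys

def inference_env(base_env=None, hide_gpus=False):
    """Return a copy of the environment with the nvfuser workaround set.

    Hypothetical helper for illustration; not part of DiffDock.
    """
    env = dict(os.environ if base_env is None else base_env)
    # Workaround suggested in the e3nn bug report mentioned above.
    env["PYTORCH_JIT_USE_NNC_NOT_NVFUSER"] = "1"
    if hide_gpus:
        # An empty value hides all CUDA devices from the child process.
        env["CUDA_VISIBLE_DEVICES"] = ""
    return env

def run_inference(extra_args, hide_gpus=False):
    """Launch `python -m inference` with the adjusted environment."""
    cmd = [sys.executable, "-m", "inference", *extra_args]
    return subprocess.run(cmd, env=inference_env(hide_gpus=hide_gpus), check=False)

# Show the variables a child process would receive.
demo = inference_env({}, hide_gpus=True)
print(sorted(demo.items()))
```

For example, run_inference(["--protein_ligand_csv", "data/protein_ligand_trial3_csv.csv", "--out_dir", "results/user_predictions_small3"], hide_gpus=True) would be called from the DiffDock directory. Whether this actually avoids the nvfuser fallback is not confirmed in this thread.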

lonngxiang commented 1 year ago

@rbjacob Regarding PYTORCH_JIT_USE_NNC_NOT_NVFUSER=1: I also use the CPU version, and when debugging, execution seems to get stuck here:

https://github.com/gcorso/DiffDock/blob/main/inference.py#L188

lonngxiang commented 1 year ago

@rbjacob My error seems to be caused by RDKit, but I haven't found the specific cause yet:

https://github.com/gcorso/DiffDock/blob/main/datasets/process_mols.py#L480


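import numpy as np
from rdkit import Chem
from rdkit.Geometry import Point3D

def write_mol_with_coords(mol, new_coords, path):
    # Overwrite the conformer's atom positions with new_coords, then save as SDF.
    w = Chem.SDWriter(path)
    conf = mol.GetConformer()
    for i in range(mol.GetNumAtoms()):
        x, y, z = new_coords.astype(np.double)[i]
        conf.SetAtomPosition(i, Point3D(x, y, z))
    w.write(mol)
    w.close()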

RDKit reports: no eligible neighbors for chiral center

Strangely enough, according to the log I printed, the error occurs in the write step; but when I extract the results and run the same step outside DiffDock, it works fine.


lonngxiang commented 1 year ago

@rbjacob
I uploaded the same protein and small molecule to this web version, and it works fine:

https://huggingface.co/spaces/simonduerr/diffdock

gcorso commented 1 year ago

Thank you for the discussion, and sorry for not being able to be more helpful. I hope this was resolved in the end!