Open lh12565 opened 2 years ago
Hi @mattragoza, I ran the code with "python3 generate.py config/generate.config" and also got an error:
Setting random seed to 0
Loading data
Initializing generative model
Loading generative model state
Initializing atom fitter
Initializing bond adder
Initializing output writer
Starting to generate grids
Calling generator (prior=0, stage2=False)
Getting next batch of data
==============================
*** Open Babel Warning in PerceiveBondOrders
Failed to kekulize aromatic bonds in OBMol::PerceiveBondOrders (title is data/crossdock2020/B4GT1_HUMAN_125_398_0/2ah9_B_rec.pdb)
Traceback (most recent call last):
File "/NAS/lh/software/liGAN/new/LiGAN/generate.py", line 47, in <module>
main(sys.argv[1:])
File "/NAS/lh/software/liGAN/new/LiGAN/generate.py", line 42, in main
generator.generate(**config['generate'])
File "/NAS/lh/software/liGAN/new/LiGAN/ligan/generating.py", line 302, in generate
) = self.forward(
File "/NAS/lh/software/liGAN/new/LiGAN/ligan/generating.py", line 226, in forward
lig_gen_grids, latents, _, _ = self.gen_model(
File "/NAS/lh/software/miniconda3/envs/LiGAN/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/NAS/lh/software/liGAN/new/LiGAN/ligan/models.py", line 1130, in forward
(means, log_stds), _ = self.input_encoder(inputs)
File "/NAS/lh/software/miniconda3/envs/LiGAN/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/NAS/lh/software/liGAN/new/LiGAN/ligan/models.py", line 682, in forward
outputs = f(inputs)
File "/NAS/lh/software/miniconda3/envs/LiGAN/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/NAS/lh/software/liGAN/new/LiGAN/ligan/models.py", line 361, in forward
identity = self.skip_conv(inputs) if i == 0 else inputs
File "/NAS/lh/software/miniconda3/envs/LiGAN/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/NAS/lh/software/miniconda3/envs/LiGAN/lib/python3.9/site-packages/torch/nn/modules/container.py", line 139, in forward
input = module(input)
File "/NAS/lh/software/miniconda3/envs/LiGAN/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1137, in _call_impl
result = hook(self, input)
File "/NAS/lh/software/miniconda3/envs/LiGAN/lib/python3.9/site-packages/torch/nn/utils/spectral_norm.py", line 105, in __call__
setattr(module, self.name, self.compute_weight(module, do_power_iteration=module.training))
File "/NAS/lh/software/miniconda3/envs/LiGAN/lib/python3.9/site-packages/torch/nn/utils/spectral_norm.py", line 84, in compute_weight
v = normalize(torch.mv(weight_mat.t(), u), dim=0, eps=self.eps, out=v)
RuntimeError: CUDA error: CUBLAS_STATUS_INVALID_VALUE when calling `cublasSgemv(handle, op, m, n, &alpha, a, lda, x, incx, &beta, y, incy)`
This seems to be related to a dimension mismatch in nn.Linear, but I don't know how to solve it. Thanks!
Hm, I have not seen this error. This post on the PyTorch forum suggests that it might be resolved by reducing the batch size. I don't think it's a dimension mismatch, as these tests should all be passing if you are up to date with the master branch.
How much GPU memory does your card have?
I tried to reduce the batch size to 1, but the error is still there. My GPU memory is 32G.
If you run the code on the CPU, it may give a more informative error message. I just pushed a commit that lets you control this by setting the config option device: cpu. Can you try that and post the error message?
Did you mean adding device: cpu to config/generate.config, like this:
--- # new_atom_typing/generate/gen0_CVAE-1.6_1zyu_A_rec_0_1.0_1.0/generate.config
out_prefix: DEMO
model_type: CVAE
random_seed: 0
verbose: True
device: cpu
It still reported the same error.
Did you mean add device: cpu to the config/generate.config? like this:
Yes.
It still reported the same error.
Did you git pull?
After updating with 'git pull --rebase', it works. But there are still some warnings. Part of the output is below:
Setting random seed to 0
Loading data
Initializing generative model
Loading generative model state
Initializing atom fitter
Initializing bond adder
Initializing output writer
Starting to generate grids
Calling generator (prior=0, stage2=False)
Getting next batch of data
==============================
*** Open Babel Warning in PerceiveBondOrders
Failed to kekulize aromatic bonds in OBMol::PerceiveBondOrders (title is /NAS/lh/project/sclc/pharm/data/3htb/analy/rec.pdb)
Getting real input molecule from data root
.
Minimizing real molecule with gnina
_
(_)
__ _ _ __ _ _ __ __ _
/ _` | '_ \| | '_ \ / _` |
| (_| | | | | | | | | (_| |
\__, |_| |_|_|_| |_|\__,_|
__/ |
|___/
gnina v1.0 HEAD:6381355 Built Mar 6 2021.
gnina is based on smina and AutoDock Vina.
Please cite appropriately.
Commandline: gnina --minimize -r /NAS/lh/project/sclc/pharm/data/3htb/analy/rec.pdb -l /tmp/tmpznldvd27.sdf.gz --autobox_ligand /tmp/tmpznldvd27.sdf.gz -o /tmp/tmpx7l0p6ao.sdf.gz
Affinity: 0.00000 0.00000 (kcal/mol)
RMSD: 0.00000
CNNscore: 0.84549
CNNaffinity: 1.62269
CNNvariance: 0.03050
GNINA STDERR
==============================
*** Open Babel Warning in Init
Cannot initialize database 'space-groups.txt' which may cause further errors.
==============================
*** Open Babel Warning in PerceiveBondOrders
Failed to kekulize aromatic bonds in OBMol::PerceiveBondOrders
END GNINA STDERR
/NAS/lh/software/miniconda3/envs/LiGAN/lib/python3.9/site-packages/torch/cuda/memory.py:278: FutureWarning: torch.cuda.reset_max_memory_allocated now calls torch.cuda.reset_peak_memory_stats, which resets /all/ peak memory stats.
warnings.warn(
[example_idx=0 sample_idx=0 grid_type=rec] norm=23.7528 gpu=0.0000
Writing molecules/DEMO_0_rec_src.sdf.gz sample 0
[example_idx=0 sample_idx=0 grid_type=lig] norm=5.2755 gpu=0.0000
Writing molecules/DEMO_0_lig_src.sdf.gz sample 0
Writing molecules/DEMO_0_lig_src_pkt.sdf.gz sample 0
Writing molecules/DEMO_0_lig_src_uff.sdf.gz sample 0
Writing molecules/DEMO_0_lig_src_gna.sdf.gz sample 0
[example_idx=0 sample_idx=0 grid_type=lig_gen] norm=2.5592 gpu=0.0000
Writing latents/DEMO_0_lig_gen_0.latent
Fitting atoms to generated grid
Adding bonds to atoms from generated grid
Minimizing molecule from generated grid with UFFMinimizing molecule from generated grid with gnina
Writing molecules/DEMO_0_lig_gen_fit_add.sdf.gz sample 0
Writing molecules/DEMO_0_lig_gen_fit_add_pkt.sdf.gz sample 0
Writing molecules/DEMO_0_lig_gen_fit_add_uff.sdf.gz sample 0
Writing molecules/DEMO_0_lig_gen_fit_add_gna.sdf.gz sample 0
Computing metrics for example 0 sample 0
reading NP model ...
model in
sample_idx 0
lig_grid_norm 5.275463
lig_grid_elem_norm 2.359259
lig_grid_prop_norm 4.718518
lig_rec_prod 0.0
lig_rec_elem_prod 0.0
lig_rec_prop_prod 0.0
lig_n_atoms 3.0
lig_radius 0.0
lig_n_frags 1.0
lig_valid True
lig_reason Valid molecule
lig_MW 18.015
lig_logP -0.8247
lig_QED 0.327748
lig_SAS 7.50673
lig_NPS 0.0
lig_SMILES [H]O[H]
lig_UFF_init 0.440598
lig_UFF_min 0.0
lig_UFF_rmsd 0.700301
lig_UFF_error NaN
lig_UFF_time 0.00251
lig_vina_aff 0.0
lig_vina_rmsd 0.0
lig_cnn_pose 0.845489
lig_cnn_aff 1.622691
lig_gnina_error NaN
lig_gen_grid_norm 2.559225
lig_gen_grid_elem_norm 1.275613
lig_gen_grid_prop_norm 2.218825
lig_gen_L2_loss 7.230287
lig_gen_elem_L2_loss 1.449093
lig_gen_prop_L2_loss 5.781195
lig_gen_shape_sim 0.045793
lig_gen_rec_prod 0.0
lig_gen_rec_elem_prod 0.0
lig_gen_rec_prop_prod 0.0
lig_latent_norm 10.682964
lig_latent_variance NaN
lig_gen_fit_grid_norm 0.0
lig_gen_fit_grid_elem_norm 0.0
lig_gen_fit_grid_prop_norm 0.0
lig_gen_fit_L2_loss 3.27683
lig_gen_fit_elem_L2_loss 0.81407
lig_gen_fit_prop_L2_loss 2.46276
lig_gen_fit_shape_sim 0.0
lig_gen_fit_rec_prod 0.0
lig_gen_fit_rec_elem_prod 0.0
lig_gen_fit_rec_prop_prod 0.0
lig_gen_fit_n_atoms 0.0
lig_gen_fit_radius NaN
lig_gen_fit_n_atoms_diff 1.0
lig_gen_fit_type_diff 5.0
lig_gen_fit_elem_diff 1.0
lig_gen_fit_prop_diff 4.0
lig_gen_fit_RMSD NaN
lig_gen_fit_time 2.773758
lig_gen_fit_n_visited 3.0
lig_gen_est_type_diff NaN
lig_gen_est_exact_types False
lig_gen_fit_add_n_atoms 0.0
lig_gen_fit_add_n_frags 0.0
lig_gen_fit_add_valid False
lig_gen_fit_add_reason No atoms
lig_gen_fit_add_MW 0.0
lig_gen_fit_add_logP 0.0
lig_gen_fit_add_QED 0.339424
lig_gen_fit_add_SAS NaN
lig_gen_fit_add_NPS NaN
lig_gen_fit_add_SMILES
lig_gen_fit_add_n_atoms_diff 0.0
lig_gen_fit_add_SMILES_match False
lig_gen_fit_add_ob_sim NaN
lig_gen_fit_add_rdkit_sim NaN
lig_gen_fit_add_morgan_sim NaN
lig_gen_fit_add_maccs_sim NaN
lig_gen_fit_add_UFF_init NaN
lig_gen_fit_add_UFF_min NaN
lig_gen_fit_add_UFF_rmsd NaN
lig_gen_fit_add_UFF_error No atoms
lig_gen_fit_add_UFF_time 0.000749
lig_gen_fit_add_UFF_init_diff NaN
lig_gen_fit_add_UFF_min_diff NaN
lig_gen_fit_add_UFF_rmsd_diff NaN
lig_gen_fit_add_vina_aff NaN
lig_gen_fit_add_vina_rmsd NaN
lig_gen_fit_add_cnn_pose NaN
lig_gen_fit_add_cnn_aff NaN
lig_gen_fit_add_gnina_error No atoms
lig_gen_fit_add_vina_aff_diff NaN
lig_gen_fit_add_vina_rmsd_diff NaN
lig_gen_fit_add_cnn_pose_diff NaN
lig_gen_fit_add_cnn_aff_diff NaN
lig_gen_fit_add_radius NaN
lig_gen_fit_add_type_diff 0.0
lig_gen_fit_add_elem_diff 0.0
lig_gen_fit_add_prop_diff 0.0
lig_gen_fit_add_RMSD NaN
Writing DEMO.gen_metrics
Writing DEMO.pymol
Freeing memory
I am using a protein with no ligand. As you said in #44, I can use any ligand (such as tests/input/O_0_0_0.sdf) for prior sampling. I also don't know where the binding pocket is, so I only listed the protein and O_0_0_0.sdf in the data file, like this:
0 0 0 rec.pdb O_0_0_0.sdf
The output is all H2O:
0
RDKit 3D
3 2 0 0 0 0 0 0 0 0999 V2000
0.0000 0.0000 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0
0.0000 0.0000 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0
0.0000 0.0000 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0
1 2 1 0
1 3 1 0
M END
$$$$
I am a beginner, so I have some questions: (1) Can these warnings be ignored? Can I continue to use CUDA? (2) Which parameters switch between prior sampling and posterior sampling? (3) When there is no ligand and the binding pocket is unknown, can I directly provide the following data file for drug design?
0 0 0 rec.pdb O_0_0_0.sdf
(4) If the answer to (3) is yes, why are all my results the same molecule, H2O? (5) What do the output files represent? (lig_gen_fit_add_gna.sdf.gz, lig_gen_fit_add_pkt.sdf.gz, lig_gen_fit_add.sdf.gz, lig_gen_fit_add_uff.sdf.gz, lig_src_gna.sdf.gz, lig_src_pkt.sdf.gz, lig_src.sdf.gz, lig_src_uff.sdf.gz, *rec_src.sdf.gz) Thanks!
Are these warnings ignored?
You can ignore OpenBabel warnings.
Can I continue to use CUDA?
If the code works when you use device: cpu, it should work with device: cuda as well.
Change which parameters can distinguish the prior sampling and the posterior?
The configuration setting generate: prior: False/True controls whether you sample from the prior or posterior.
When there is no ligand and the binding pocket is unknown, can I directly provide the following files for drug design?
Yes, you should be able to do prior sampling with an empty/dummy input ligand. However, note that the receptor and ligand grids are centered on the input ligand file, and generated ligands will be centered there accordingly. So in this case, the grid will be centered at the origin, which might result in the receptor grid containing an arbitrary location on the receptor structure. It might not contain any receptor structure at all, depending on the receptor structure coordinates.
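A rough way to check this before running generation is to count how many receptor atoms fall inside a cube around the intended grid center. The sketch below is not part of LiGAN: the function names are made up for illustration, the PDB parsing relies on standard fixed columns, and box_size is a hypothetical edge length in angstroms that you would need to match to your own grid settings.

```python
def parse_pdb_coords(pdb_text):
    """Return (x, y, z) tuples for ATOM/HETATM records (fixed PDB columns)."""
    coords = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            coords.append((
                float(line[30:38]),  # x, columns 31-38
                float(line[38:46]),  # y, columns 39-46
                float(line[46:54]),  # z, columns 47-54
            ))
    return coords


def atoms_in_box(coords, center, box_size):
    """Count atoms inside an axis-aligned cube of edge box_size at center.

    box_size is an assumption here, not a value taken from LiGAN.
    """
    half = box_size / 2.0
    return sum(
        1 for atom in coords
        if all(abs(a - c) <= half for a, c in zip(atom, center))
    )
```

If atoms_in_box(coords, (0.0, 0.0, 0.0), box_size) returns zero or a very small number for your receptor, a dummy ligand at the origin will give the model almost nothing to condition on.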
why all my results are the same: H2O?
This is probably because the grid is not centered correctly and contains very little of the receptor structure, so the model is generating a ligand based on mostly empty space. It's OK if you do not have an input ligand, but you at least need to decide where the grid will be centered, by changing the coordinates of the dummy atom in the file O_0_0_0.sdf.
We have not tested this generative model on receptors with unknown binding pockets, so I can't say what the result will be or what the best approach is for discovering potential binding pockets. But at the very least you need to consider the fact that the entire receptor structure is unlikely to fit into the grid, so you need to decide where to center the grid. You could possibly try scanning the entire receptor structure by generating many dummy ligand files centered on random locations or on a uniform grid spanning the coordinate domain of the receptor structure, then generating ligands at each location and assessing their gnina score. I don't know how effective this would be, but it's one possible approach!
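The uniform-grid variant of that scan could be sketched roughly as follows. This is an untested illustration, not LiGAN code: all function names are hypothetical, the dummy ligand is written as a minimal single-oxygen V2000 record (which may differ from the actual contents of tests/input/O_0_0_0.sdf), and the "0 0 0" prefix of the data file line is simply copied from the example earlier in this thread.

```python
import itertools


def bounding_box(coords):
    """Return the (min corner, max corner) of a list of (x, y, z) tuples."""
    xs, ys, zs = zip(*coords)
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))


def axis_points(lo, hi, spacing):
    """Evenly spaced points from lo up to at most hi."""
    n = int((hi - lo) // spacing) + 1
    return [lo + i * spacing for i in range(n)]


def grid_centers(coords, spacing):
    """Candidate grid centers on a uniform lattice over the receptor's box."""
    lo, hi = bounding_box(coords)
    axes = [axis_points(l, h, spacing) for l, h in zip(lo, hi)]
    return list(itertools.product(*axes))


def dummy_oxygen_sdf(center, title="dummy"):
    """Minimal V2000 SDF record with a single oxygen atom at center."""
    x, y, z = center
    return "\n".join([
        title,
        "",  # program/timestamp line (left blank)
        "",  # comment line (left blank)
        "  1  0  0  0  0  0  0  0  0  0999 V2000",
        "%10.4f%10.4f%10.4f O   0  0  0  0  0  0  0  0  0  0  0  0" % (x, y, z),
        "M  END",
        "$$$$",
        "",
    ])


def data_file_line(rec_path, lig_path):
    """One data file row, following the '0 0 0 rec.pdb O_0_0_0.sdf' example."""
    return "0 0 0 %s %s" % (rec_path, lig_path)
```

Each center would get its own dummy ligand file and its own data file row. The spacing is a free choice: a smaller value scans more densely but multiplies the number of generation runs.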
What do the output files represent?
*_rec_src.sdf.gz The input receptor structure
*_lig_src.sdf.gz The input ligand structure
*_lig_src_uff.sdf.gz The input ligand minimized with UFF (internal energy)
*_lig_src_gna.sdf.gz The input ligand minimized with gnina (wrt receptor structure)
*_lig_src_pkt.sdf.gz The receptor pocket used for gnina minimization
*_lig_gen_fit_add.sdf.gz The generated molecule after bond adding
*_lig_gen_fit_add_uff.sdf.gz The generated molecule after UFF minimization (internal energy)
*_lig_gen_fit_add_gna.sdf.gz The generated molecule after gnina minimization (wrt receptor structure)
*_lig_gen_fit_add_pkt.sdf.gz The receptor pocket used for gnina minimization
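To inspect these files quickly without a cheminformatics toolkit, something like the sketch below can list the atom count of every record in a gzipped SDF file. It is an illustration only (the function names are made up) and it assumes V2000 records, where the first three characters of the fourth line hold the atom count. Generated files that contain only 0-atom or 3-atom records correspond to the "No atoms" and H2O results discussed above.

```python
import gzip


def read_sdf_records(path):
    """Split a gzipped SDF file into records, each a list of lines."""
    records, current = [], []
    with gzip.open(path, "rt") as f:
        for line in f:
            line = line.rstrip("\r\n")
            if line.strip() == "$$$$":  # record terminator
                records.append(current)
                current = []
            else:
                current.append(line)
    return records


def atom_counts(path):
    """Atom count of each record, read from the V2000 counts line."""
    return [int(rec[3][:3]) for rec in read_sdf_records(path)]
```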
Hi @mattragoza, I guess I can use a docking program, such as AutoDock, to determine the ligand coordinates and the grids, right? Thanks!
Yes, that would work too.
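For example, the mean atom position of a docked pose gives a candidate grid center. The sketch below reads the coordinates from a V2000 mol block and averages them. It is only an illustration: the function names are hypothetical, and it assumes that "centered on the input ligand" means the mean of the atom coordinates, which this thread does not confirm.

```python
def mol_block_coords(mol_block):
    """Return (x, y, z) tuples from the atom block of a V2000 mol block."""
    lines = mol_block.splitlines()
    n_atoms = int(lines[3][:3])  # counts line: first 3 characters
    coords = []
    for line in lines[4:4 + n_atoms]:
        coords.append((
            float(line[0:10]),
            float(line[10:20]),
            float(line[20:30]),
        ))
    return coords


def centroid(coords):
    """Unweighted mean position of a non-empty list of (x, y, z) tuples."""
    n = len(coords)
    return tuple(sum(c[i] for c in coords) / n for i in range(3))
```

The resulting point could then be used as the coordinates of the dummy atom, or the docked pose itself could be given as the input ligand file.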
Hi @mattragoza, I still have some issues to solve. (1) I tried to run two cases (using prior: 0): a protein with a known ligand and pocket, and a protein with the docked O_0_0_0.sdf. In the first case, the output molecules are very similar to the known ligand, with only small changes such as H to OH or a single bond to a double bond, and the molecules do not extend further into the pocket. The second case has the same issue as above: all results are H2O, but in a different part of the pocket. Are these normal results, or do I need to change parameters or something else? (2) When I change the device to cuda, I still get the same error as before:
RuntimeError: CUDA error: CUBLAS_STATUS_INVALID_VALUE when calling `cublasSgemv(handle, op, m, n, &alpha, a, lda, x, incx, &beta, y, incy)`
(3) Before running LiGAN, do I need to prepare the protein or the ligand in some other way, such as adding hydrogens and charges? Thanks!
Hi @mattragoza, I installed LiGAN successfully. But when I run pytest tests, a lot of errors occur:
I don't know if it's a cuda issue. Thanks!