Hello @Layne-Huang,
I solved the error I mentioned before by uninstalling PyTorch and reinstalling a version that is compatible with my CUDA. I also downgraded pytorch_geometric.
Sampling molecules from given customized pockets ran smoothly. However, when I tried "Sample novel molecules given seed fragments", the output SMILES contains a "." within the string, so the generated PDF file does not show the generated fragment. The picture is attached:
All the .sdf files are exactly the same, even though the SMILES strings the algorithm generates are not. I'm not sure whether my input is wrong, so I pasted it here:
Thanks for your time!
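For context, a "." inside a SMILES string separates disconnected fragments, so such an output encodes two or more unconnected components. A minimal RDKit sketch (independent of PMDM, with a made-up example SMILES) that detects this and keeps the largest component:

```python
# Minimal sketch, not PMDM code: a "." in SMILES marks disconnected fragments.
from rdkit import Chem

smiles = "CCO.c1ccccc1"  # hypothetical output containing two disconnected fragments
mol = Chem.MolFromSmiles(smiles)
frags = Chem.GetMolFrags(mol, asMols=True)            # one Mol per connected component
largest = max(frags, key=lambda m: m.GetNumAtoms())   # keep the biggest fragment
print(len(frags), Chem.MolToSmiles(largest))          # 2 c1ccccc1
```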
Thanks for your feedback! We have already updated models/epsnet/MDM_pocket_coor_shared.py. Please use the newest version.
Thanks for the update, @Layne-Huang! I have one more question: can I run sample_frag multiple times to generate different molecules? Due to the limitations of my GPU, I can only generate 25 compounds at a time, but when I rerun the command it always gives me the exact same compounds. Is there any way to change the random seed or something else so that I can generate different compounds across several runs?
Hi, there are two methods to generate more molecules.
python -u sample_frag.py --ckpt <checkpoint> --pdb_path <pdb path> --mol_file <mole file> --keep_index <seed fragments index> --num_atom <num atom> --num_samples <number of samples> --sampling_type generalized --batch_size <batch_size>
Try to increase the value of num_samples and decrease the value of batch_size.
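For intuition only (a generic sketch, not the repository's actual sampling loop): num_samples is the total number of molecules requested, while batch_size caps how many are generated per pass, so a smaller batch_size keeps GPU memory bounded at the cost of more passes.

```python
# Illustrative sketch, not PMDM's actual code: split a total sampling budget
# into GPU-sized batches so memory stays bounded while num_samples can grow.
import math

def sample_in_batches(sample_fn, num_samples, batch_size):
    """Call sample_fn(n) with n <= batch_size until num_samples molecules are drawn."""
    results = []
    for _ in range(math.ceil(num_samples / batch_size)):
        n = min(batch_size, num_samples - len(results))
        results.extend(sample_fn(n))   # sample_fn is a hypothetical per-batch sampler
    return results

# e.g. 100 molecules with batch_size=25 means 4 passes through the model:
# mols = sample_in_batches(model_sampler, num_samples=100, batch_size=25)
```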
Thanks a lot for replying! For the random seed, I found this in the sample_frag.py file. Is this the place that I can change? Also, I would really appreciate it if you could briefly explain what batch_size means in the code; I'm just trying to understand the arguments.
Yes, you could change the seed. batch_size is the number of samples that the model calculates at the same time. You could refer to https://medium.com/data-science-365/all-you-need-to-know-about-batch-size-epochs-and-training-steps-in-a-neural-network-f592e12cdb0a
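For reference, seeding in PyTorch scripts typically looks like the sketch below (a generic example; the exact place the seed is set in sample_frag.py may differ). Changing the seed value between runs should yield different molecules.

```python
# Generic seeding sketch (the actual seed handling in sample_frag.py may differ):
# changing `seed` between runs changes the random numbers the sampler draws.
import random
import numpy as np
import torch

def seed_all(seed: int) -> None:
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    if torch.cuda.is_available():
        torch.cuda.manual_seed_all(seed)

seed_all(2021)  # e.g. the config's train.seed; pick a new value for each run
```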
Thanks for the help, I really appreciate it! I'll close this issue and hope you have a great day!
Hi,
After reading your paper in NC, I downloaded and tried your code. However, it fails with the error ValueError: too many values to unpack (expected 2). It seems that instead of returning two outputs, hidden_out and coors_out, the function returns a tuple with a length of 2880. My input is:
python -u sample_for_pdb.py --ckpt 500.pt --pdb_path 6M1I_Pocket_pdb.pdb --num_atom 30 --num_samples 10 --sampling_type generalized
The whole error message is:
h: 1: module: not found
Entropy of n_nodes: H[N] -1.3862943649291992
[2024-04-12 14:01:23,597::test::INFO] Namespace(pdb_path='6M1I_Pocket_pdb.pdb', sdf_path=None, num_atom=30, build_method='reconstruct', config=None, cuda=True, ckpt='500.pt', save_traj=False, num_samples=10, batch_size=10, resume=None, tag='', clip=1000.0, n_steps=1000, global_start_sigma=inf, w_global_pos=1.0, w_local_pos=1.0, w_global_node=1.0, w_local_node=1.0, sampling_type='generalized', eta=1.0)
[2024-04-12 14:01:23,597::test::INFO] {'model': {'type': 'diffusion', 'network': 'MDM_full_pocket_coor_shared', 'hidden_dim': 128, 'protein_hidden_dim': 128, 'num_convs': 3, 'num_convs_local': 3, 'protein_num_convs': 2, 'cutoff': 3.0, 'g_cutoff': 6.0, 'encoder_cutoff': 6.0, 'time_emb': True, 'atom_num_emb': False, 'mlp_act': 'relu', 'beta_schedule': 'sigmoid', 'beta_start': 1e-07, 'beta_end': 0.002, 'num_diffusion_timesteps': 1000, 'edge_order': 3, 'edge_encoder': 'mlp', 'smooth_conv': False, 'num_layer': 9, 'feats_dim': 5, 'soft_edge': True, 'norm_coors': True, 'm_dim': 128, 'context': 'None', 'vae_context': False, 'num_atom': 10, 'protein_feature_dim': 31}, 'train': {'seed': 2021, 'batch_size': 16, 'val_freq': 250, 'max_iters': 500, 'max_grad_norm': 10.0, 'num_workers': 4, 'anneal_power': 2.0, 'optimizer': {'type': 'adam', 'lr': 0.001, 'weight_decay': 0.0, 'beta1': 0.95, 'beta2': 0.999}, 'scheduler': {'type': 'plateau', 'factor': 0.6, 'patience': 10, 'min_lr': 1e-06}, 'transform': {'mask': {'type': 'mixed', 'min_ratio': 0.0, 'max_ratio': 1.2, 'min_num_masked': 1, 'min_num_unmasked': 0, 'p_random': 0.5, 'p_bfs': 0.25, 'p_invbfs': 0.25}, 'contrastive': {'num_real': 50, 'num_fake': 50, 'pos_real_std': 0.05, 'pos_fake_std': 2.0}}}, 'dataset': {'name': 'crossdock', 'type': 'pl', 'path': './data/crossdocked_pocket10', 'split': './data/split_by_name.pt'}}
[2024-04-12 14:01:23,597::test::INFO] Loading crossdock data...
Entropy of n_nodes: H[N] -3.543935775756836
[2024-04-12 14:01:23,597::test::INFO] Loading data...
[2024-04-12 14:01:23,615::test::INFO] Building model...
[2024-04-12 14:01:23,615::test::INFO] MDM_full_pocket_coor_shared {'type': 'diffusion', 'network': 'MDM_full_pocket_coor_shared', 'hidden_dim': 128, 'protein_hidden_dim': 128, 'num_convs': 3, 'num_convs_local': 3, 'protein_num_convs': 2, 'cutoff': 3.0, 'g_cutoff': 6.0, 'encoder_cutoff': 6.0, 'time_emb': True, 'atom_num_emb': False, 'mlp_act': 'relu', 'beta_schedule': 'sigmoid', 'beta_start': 1e-07, 'beta_end': 0.002, 'num_diffusion_timesteps': 1000, 'edge_order': 3, 'edge_encoder': 'mlp', 'smooth_conv': False, 'num_layer': 9, 'feats_dim': 5, 'soft_edge': True, 'norm_coors': True, 'm_dim': 128, 'context': 'None', 'vae_context': False, 'num_atom': 10, 'protein_feature_dim': 31}
sdf idr: generate_ref
Entropy of n_nodes: H[N] -3.543935775756836
100%|██████████| 2/2 [00:00<00:00, 314.14it/s]
0%| | 0/2 [00:00<?, ?it/s]1
/home/yang2531/Documents/Project/PMDM/models/common.py:485: UserWarning: torch.sparse.SparseTensor(indices, values, shape, *, device=) is deprecated. Please use torch.sparse_coo_tensor(indices, values, shape, dtype=, device=). (Triggered internally at ../torch/csrc/utils/tensor_new.cpp:623.)
  bgraph_adj = torch.sparse.LongTensor(
sample: 0it [00:00, ?it/s]
0%| | 0/2 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/home/yang2531/Documents/Project/PMDM/sample_for_pdb.py", line 350, in <module>
    pos_gen, pos_gen_traj, atom_type, atom_traj = model.langevin_dynamics_sample(
  File "/home/yang2531/Documents/Project/PMDM/models/epsnet/MDM_pocket_coor_shared.py", line 790, in langevin_dynamics_sample
    net_out = self.net(
  File "/home/yang2531/Documents/Project/PMDM/models/epsnet/MDM_pocket_coor_shared.py", line 478, in net
    node_attr_global, pos_attr_global = self.encoder_global(
  File "/home/yang2531/anaconda3/envs/mol/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/yang2531/anaconda3/envs/mol/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/yang2531/Documents/Project/PMDM/models/encoders/egnn.py", line 476, in forward
    x = layer(x, edge_index, edge_attr, batch=batch, ligand_batch=ligand_batch, size=bsize, linker_mask=linker_mask)
  File "/home/yang2531/anaconda3/envs/mol/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/yang2531/anaconda3/envs/mol/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/yang2531/Documents/Project/PMDM/models/encoders/egnn.py", line 215, in forward
    hidden_out, coors_out = self.propagate(edge_index, x=feats, edge_attr=edge_attr_feats,
ValueError: too many values to unpack (expected 2)
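For what it is worth, the traceback comes down to a tuple-unpacking mismatch: the left-hand side expects exactly two return values from self.propagate(...), but what comes back yields many more elements. A minimal, PMDM-independent illustration of the same error:

```python
# Minimal illustration (unrelated to PMDM) of the same failure mode:
# unpacking into two names only works when the right-hand side yields exactly 2 items.
import torch

def returns_pair():
    return torch.zeros(4, 8), torch.zeros(4, 3)      # (hidden, coords) unpacks fine

def returns_single_tensor():
    return torch.zeros(2880, 8)                      # one tensor with 2880 rows

hidden_out, coors_out = returns_pair()               # OK
try:
    hidden_out, coors_out = returns_single_tensor()  # iterates over 2880 rows
except ValueError as err:
    print(err)                                       # too many values to unpack (expected 2)
```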
Any help is hugely appreciated; your research is so fascinating that I really want to try to apply it!