Layne-Huang / PMDM


fail to run sample_for_pdb.py #10

Closed MachineGUN001 closed 5 months ago

MachineGUN001 commented 5 months ago

Hi Layne,

Thank you so much for providing such amazing work!

When I try to run sample_for_pdb.py, the error below occurs.

The command line I used:

!python -u sample_for_pdb.py \
    --ckpt ckpt/500.pt \
    --pdb_path protein/4yhj.pdb \
    --num_atom 50 \
    --num_samples 100 \
    --sampling_type generalized

the error info:

Entropy of n_nodes: H[N] -1.3862943649291992
Entropy of n_nodes: H[N] -3.543935775756836
{'type': 'diffusion', 'network': 'MDM_full_pocket_coor_shared', 'hidden_dim': 128, 'protein_hidden_dim': 128, 'num_convs': 3, 'num_convs_local': 3, 'protein_num_convs': 2, 'cutoff': 3.0, 'g_cutoff': 6.0, 'encoder_cutoff': 6.0, 'time_emb': True, 'atom_num_emb': False, 'mlp_act': 'relu', 'beta_schedule': 'sigmoid', 'beta_start': 1e-07, 'beta_end': 0.002, 'num_diffusion_timesteps': 1000, 'edge_order': 3, 'edge_encoder': 'mlp', 'smooth_conv': False, 'num_layer': 9, 'feats_dim': 5, 'soft_edge': True, 'norm_coors': True, 'm_dim': 128, 'context': 'None', 'vae_context': False, 'num_atom': 10, 'protein_feature_dim': 31}
sdf idr: protein\generate_ref
Entropy of n_nodes: H[N] -3.543935775756836
'module' is not recognized as an internal or external command,
operable program or batch file.
[2024-04-08 11:35:18,967::test::INFO] Namespace(pdb_path='protein/4yhj.pdb', sdf_path=None, num_atom=50, build_method='reconstruct', config=None, cuda=True, ckpt='ckpt/500.pt', save_traj=False, num_samples=100, batch_size=10, resume=None, tag='', clip=1000.0, n_steps=1000, global_start_sigma=inf, w_global_pos=1.0, w_local_pos=1.0, w_global_node=1.0, w_local_node=1.0, sampling_type='generalized', eta=1.0)
[2024-04-08 11:35:18,968::test::INFO] {'model': {'type': 'diffusion', 'network': 'MDM_full_pocket_coor_shared', 'hidden_dim': 128, 'protein_hidden_dim': 128, 'num_convs': 3, 'num_convs_local': 3, 'protein_num_convs': 2, 'cutoff': 3.0, 'g_cutoff': 6.0, 'encoder_cutoff': 6.0, 'time_emb': True, 'atom_num_emb': False, 'mlp_act': 'relu', 'beta_schedule': 'sigmoid', 'beta_start': 1e-07, 'beta_end': 0.002, 'num_diffusion_timesteps': 1000, 'edge_order': 3, 'edge_encoder': 'mlp', 'smooth_conv': False, 'num_layer': 9, 'feats_dim': 5, 'soft_edge': True, 'norm_coors': True, 'm_dim': 128, 'context': 'None', 'vae_context': False, 'num_atom': 10, 'protein_feature_dim': 31}, 'train': {'seed': 2021, 'batch_size': 16, 'val_freq': 250, 'max_iters': 500, 'max_grad_norm': 10.0, 'num_workers': 4, 'anneal_power': 2.0, 'optimizer': {'type': 'adam', 'lr': 0.001, 'weight_decay': 0.0, 'beta1': 0.95, 'beta2': 0.999}, 'scheduler': {'type': 'plateau', 'factor': 0.6, 'patience': 10, 'min_lr': 1e-06}, 'transform': {'mask': {'type': 'mixed', 'min_ratio': 0.0, 'max_ratio': 1.2, 'min_num_masked': 1, 'min_num_unmasked': 0, 'p_random': 0.5, 'p_bfs': 0.25, 'p_invbfs': 0.25}, 'contrastive': {'num_real': 50, 'num_fake': 50, 'pos_real_std': 0.05, 'pos_fake_std': 2.0}}}, 'dataset': {'name': 'crossdock', 'type': 'pl', 'path': './data/crossdocked_pocket10', 'split': './data/split_by_name.pt'}}
[2024-04-08 11:35:18,968::test::INFO] Loading crossdock data...
[2024-04-08 11:35:18,969::test::INFO] Loading data...
[2024-04-08 11:35:19,787::test::INFO] Building model...
[2024-04-08 11:35:19,788::test::INFO] MDM_full_pocket_coor_shared

  0%|          | 0/20 [00:00<?, ?it/s]
 25%|██▌       | 5/20 [00:00<00:00, 44.90it/s]
 55%|█████▌    | 11/20 [00:00<00:00, 49.18it/s]
 90%|█████████ | 18/20 [00:00<00:00, 54.99it/s]
100%|██████████| 20/20 [00:00<00:00, 53.82it/s]

  0%|          | 0/20 [00:00<?, ?it/s]
  0%|          | 0/20 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "e:\Cheminfo_Workshop\4_Fragment_Scaffold_Evolution\PMDM-main\sample_for_pdb.py", line 344, in <module>
    batch = Batch.from_data_list(datas, follow_batch=FOLLOW_BATCH).to(device)
  File "c:\Users\LSY\.conda\envs\diffhopp\lib\site-packages\torch_geometric\data\batch.py", line 76, in from_data_list
    batch, slice_dict, inc_dict = collate(
  File "c:\Users\LSY\.conda\envs\diffhopp\lib\site-packages\torch_geometric\data\collate.py", line 85, in collate
    value, slices, incs = _collate(attr, values, data_list, stores,
  File "c:\Users\LSY\.conda\envs\diffhopp\lib\site-packages\torch_geometric\data\collate.py", line 134, in _collate
    incs = get_incs(key, values, data_list, stores)
  File "c:\Users\LSY\.conda\envs\diffhopp\lib\site-packages\torch_geometric\data\collate.py", line 267, in get_incs
    repeats = [
  File "c:\Users\LSY\.conda\envs\diffhopp\lib\site-packages\torch_geometric\data\collate.py", line 268, in <listcomp>
    data.__inc__(key, value, store)
  File "e:\Cheminfo_Workshop\4_Fragment_Scaffold_Evolution\PMDM-main\utils\data.py", line 35, in __inc__
    if 'ligand_element' in self.keys():
TypeError: 'list' object is not callable

My environment:

OS: Windows 10
Python: 3.9
PyTorch version: 1.13.1+cu117
PyTorch Geometric version: 2.3.1
CUDA version: 11.7
CUDA available: True
Random PyTorch test tensor: tensor([0.7748])

Could you please suggest how to fix it?

By the way, is the generated molecule saved as an SDF file, and is it possible to define the path where the file is stored?

many thanks,

Best,

Layne-Huang commented 5 months ago

You could replace self.keys() with self.keys. Thanks!
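
For readers hitting the same TypeError: in PyTorch Geometric 2.3.x, `Data.keys` is a property that returns a list, while from 2.4.0 onward it is a method. That is why `self.keys()` fails on 2.3.1 and works after upgrading. Below is a minimal sketch of a check that tolerates both forms. The helper name `data_keys` and the two stand-in classes are hypothetical, used only for illustration; they are not part of PMDM or PyG.

```python
def data_keys(data):
    """Return the attribute names of a PyG-style data object as a list.

    Works whether `keys` is a property (a list) or a method.
    """
    keys = data.keys
    return list(keys() if callable(keys) else keys)


class OldStyleData:
    # Stand-in for PyG <= 2.3.x, where `keys` is a property.
    @property
    def keys(self):
        return ['ligand_element', 'protein_pos']


class NewStyleData:
    # Stand-in for PyG >= 2.4.0, where `keys` is a method.
    def keys(self):
        return ['ligand_element', 'protein_pos']


if __name__ == '__main__':
    for data in (OldStyleData(), NewStyleData()):
        print('ligand_element' in data_keys(data))
```

Under this assumption, the check in `utils/data.py` could read `if 'ligand_element' in data_keys(self):`, which avoids pinning the code to one PyG version.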

The generated molecule will be saved as an SDF file, and you can find the save path in the code.

MachineGUN001 commented 5 months ago

@Layne-Huang Thanks for your kind explanation.

Alternatively, I upgraded PyTorch Geometric to 2.4.0, and that solved the problem as well. Could this be a consequence of PyTorch Geometric version differences?

Layne-Huang commented 5 months ago

Yes, there are many incompatibilities between PyG versions after 2.0, so I also provide the EGNN code for PyG before and after 2.3.0.

MachineGUN001 commented 5 months ago

Many thanks again. I'll close the issue!

MachineGUN001 commented 5 months ago

@Layne-Huang

Sorry for another question.

According to the code that saves molecules, each newly generated molecule is saved as a separate SDF file. In the script, each time a new valid molecule is generated, the save_sdf function is called and a new filename is created for that molecule.

However, I have been running the script for about 20 hours without seeing a single SDF file. Is there a problem with the PDB protein file or the ligand SDF file I am using? Could you provide example files, including a protein pocket file (.pdb) and a ligand file (.sdf)?

Many thanks,

Layne-Huang commented 5 months ago

I have tested the code again. There should be no issue. Please check this code and try printing your save path:

if save_sdf_flag:
    print('save')
    gen_file_name = '{}_{}.sdf'.format(pdb_name, str(num_samples))
    print(gen_file_name)
    save_sdf(gmol, sdf_dir, gen_file_name)
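
A relative file name alone may not reveal where the file lands, since that depends on the working directory. The sketch below builds the file name the same way as the snippet above and resolves it to an absolute path. The helper name `resolve_sdf_path` and the example values are assumptions for illustration; only the `protein\generate_ref` directory appears in the log output.

```python
import os


def resolve_sdf_path(sdf_dir, pdb_name, num_samples):
    """Build the output file name as in the snippet above and make it absolute."""
    gen_file_name = '{}_{}.sdf'.format(pdb_name, str(num_samples))
    return os.path.abspath(os.path.join(sdf_dir, gen_file_name))


if __name__ == '__main__':
    # Example values only; '4yhj' and 0 are placeholders.
    path = resolve_sdf_path(os.path.join('protein', 'generate_ref'), '4yhj', 0)
    print('will save to:', path)
    print('directory exists:', os.path.isdir(os.path.dirname(path)))
```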

MachineGUN001 commented 5 months ago

Thank you for checking the code.

Here is some of the output after interrupting the script:

^C
'module' is not recognized as an internal or external command,
operable program or batch file.
[2024-04-08 15:34:44,452::test::INFO] Namespace(pdb_path='protein/pro_A_YG1.pdb', sdf_path=None, num_atom=50, build_method='reconstruct', config=None, cuda=True, ckpt='ckpt/500.pt', save_traj=False, num_samples=50, batch_size=3, resume=None, tag='', clip=1000.0, n_steps=1000, global_start_sigma=inf, w_global_pos=1.0, w_local_pos=1.0, w_global_node=1.0, w_local_node=1.0, sampling_type='generalized', eta=1.0)
[2024-04-08 15:34:44,453::test::INFO] {'model': {'type': 'diffusion', 'network': 'MDM_full_pocket_coor_shared', 'hidden_dim': 128, 'protein_hidden_dim': 128, 'num_convs': 3, 'num_convs_local': 3, 'protein_num_convs': 2, 'cutoff': 3.0, 'g_cutoff': 6.0, 'encoder_cutoff': 6.0, 'time_emb': True, 'atom_num_emb': False, 'mlp_act': 'relu', 'beta_schedule': 'sigmoid', 'beta_start': 1e-07, 'beta_end': 0.002, 'num_diffusion_timesteps': 1000, 'edge_order': 3, 'edge_encoder': 'mlp', 'smooth_conv': False, 'num_layer': 9, 'feats_dim': 5, 'soft_edge': True, 'norm_coors': True, 'm_dim': 128, 'context': 'None', 'vae_context': False, 'num_atom': 10, 'protein_feature_dim': 31}, 'train': {'seed': 2021, 'batch_size': 16, 'val_freq': 250, 'max_iters': 500, 'max_grad_norm': 10.0, 'num_workers': 4, 'anneal_power': 2.0, 'optimizer': {'type': 'adam', 'lr': 0.001, 'weight_decay': 0.0, 'beta1': 0.95, 'beta2': 0.999}, 'scheduler': {'type': 'plateau', 'factor': 0.6, 'patience': 10, 'min_lr': 1e-06}, 'transform': {'mask': {'type': 'mixed', 'min_ratio': 0.0, 'max_ratio': 1.2, 'min_num_masked': 1, 'min_num_unmasked': 0, 'p_random': 0.5, 'p_bfs': 0.25, 'p_invbfs': 0.25}, 'contrastive': {'num_real': 50, 'num_fake': 50, 'pos_real_std': 0.05, 'pos_fake_std': 2.0}}}, 'dataset': {'name': 'crossdock', 'type': 'pl', 'path': './data/crossdocked_pocket10', 'split': './data/split_by_name.pt'}}
[2024-04-08 15:34:44,453::test::INFO] Loading crossdock data...
[2024-04-08 15:34:44,455::test::INFO] Loading data...
[2024-04-08 15:34:45,288::test::INFO] Building model...
[2024-04-08 15:34:45,289::test::INFO] MDM_full_pocket_coor_shared

  0%|          | 0/33 [00:00<?, ?it/s]
 45%|████▌     | 15/33 [00:00<00:00, 145.18it/s]
100%|██████████| 33/33 [00:00<00:00, 164.19it/s]
100%|██████████| 33/33 [00:00<00:00, 161.31it/s]

  0%|          | 0/33 [00:00<?, ?it/s]

sample: 0it [00:00, ?it/s]

sample: 1it [00:16, 16.47s/it]

sample: 2it [00:30, 15.21s/it]

sample: 3it [00:45, 14.88s/it]

sample: 4it [00:59, 14.66s/it]

sample: 5it [01:14, 14.57s/it]

sample: 6it [01:28, 14.52s/it]

sample: 7it [01:42, 14.46s/it]

sample: 8it [01:57, 14.41s/it]

.....

sample: 279it [1:06:58, 14.44s/it]

sample: 280it [1:07:13, 14.44s/it]

sample: 281it [1:07:27, 14.44s/it]

sample: 282it [1:07:42, 14.46s/it]

sample: 283it [1:07:56, 14.47s/it]
Entropy of n_nodes: H[N] -1.3862943649291992
Entropy of n_nodes: H[N] -3.543935775756836
{'type': 'diffusion', 'network': 'MDM_full_pocket_coor_shared', 'hidden_dim': 128, 'protein_hidden_dim': 128, 'num_convs': 3, 'num_convs_local': 3, 'protein_num_convs': 2, 'cutoff': 3.0, 'g_cutoff': 6.0, 'encoder_cutoff': 6.0, 'time_emb': True, 'atom_num_emb': False, 'mlp_act': 'relu', 'beta_schedule': 'sigmoid', 'beta_start': 1e-07, 'beta_end': 0.002, 'num_diffusion_timesteps': 1000, 'edge_order': 3, 'edge_encoder': 'mlp', 'smooth_conv': False, 'num_layer': 9, 'feats_dim': 5, 'soft_edge': True, 'norm_coors': True, 'm_dim': 128, 'context': 'None', 'vae_context': False, 'num_atom': 10, 'protein_feature_dim': 31}
sdf idr: protein\generate_ref
Entropy of n_nodes: H[N] -3.543935775756836
1
Invalid,continue
Invalid,continue
Invalid,continue
1
Invalid,continue
Invalid,continue
Invalid,continue
1
Invalid,continue
Invalid,continue
Invalid,continue
1
Invalid,continue
Invalid,continue
Invalid,continue
1
Invalid,continue
Invalid,continue
Invalid,continue
1

It looks like not a single new molecule was generated, and no SDF file was written, after running for 22 hours.

I'm not sure whether the PDB file is suitable for sample_for_pdb.py. Could you please provide an example pocket file (.pdb) and ligand file (.sdf)? Many thanks,

Best,

Layne-Huang commented 5 months ago

Please use this as an example: https://drive.google.com/file/d/12IQ2Pqah7Kw5yJgUfK-4ojFXZTv2_CB4/view?usp=drive_link.

MachineGUN001 commented 5 months ago

I used the split_pocket_ligand.py script to split the protein 7l11.pdb. Two files were generated: 7l11cut20_ligand.pdb and 7l11cut20_pocket.pdb.

I then ran the command line below:

!python -u sample_for_pdb.py \
    --ckpt ckpt/500.pt \
    --pdb_path data/7l11cut20/7l11cut20_pocket.pdb \
    --num_atom 20 \
    --num_samples 10 \
    --batch_size 5 \
    --sampling_type generalized

The split_pocket_ligand.py script produces a pocket file with a 20 Å cutoff.

If I use this PDB file with the command line above, should that work?

Thanks a lot for your help. I'll check it further.

Best,

Layne-Huang commented 5 months ago

It should work, but 20 Å is still a large pocket. You could try a smaller cutoff, such as 10 Å or 6 Å.
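
To illustrate why the cutoff matters, here is a minimal sketch of distance-based pocket selection: it keeps the protein atoms that lie within a given distance of any ligand atom. This is an assumption about the general technique, not a description of how split_pocket_ligand.py works (real tools often keep whole residues). The function name `atoms_within_cutoff` and the toy coordinates are made up for illustration.

```python
import math


def atoms_within_cutoff(protein_atoms, ligand_atoms, cutoff):
    """Return protein atoms whose distance to any ligand atom is <= cutoff (in Å)."""
    return [
        p for p in protein_atoms
        if any(math.dist(p, l) <= cutoff for l in ligand_atoms)
    ]


if __name__ == '__main__':
    # Toy coordinates in Å; not taken from any real structure.
    ligand = [(0.0, 0.0, 0.0)]
    protein = [(3.0, 0.0, 0.0), (8.0, 0.0, 0.0), (15.0, 0.0, 0.0)]
    for cutoff in (6.0, 10.0, 20.0):
        print(cutoff, len(atoms_within_cutoff(protein, ligand, cutoff)))
```

A larger cutoff keeps more protein atoms, so the model has a larger pocket to process at every sampling step.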

MachineGUN001 commented 5 months ago

Got it! A smaller pocket should also take less time to run.

MachineGUN001 commented 5 months ago

@Layne-Huang

Sorry to bother you again with the same problem.

I ran the command line with the PDB file you provided:

!python -u sample_for_pdb.py \
    --ckpt ckpt/500.pt \
    --pdb_path data/8h6tcut6_pocket.pdb \
    --num_atom 50 \
    --num_samples 50 \
    --batch_size 8 \
    --sampling_type generalized

After running the script, however, no SDF files with newly generated molecules were produced.

Entropy of n_nodes: H[N] -1.3862943649291992
Entropy of n_nodes: H[N] -3.543935775756836
{'type': 'diffusion', 'network': 'MDM_full_pocket_coor_shared', 'hidden_dim': 128, 'protein_hidden_dim': 128, 'num_convs': 3, 'num_convs_local': 3, 'protein_num_convs': 2, 'cutoff': 3.0, 'g_cutoff': 6.0, 'encoder_cutoff': 6.0, 'time_emb': True, 'atom_num_emb': False, 'mlp_act': 'relu', 'beta_schedule': 'sigmoid', 'beta_start': 1e-07, 'beta_end': 0.002, 'num_diffusion_timesteps': 1000, 'edge_order': 3, 'edge_encoder': 'mlp', 'smooth_conv': False, 'num_layer': 9, 'feats_dim': 5, 'soft_edge': True, 'norm_coors': True, 'm_dim': 128, 'context': 'None', 'vae_context': False, 'num_atom': 10, 'protein_feature_dim': 31}
sdf idr: data\generate_ref
Entropy of n_nodes: H[N] -3.543935775756836
****
1
Invalid,continue
Invalid,continue
Invalid,continue
Invalid,continue
Invalid,continue
Invalid,continue
Invalid,continue
Invalid,continue
'module' is not recognized as an internal or external command,
operable program or batch file.
[2024-04-09 13:54:35,086::test::INFO] Namespace(batch_size=8, build_method='reconstruct', ckpt='ckpt/500.pt', clip=1000.0, config=None, cuda=True, eta=1.0, global_start_sigma=inf, n_steps=1000, num_atom=50, num_samples=50, pdb_path='data/8h6tcut6_pocket.pdb', resume=None, sampling_type='generalized', save_traj=False, sdf_path=None, tag='', w_global_node=1.0, w_global_pos=1.0, w_local_node=1.0, w_local_pos=1.0)
[2024-04-09 13:54:35,086::test::INFO] {'model': {'type': 'diffusion', 'network': 'MDM_full_pocket_coor_shared', 'hidden_dim': 128, 'protein_hidden_dim': 128, 'num_convs': 3, 'num_convs_local': 3, 'protein_num_convs': 2, 'cutoff': 3.0, 'g_cutoff': 6.0, 'encoder_cutoff': 6.0, 'time_emb': True, 'atom_num_emb': False, 'mlp_act': 'relu', 'beta_schedule': 'sigmoid', 'beta_start': 1e-07, 'beta_end': 0.002, 'num_diffusion_timesteps': 1000, 'edge_order': 3, 'edge_encoder': 'mlp', 'smooth_conv': False, 'num_layer': 9, 'feats_dim': 5, 'soft_edge': True, 'norm_coors': True, 'm_dim': 128, 'context': 'None', 'vae_context': False, 'num_atom': 10, 'protein_feature_dim': 31}, 'train': {'seed': 2021, 'batch_size': 16, 'val_freq': 250, 'max_iters': 500, 'max_grad_norm': 10.0, 'num_workers': 4, 'anneal_power': 2.0, 'optimizer': {'type': 'adam', 'lr': 0.001, 'weight_decay': 0.0, 'beta1': 0.95, 'beta2': 0.999}, 'scheduler': {'type': 'plateau', 'factor': 0.6, 'patience': 10, 'min_lr': 1e-06}, 'transform': {'mask': {'type': 'mixed', 'min_ratio': 0.0, 'max_ratio': 1.2, 'min_num_masked': 1, 'min_num_unmasked': 0, 'p_random': 0.5, 'p_bfs': 0.25, 'p_invbfs': 0.25}, 'contrastive': {'num_real': 50, 'num_fake': 50, 'pos_real_std': 0.05, 'pos_fake_std': 2.0}}}, 'dataset': {'name': 'crossdock', 'type': 'pl', 'path': './data/crossdocked_pocket10', 'split': './data/split_by_name.pt'}}
[2024-04-09 13:54:35,087::test::INFO] Loading crossdock data...
[2024-04-09 13:54:35,088::test::INFO] Loading data...
[2024-04-09 13:54:35,234::test::INFO] Building model...
[2024-04-09 13:54:35,235::test::INFO] MDM_full_pocket_coor_shared

  0%|          | 0/12 [00:00<?, ?it/s]
 50%|█████     | 6/12 [00:00<00:00, 57.45it/s]
100%|██████████| 12/12 [00:00<00:00, 59.16it/s]

  0%|          | 0/12 [00:00<?, ?it/s]

sample: 0it [00:00, ?it/s]

sample: 1it [00:00,  4.22it/s]

sample: 2it [00:00,  5.87it/s]

sample: 3it [00:00,  6.89it/s]

sample: 4it [00:00,  7.43it/s]

sample: 5it [00:00,  7.79it/s]

sample: 6it [00:00,  8.02it/s]
****
sample: 1000it [03:42,  5.16it/s]
sample: 1000it [03:42,  4.50it/s]
==============================
*** Open Babel Warning  in OpenBabel::OBMol::PerceiveBondOrders
  Failed to kekulize aromatic bonds in OBMol::PerceiveBondOrders

==============================
*** Open Babel Warning  in OpenBabel::OBMol::PerceiveBondOrders
  Failed to kekulize aromatic bonds in OBMol::PerceiveBondOrders

100%|██████████| 12/12 [43:10<00:00, 223.90s/it]
100%|██████████| 12/12 [43:10<00:00, 215.88s/it]
[2024-04-09 14:37:48,710::test::INFO] valid:0
[2024-04-09 14:37:48,710::test::INFO] stable:0

Could you please suggest how to fix it?

Many thanks,

Layne-Huang commented 5 months ago

Please decrease the size of the molecules. It will generate valid molecules if they have no more than 40 atoms.

MachineGUN001 commented 5 months ago

@Layne-Huang

I decreased --num_atom to 20 using the command line below:

!python -u sample_for_pdb.py \
    --ckpt ckpt/500.pt \
    --pdb_path data/8h6tcut6_pocket.pdb \
    --num_atom 20 \
    --num_samples 25 \
    --batch_size 10 \
    --sampling_type generalized

But the same error occurred.

Entropy of n_nodes: H[N] -1.3862943649291992
Entropy of n_nodes: H[N] -3.543935775756836
{'type': 'diffusion', 'network': 'MDM_full_pocket_coor_shared', 'hidden_dim': 128, 'protein_hidden_dim': 128, 'num_convs': 3, 'num_convs_local': 3, 'protein_num_convs': 2, 'cutoff': 3.0, 'g_cutoff': 6.0, 'encoder_cutoff': 6.0, 'time_emb': True, 'atom_num_emb': False, 'mlp_act': 'relu', 'beta_schedule': 'sigmoid', 'beta_start': 1e-07, 'beta_end': 0.002, 'num_diffusion_timesteps': 1000, 'edge_order': 3, 'edge_encoder': 'mlp', 'smooth_conv': False, 'num_layer': 9, 'feats_dim': 5, 'soft_edge': True, 'norm_coors': True, 'm_dim': 128, 'context': 'None', 'vae_context': False, 'num_atom': 10, 'protein_feature_dim': 31}
sdf idr: data\generate_ref
Entropy of n_nodes: H[N] -3.543935775756836
1
Invalid,continue
Invalid,continue
Invalid,continue
Invalid,continue
Invalid,continue
Invalid,continue
Invalid,continue
Invalid,continue
Invalid,continue
Invalid,continue
1
Invalid,continue
Invalid,continue
Invalid,continue
Invalid,continue
Invalid,continue
Invalid,continue
Invalid,continue
Invalid,continue
Invalid,continue
Invalid,continue
1
Invalid,continue
Invalid,continue
Invalid,continue
Invalid,continue
Invalid,continue
Invalid,continue
Invalid,continue
Invalid,continue
Invalid,continue
Invalid,continue
1
Invalid,continue
Invalid,continue
Invalid,continue
Invalid,continue
Invalid,continue
Invalid,continue
Invalid,continue
Invalid,continue
Invalid,continue
Invalid,continue
1
Invalid,continue
Invalid,continue
Invalid,continue
Invalid,continue
Invalid,continue
Invalid,continue
Invalid,continue
Invalid,continue
Invalid,continue
Invalid,continue
'module' is not recognized as an internal or external command,
operable program or batch file.
[2024-04-09 22:43:18,293::test::INFO] Namespace(batch_size=10, build_method='reconstruct', ckpt='ckpt/500.pt', clip=1000.0, config=None, cuda=True, eta=1.0, global_start_sigma=inf, n_steps=1000, num_atom=20, num_samples=25, pdb_path='data/7I11_pocket_8A.pdb', resume=None, sampling_type='generalized', save_traj=False, sdf_path=None, tag='', w_global_node=1.0, w_global_pos=1.0, w_local_node=1.0, w_local_pos=1.0)
[2024-04-09 22:43:18,294::test::INFO] {'model': {'type': 'diffusion', 'network': 'MDM_full_pocket_coor_shared', 'hidden_dim': 128, 'protein_hidden_dim': 128, 'num_convs': 3, 'num_convs_local': 3, 'protein_num_convs': 2, 'cutoff': 3.0, 'g_cutoff': 6.0, 'encoder_cutoff': 6.0, 'time_emb': True, 'atom_num_emb': False, 'mlp_act': 'relu', 'beta_schedule': 'sigmoid', 'beta_start': 1e-07, 'beta_end': 0.002, 'num_diffusion_timesteps': 1000, 'edge_order': 3, 'edge_encoder': 'mlp', 'smooth_conv': False, 'num_layer': 9, 'feats_dim': 5, 'soft_edge': True, 'norm_coors': True, 'm_dim': 128, 'context': 'None', 'vae_context': False, 'num_atom': 10, 'protein_feature_dim': 31}, 'train': {'seed': 2021, 'batch_size': 16, 'val_freq': 250, 'max_iters': 500, 'max_grad_norm': 10.0, 'num_workers': 4, 'anneal_power': 2.0, 'optimizer': {'type': 'adam', 'lr': 0.001, 'weight_decay': 0.0, 'beta1': 0.95, 'beta2': 0.999}, 'scheduler': {'type': 'plateau', 'factor': 0.6, 'patience': 10, 'min_lr': 1e-06}, 'transform': {'mask': {'type': 'mixed', 'min_ratio': 0.0, 'max_ratio': 1.2, 'min_num_masked': 1, 'min_num_unmasked': 0, 'p_random': 0.5, 'p_bfs': 0.25, 'p_invbfs': 0.25}, 'contrastive': {'num_real': 50, 'num_fake': 50, 'pos_real_std': 0.05, 'pos_fake_std': 2.0}}}, 'dataset': {'name': 'crossdock', 'type': 'pl', 'path': './data/crossdocked_pocket10', 'split': './data/split_by_name.pt'}}
[2024-04-09 22:43:18,294::test::INFO] Loading crossdock data...
[2024-04-09 22:43:18,295::test::INFO] Loading data...
[2024-04-09 22:43:18,467::test::INFO] Building model...
[2024-04-09 22:43:18,468::test::INFO] MDM_full_pocket_coor_shared

  0%|          | 0/5 [00:00<?, ?it/s]
100%|██████████| 5/5 [00:00<00:00, 147.06it/s]

  0%|          | 0/5 [00:00<?, ?it/s]

sample: 0it [00:00, ?it/s]

sample: 1it [00:00,  3.53it/s]

sample: 2it [00:00,  4.28it/s]

sample: 3it [00:00,  4.39it/s]

sample: 4it [00:00,  4.54it/s]

sample: 5it [00:01,  4.71it/s]

sample: 6it [00:01,  4.79it/s]

sample: 7it [00:01,  4.66it/s]

sample: 8it [00:01,  4.70it/s]

sample: 9it [00:01,  4.77it/s]

sample: 10it [00:02,  4.80it/s]

sample: 11it [00:02,  4.80it/s]

---
sample: 41it [00:08,  4.77it/s]

sample: 42it [00:09,  4.83it/s]

sample: 43it [00:09,  4.91it/s]
---

sample: 996it [08:36,  2.17it/s]

sample: 997it [08:37,  2.19it/s]

sample: 998it [08:37,  2.20it/s]

sample: 999it [08:38,  2.17it/s]

sample: 1000it [08:38,  2.17it/s]
sample: 1000it [08:38,  1.93it/s]
==============================
*** Open Babel Warning  in OpenBabel::OBMol::PerceiveBondOrders
  Failed to kekulize aromatic bonds in OBMol::PerceiveBondOrders

==============================
*** Open Babel Warning  in OpenBabel::OBMol::PerceiveBondOrders
  Failed to kekulize aromatic bonds in OBMol::PerceiveBondOrders

100%|██████████| 5/5 [40:19<00:00, 502.20s/it]
100%|██████████| 5/5 [40:19<00:00, 483.87s/it]
[2024-04-09 23:23:40,623::test::INFO] valid:0
[2024-04-09 23:23:40,624::test::INFO] stable:0

After running the script, no SDF file was generated in the generate_ref folder.

I'm not sure what the problem is. Many thanks for your help.

Layne-Huang commented 5 months ago

This is my command:

python -u sample_for_pdb.py --ckpt 500.pt --pdb_path data/8h6tcut6/8h6tcut6_pocket.pdb --num_atom 20 --num_samples 10 --sampling_type generalized

This is an example of the generated molecules: [image]

Please replace the code

except(RuntimeError, MolReconsError, TypeError, IndexError,
       OverflowError):  # MolReconsError,TypeError,IndexError,OverflowError
    print('Invalid,continue')

with

except (RuntimeError, MolReconsError, TypeError, IndexError, OverflowError) as e:
    # Needs "import traceback" at the top of the script.
    print('An error occurred:', str(e))
    traceback.print_exc()

to see what specific error you have met.
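
As a self-contained illustration of this debugging pattern, the sketch below wraps a failing step in a loop, prints the full traceback for each failure, and tallies the exception types so recurring causes stand out. `try_all` and `toy_build` are hypothetical stand-ins, not PMDM functions, and the PMDM-specific `MolReconsError` is left out so the example runs on its own.

```python
import traceback
from collections import Counter


def try_all(candidates, build):
    """Call build() on each candidate and report failures instead of hiding them."""
    results, failures = [], Counter()
    for item in candidates:
        try:
            results.append(build(item))
        except (RuntimeError, TypeError, IndexError, OverflowError) as e:
            failures[type(e).__name__] += 1
            print('An error occurred:', str(e))
            traceback.print_exc()  # full stack trace goes to stderr
    return results, failures


def toy_build(x):
    # Hypothetical stand-in for molecule reconstruction.
    if x < 0:
        raise RuntimeError('negative input')
    return [1, 2, 3][x]  # raises IndexError for x > 2


if __name__ == '__main__':
    ok, failed = try_all([0, 5, -1, 2], toy_build)
    print(ok, dict(failed))
```

If every sample fails with the same exception type, the tally points to a systematic cause (for example an environment or input-file problem) rather than to occasional invalid molecules.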

MachineGUN001 commented 5 months ago

@Layne-Huang Thank you so much for the suggestions.

Following your code, I revised the error-printing code in the script sample_batch.py.

After running it, the same problems occurred as before. For more details, please see the attached output file, outputs.txt.

I look forward to your help.

Many thanks,