Infinite stop on very small targets

wang3702 commented 8 months ago

Hello, I recently tried a very small target. The map is here: https://purdue0-my.sharepoint.com/:u:/g/personal/wang3702_purdue_edu/EduOhi-uTwtFiuwnki9mq9kBr0BzYK1QknSzFx1fsJWRwQ?e=N3yJ3Z and the sequence is below.

>pdb|6lu1|I
MMGSYAASFLPWIFIPVVCWLMPTVVMGLLFLYIEGEA

model-angelo did not raise any errors; it just stopped in the 1st GNN refinement stage. I waited more than 3 hours, but it was still stuck there, and no error message was printed during the run. Could you please check what is going wrong? Here is the log from model-angelo:

2023-10-26 at 17:20:05 | INFO | ModelAngelo with args: {'volume_path': '/home/kihara/wang3702/turtle_scratch/model_angelo_test/deepmainmast_benchmark_singlechain/6lu1I_upsample.mrc', 'protein_fasta': '/home/kihara/wang3702/turtle_scratch/model_angelo_test/deepmainmast_benchmark_singlechain/6lu1I.fasta', 'rna_fasta': None, 'dna_fasta': None, 'output_dir': '/home/kihara/wang3702/turtle_scratch/model_angelo_test/deepmainmast_benchmark_singlechain/6lu1I', 'mask_path': None, 'device': '1', 'config_path': None, 'model_bundle_name': 'nucleotides', 'model_bundle_path': None, 'keep_intermediate_results': False, 'pipeline_control': '', 'func': <function main at 0x7f6d93a12d40>}
2023-10-26 at 17:20:05 | INFO | Initial C-alpha prediction with args: {'model_checkpoint': 'chkpt.torch', 'bfactor': 0, 'batch_size': 4, 'box_size': 64, 'stride': 16, 'dont_mask_input': True, 'threshold': 0.05, 'save_real_coordinates': False, 'save_cryo_em_grid': False, 'do_nucleotides': True, 'save_backbone_trace': False, 'save_output_grid': False, 'crop': 6, 'log_dir': '/apps/model_angelo/weights/hub/checkpoints/model_angelo_v1.0/nucleotides/c_alpha', 'map_path': '/home/kihara/wang3702/turtle_scratch/model_angelo_test/deepmainmast_benchmark_singlechain/6lu1I_upsample.mrc', 'output_path': '/home/kihara/wang3702/turtle_scratch/model_angelo_test/deepmainmast_benchmark_singlechain/6lu1I/see_alpha_output', 'mask_path': None, 'device': '1', 'auto_mask': False}
2023-10-26 at 17:20:05 | INFO | Using model file /apps/model_angelo/weights/hub/checkpoints/model_angelo_v1.0/nucleotides/c_alpha/model.py
2023-10-26 at 17:20:05 | INFO | Using checkpoint file /apps/model_angelo/weights/hub/checkpoints/model_angelo_v1.0/nucleotides/c_alpha/chkpt.torch
2023-10-26 at 17:20:06 | INFO | Input structure has shape: (86, 86, 86)
2023-10-26 at 17:20:06 | INFO | Running with these arguments:
2023-10-26 at 17:20:06 | INFO | {'model_checkpoint': 'chkpt.torch', 'bfactor': 0, 'batch_size': 4, 'box_size': 64, 'stride': 16, 'dont_mask_input': True, 'threshold': 0.05, 'save_real_coordinates': False, 'save_cryo_em_grid': False, 'do_nucleotides': True, 'save_backbone_trace': False, 'save_output_grid': False, 'crop': 6, 'log_dir': '/apps/model_angelo/weights/hub/checkpoints/model_angelo_v1.0/nucleotides/c_alpha', 'map_path': '/home/kihara/wang3702/turtle_scratch/model_angelo_test/deepmainmast_benchmark_singlechain/6lu1I_upsample.mrc', 'output_path': '/home/kihara/wang3702/turtle_scratch/model_angelo_test/deepmainmast_benchmark_singlechain/6lu1I/see_alpha_output', 'mask_path': None, 'device': '1', 'auto_mask': False}
2023-10-26 at 17:20:16 | INFO | Model prediction done, took 10.42 seconds for 8 sliding windows
2023-10-26 at 17:20:16 | INFO | Average time is 1302.313 ms
2023-10-26 at 17:20:16 | INFO | Starting Cα grid to points...
2023-10-26 at 17:20:16 | INFO | Have 347 Cα points before pruning and 62 after pruning
2023-10-26 at 17:20:16 | INFO | Starting P grid to points...
2023-10-26 at 17:20:16 | INFO | Have 2 P points before pruning and 0 after pruning
2023-10-26 at 17:20:16 | INFO | Finished inference!
2023-10-26 at 17:20:16 | INFO | GNN model refinement round 1 with args: {'num_rounds': 3, 'crop_length': 200, 'repeat_per_residue': 1, 'esm_model': 'esm1b_t33_650M_UR50S', 'aggressive_pruning': True, 'seq_attention_batch_size': 200, 'fp16': False, 'batch_size': 1, 'voxel_size': 1.0, 'map': '/home/kihara/wang3702/turtle_scratch/model_angelo_test/deepmainmast_benchmark_singlechain/6lu1I_upsample.mrc', 'protein_fasta': '/home/kihara/wang3702/turtle_scratch/model_angelo_test/deepmainmast_benchmark_singlechain/6lu1I.fasta', 'rna_fasta': None, 'dna_fasta': None, 'struct': '/home/kihara/wang3702/turtle_scratch/model_angelo_test/deepmainmast_benchmark_singlechain/6lu1I/see_alpha_output/see_alpha_merged_output.cif', 'output_dir': '/home/kihara/wang3702/turtle_scratch/model_angelo_test/deepmainmast_benchmark_singlechain/6lu1I/gnn_output_round_1', 'model_dir': '/apps/model_angelo/weights/hub/checkpoints/model_angelo_v1.0/nucleotides/gnn', 'device': '1', 'write_hmm_profiles': False, 'refine': False}
2023-10-26 at 17:20:16 | INFO | Loaded module from step: 483863

jamaliki commented 8 months ago

Hi @wang3702,

Thank you so much, this was a very interesting bug!

I have pushed a fix for it as part of release 1.0.9, and it now works on my machine. Please let me know whether this fixes it for you.

Best, Kiarash.

wang3702 commented 8 months ago

Thank you so much for the quick check! I will pull the new version and see if it works or not. Will let you know soon.

wang3702 commented 8 months ago

It works for this target but fails on an even smaller target, this time with an error message. The FASTA

>pdb|6lu1|M
MALTDTQVYVALVIALLPAVLAFRLSTELYK

and the map https://purdue0-my.sharepoint.com/:u:/g/personal/wang3702_purdue_edu/EXavPKdcnapIoGVsByhMLFwBp2v5Gk18nEoJ95Kyy2cdaQ?e=zrlGDe.

The error message is attached below:

2023-10-27 at 11:04:54 | INFO | ModelAngelo with args: {'volume_path': '/home/kihara/wang3702/turtle_scratch/model_angelo_test/deepmainmast_benchmark_singlechain/6lu1M_upsample.mrc', 'protein_fasta': '/home/kihara/wang3702/turtle_scratch/model_angelo_test/deepmainmast_benchmark_singlechain/6lu1M.fasta', 'rna_fasta': None, 'dna_fasta': None, 'output_dir': '/home/kihara/wang3702/turtle_scratch/model_angelo_test/deepmainmast_benchmark_singlechain/6lu1M', 'mask_path': None, 'device': '1', 'config_path': None, 'model_bundle_name': 'nucleotides', 'model_bundle_path': None, 'keep_intermediate_results': False, 'pipeline_control': '', 'func': <function main at 0x7fa83f856d40>}
2023-10-27 at 11:04:54 | INFO | Initial C-alpha prediction with args: {'model_checkpoint': 'chkpt.torch', 'bfactor': 0, 'batch_size': 4, 'box_size': 64, 'stride': 16, 'dont_mask_input': True, 'threshold': 0.05, 'save_real_coordinates': False, 'save_cryo_em_grid': False, 'do_nucleotides': True, 'save_backbone_trace': False, 'save_output_grid': False, 'crop': 6, 'log_dir': '/apps/model_angelo/weights/hub/checkpoints/model_angelo_v1.0/nucleotides/c_alpha', 'map_path': '/home/kihara/wang3702/turtle_scratch/model_angelo_test/deepmainmast_benchmark_singlechain/6lu1M_upsample.mrc', 'output_path': '/home/kihara/wang3702/turtle_scratch/model_angelo_test/deepmainmast_benchmark_singlechain/6lu1M/see_alpha_output', 'mask_path': None, 'device': '1', 'auto_mask': False}
2023-10-27 at 11:04:54 | INFO | Using model file /apps/model_angelo/weights/hub/checkpoints/model_angelo_v1.0/nucleotides/c_alpha/model.py
2023-10-27 at 11:04:54 | INFO | Using checkpoint file /apps/model_angelo/weights/hub/checkpoints/model_angelo_v1.0/nucleotides/c_alpha/chkpt.torch
2023-10-27 at 11:04:55 | INFO | Input structure has shape: (86, 86, 86)
2023-10-27 at 11:04:55 | INFO | Running with these arguments:
2023-10-27 at 11:04:55 | INFO | {'model_checkpoint': 'chkpt.torch', 'bfactor': 0, 'batch_size': 4, 'box_size': 64, 'stride': 16, 'dont_mask_input': True, 'threshold': 0.05, 'save_real_coordinates': False, 'save_cryo_em_grid': False, 'do_nucleotides': True, 'save_backbone_trace': False, 'save_output_grid': False, 'crop': 6, 'log_dir': '/apps/model_angelo/weights/hub/checkpoints/model_angelo_v1.0/nucleotides/c_alpha', 'map_path': '/home/kihara/wang3702/turtle_scratch/model_angelo_test/deepmainmast_benchmark_singlechain/6lu1M_upsample.mrc', 'output_path': '/home/kihara/wang3702/turtle_scratch/model_angelo_test/deepmainmast_benchmark_singlechain/6lu1M/see_alpha_output', 'mask_path': None, 'device': '1', 'auto_mask': False}
2023-10-27 at 11:05:05 | INFO | Model prediction done, took 10.45 seconds for 8 sliding windows
2023-10-27 at 11:05:05 | INFO | Average time is 1306.238 ms
2023-10-27 at 11:05:05 | INFO | Starting Cα grid to points...
2023-10-27 at 11:05:05 | INFO | Have 201 Cα points before pruning and 41 after pruning
2023-10-27 at 11:05:05 | INFO | Starting P grid to points...
2023-10-27 at 11:05:05 | INFO | Have 0 P points before pruning and 0 after pruning
2023-10-27 at 11:05:05 | INFO | Finished inference!
2023-10-27 at 11:05:05 | INFO | GNN model refinement round 1 with args: {'num_rounds': 3, 'crop_length': 200, 'repeat_per_residue': 1, 'esm_model': 'esm1b_t33_650M_UR50S', 'aggressive_pruning': True, 'seq_attention_batch_size': 200, 'fp16': False, 'batch_size': 1, 'voxel_size': 1.0, 'map': '/home/kihara/wang3702/turtle_scratch/model_angelo_test/deepmainmast_benchmark_singlechain/6lu1M_upsample.mrc', 'protein_fasta': '/home/kihara/wang3702/turtle_scratch/model_angelo_test/deepmainmast_benchmark_singlechain/6lu1M.fasta', 'rna_fasta': None, 'dna_fasta': None, 'struct': '/home/kihara/wang3702/turtle_scratch/model_angelo_test/deepmainmast_benchmark_singlechain/6lu1M/see_alpha_output/see_alpha_merged_output.cif', 'output_dir': '/home/kihara/wang3702/turtle_scratch/model_angelo_test/deepmainmast_benchmark_singlechain/6lu1M/gnn_output_round_1', 'model_dir': '/apps/model_angelo/weights/hub/checkpoints/model_angelo_v1.0/nucleotides/gnn', 'device': '1', 'write_hmm_profiles': False, 'refine': False}
2023-10-27 at 11:05:05 | INFO | Loaded module from step: 483863
2023-10-27 at 11:05:27 | ERROR | Error in ModelAngelo
Traceback (most recent call last):

    │    │    └ Namespace(volume_path='/home/kihara/wang3702/turtle_scratch/model_angelo_test/deepmainmast_benchmark_singlechain/6lu1M_upsamp...
    │    └ <function main at 0x7fa83f856d40>
    └ Namespace(volume_path='/home/kihara/wang3702/turtle_scratch/model_angelo_test/deepmainmast_benchmark_singlechain/6lu1M_upsamp...

> File "/net/kihara-turtle-scratch/wang3702/model_angelo_test/model-angelo/model_angelo/apps/build.py", line 242, in main
    gnn_output = gnn_infer(gnn_infer_args)
                 │         └ {'num_rounds': 3, 'crop_length': 200, 'repeat_per_residue': 1, 'esm_model': 'esm1b_t33_650M_UR50S', 'aggressive_pruning': Tru...
                 └ <function infer at 0x7fa83f989120>

  File "/net/kihara-turtle-scratch/wang3702/model_angelo_test/model-angelo/model_angelo/gnn/inference.py", line 132, in infer
    idxs = argmin_random(
           └ <function argmin_random at 0x7fa83f856710>

  File "/net/kihara-turtle-scratch/wang3702/model_angelo_test/model-angelo/model_angelo/utils/gnn_inference_utils.py", line 38, in argmin_random
    neighbour_counts = counts[neighbours].sum(dim=-1)
                       │      └ tensor([[ 0,  3,  1,  ..., 41, 41, 41],
                       │                [ 1,  0,  3,  ..., 41, 41, 41],
                       │                [ 2, 13,  6,  ..., 41, 41, 41],
                       │             ...
                       └ tensor([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
                                 0., 0., 0., 0...

IndexError: index 41 is out of bounds for dimension 0 with size 41
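The traceback above suggests one plausible reading of the failure: `neighbours` appears to be padded with the value 41, which equals the number of pruned Cα points, so `counts[neighbours]` indexes one slot past the end of a tensor of size 41. The following is a minimal, hypothetical sketch of that pattern using plain Python lists. It is not the ModelAngelo code and not the fix that was actually pushed; the helper `neighbour_count_sums` and the toy data are assumptions made for illustration only.

```python
def neighbour_count_sums(counts, neighbours, pad_index):
    """Sum counts over each row of neighbour indices.

    Hypothetical helper: pad_index marks "no neighbour" and is assumed to be
    len(counts), mirroring the value 41 seen in the traceback.
    """
    if pad_index != len(counts):
        raise ValueError("pad_index is assumed to equal len(counts)")
    # Append one zero-valued slot so the padding index becomes a valid lookup
    # that contributes nothing to the sum.
    padded = list(counts) + [0.0]
    return [sum(padded[j] for j in row) for row in neighbours]


# Toy data: 4 points, each row padded to 3 neighbours with the sentinel 4.
num_points = 4
counts = [1.0, 0.0, 2.0, 0.0]
neighbours = [[0, 1, 4], [1, 0, 4], [2, 3, 4], [3, 2, 4]]

# Indexing counts directly with the padded rows fails, as in the report.
naive_error = None
try:
    [sum(counts[j] for j in row) for row in neighbours]
except IndexError as exc:
    naive_error = str(exc)

# With the extra zero slot, the padded entries are harmless.
sums = neighbour_count_sums(counts, neighbours, pad_index=num_points)
```

The extra slot keeps the neighbour table rectangular, which is convenient when there are fewer points than the fixed neighbour count. Masking out the padded entries before indexing would be an equally reasonable alternative.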

jamaliki commented 8 months ago

Haha, you're really pushing the limits :)

Can you try the new change? It now works on my end.

wang3702 commented 8 months ago

Thank you so much! Will test soon.

wang3702 commented 8 months ago

It works, thank you so much for the great help!

3dem / model-angelo

Infinite stop on very small targets #83