3dem / model-angelo

Automatic atomic model building program for cryo-EM maps
MIT License

Build ends in error when using ModelAngelo 1.0.1 #66

Open Appoota opened 10 months ago

Appoota commented 10 months ago

I'm trying to generate a model using a map generated in cryoSPARC and a protein FASTA file. The process runs through the initial C-alpha prediction phase and then terminates, leaving behind an empty output directory. The log file shows an error.

Here are the contents of the log file:

2023-08-28 at 16:28:35 | INFO | ModelAngelo with args: {'volume_path': 'cryosparc_P5_J11_009_volume_map_sharp.mrc', 'protein_fasta': '1cgm.fasta', 'rna_fasta': None, 'dna_fasta': None, 'output_dir': 'output', 'mask_path': None, 'device': None, 'config_path': None, 'model_bundle_name': 'nucleotides', 'model_bundle_path': None, 'keep_intermediate_results': False, 'pipeline_control': False, 'func': <function main at 0x7f3634ad6710>}
2023-08-28 at 16:28:35 | INFO | Initial C-alpha prediction with args: {'model_checkpoint': 'chkpt.torch', 'bfactor': 0, 'batch_size': 4, 'box_size': 64, 'stride': 16, 'dont_mask_input': True, 'threshold': 0.05, 'save_real_coordinates': False, 'save_cryo_em_grid': False, 'do_nucleotides': True, 'save_backbone_trace': False, 'save_output_grid': False, 'crop': 6, 'log_dir': '/home/ppdlabnew/.cache/torch/hub/checkpoints/model_angelo_v1.0/nucleotides/c_alpha', 'map_path': 'cryosparc_P5_J11_009_volume_map_sharp.mrc', 'output_path': 'output/see_alpha_output', 'mask_path': None, 'device': None, 'auto_mask': False}
2023-08-28 at 16:28:35 | INFO | Using model file /home/ppdlabnew/.cache/torch/hub/checkpoints/model_angelo_v1.0/nucleotides/c_alpha/model.py
2023-08-28 at 16:28:35 | INFO | Using checkpoint file /home/ppdlabnew/.cache/torch/hub/checkpoints/model_angelo_v1.0/nucleotides/c_alpha/chkpt.torch
2023-08-28 at 16:28:37 | INFO | Input structure has shape: (230, 230, 230)
2023-08-28 at 16:28:37 | INFO | Running with these arguments:
2023-08-28 at 16:28:37 | INFO | {'model_checkpoint': 'chkpt.torch', 'bfactor': 0, 'batch_size': 4, 'box_size': 64, 'stride': 16, 'dont_mask_input': True, 'threshold': 0.05, 'save_real_coordinates': False, 'save_cryo_em_grid': False, 'do_nucleotides': True, 'save_backbone_trace': False, 'save_output_grid': False, 'crop': 6, 'log_dir': '/home/ppdlabnew/.cache/torch/hub/checkpoints/model_angelo_v1.0/nucleotides/c_alpha', 'map_path': 'cryosparc_P5_J11_009_volume_map_sharp.mrc', 'output_path': 'output/see_alpha_output', 'mask_path': None, 'device': None, 'auto_mask': False}
2023-08-28 at 16:35:35 | INFO | Model prediction done, took 417.64 seconds for 1331 sliding windows
2023-08-28 at 16:35:35 | INFO | Average time is 313.779 ms
2023-08-28 at 16:35:35 | INFO | Starting Cα grid to points...
2023-08-28 at 16:35:37 | INFO | Have 47875 Cα points before pruning and 30514 after pruning
2023-08-28 at 16:35:38 | ERROR | Error in ModelAngelo
Traceback (most recent call last):

  File "/home/ppdlabnew/anaconda3/envs/model_angelo/bin/model_angelo", line 33, in <module>
    sys.exit(load_entry_point('model-angelo==1.0.1', 'console_scripts', 'model_angelo')())
    │   │    └ <function importlib_load_entry_point at 0x7f3779d63d90>
    │   └ <built-in function exit>
    └ <module 'sys' (built-in)>
  File "/home/ppdlabnew/anaconda3/envs/model_angelo/lib/python3.10/site-packages/model_angelo-1.0.1-py3.10.egg/model_angelo/__main__.py", line 52, in main
    args.func(args)
    │    │    └ Namespace(volume_path='cryosparc_P5_J11_009_volume_map_sharp.mrc', protein_fasta='1cgm.fasta', rna_fasta=None, dna_fasta=None...
    │    └ <function main at 0x7f3634ad6710>
    └ Namespace(volume_path='cryosparc_P5_J11_009_volume_map_sharp.mrc', protein_fasta='1cgm.fasta', rna_fasta=None, dna_fasta=None...
> File "/home/ppdlabnew/anaconda3/envs/model_angelo/lib/python3.10/site-packages/model_angelo-1.0.1-py3.10.egg/model_angelo/apps/build.py", line 207, in main
    ca_cif_path = c_alpha_infer(ca_infer_args)
                  │             └ {'model_checkpoint': 'chkpt.torch', 'bfactor': 0, 'batch_size': 4, 'box_size': 64, 'stride': 16, 'dont_mask_input': True, 'th...
                  └ <function infer at 0x7f3634de05e0>
  File "/home/ppdlabnew/anaconda3/envs/model_angelo/lib/python3.10/site-packages/model_angelo-1.0.1-py3.10.egg/model_angelo/c_alpha/inference.py", line 286, in infer
    points_to_pdb(
    └ <function points_to_pdb at 0x7f3634dc3760>
  File "/home/ppdlabnew/anaconda3/envs/model_angelo/lib/python3.10/site-packages/model_angelo-1.0.1-py3.10.egg/model_angelo/utils/save_pdb_utils.py", line 95, in points_to_pdb
    save_structure_to_cif(struct, path_to_save)
    │                     │       └ 'output/see_alpha_output/output_ca_points_before_pruning.cif'
    │                     └ <Structure id=1>
    └ <function save_structure_to_cif at 0x7f3634dc36d0>
  File "/home/ppdlabnew/anaconda3/envs/model_angelo/lib/python3.10/site-packages/model_angelo-1.0.1-py3.10.egg/model_angelo/utils/save_pdb_utils.py", line 77, in save_structure_to_cif
    auth_seq_id = deepcopy(io.dic["_atom_site.label_seq_id"])
                  │        └ <Bio.PDB.mmcifio.MMCIFIO object at 0x7f36333a6230>
                  └ <function deepcopy at 0x7f3744f50c10>

AttributeError: 'MMCIFIO' object has no attribute 'dic'

Any help is immensely appreciated. Let me know if I should share any other information.
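
For long logs like the one above, the failing step can be found by scanning for the first ERROR record and the final exception line. The following is a minimal sketch, assuming the log keeps the "date at time | LEVEL | message" layout shown in this report; the function name summarize_log is hypothetical and is not part of ModelAngelo.

```python
import re

# Matches records such as "2023-08-28 at 16:35:38 | ERROR | Error in ModelAngelo".
LOG_LINE = re.compile(
    r"^(\d{4}-\d{2}-\d{2}) at (\d{2}:\d{2}:\d{2}) \| (\w+)\s*\| (.*)$"
)
# Matches the closing line of a Python traceback, e.g. "AttributeError: ...".
EXC_LINE = re.compile(r"^(\w+(?:Error|Exception)): (.*)$")


def summarize_log(text):
    """Return (first ERROR message, last exception line) found in the log text.

    Either element is None when nothing matching is present.
    """
    first_error = None
    last_exception = None
    for line in text.splitlines():
        record = LOG_LINE.match(line)
        if record and record.group(3) == "ERROR" and first_error is None:
            first_error = record.group(4)
        exc = EXC_LINE.match(line.strip())
        if exc:
            last_exception = f"{exc.group(1)}: {exc.group(2)}"
    return first_error, last_exception
```

Passing the text of the log file to summarize_log returns the message of the first ERROR record together with the exception that ended the traceback, which is usually the most useful pair to quote in a report.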

jamaliki commented 10 months ago

Hi,

I'm very sorry, this is due to a recent change which seems to have broken things. Could you try pulling the latest code and running python setup.py install in your model_angelo environment? If you need help with this, please let me know.

Best, Kiarash.
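
After reinstalling as suggested above, it can help to confirm which build the environment actually picks up. This is a minimal sketch using the standard library only; the distribution name "model-angelo" is taken from the traceback (load_entry_point('model-angelo==1.0.1', ...)), and the helper name installed_version is hypothetical. The thread does not say whether the fixed code reports a different version number, so this only shows what is installed, not whether the fix is present.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Run inside the model_angelo environment; prints None if nothing is installed.
print(installed_version("model-angelo"))
```

Running this with the Python interpreter of the model_angelo environment avoids checking a different installation by mistake.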

Appoota commented 9 months ago

Thanks for your prompt reply @jamaliki, sorry I couldn't get back to you earlier.

It did resolve the issue immediately and we were able to generate a 3D model.

However, I have a couple of (seemingly unrelated) questions:

Thanks again!

jamaliki commented 9 months ago

Hi @Appoota ,

The default colouring in ChimeraX is based on chains, I believe. So it is saying that these are separate chains. Does that make sense?

As for the RNA, when the resolution is not very high (numerically above about 3 Å, i.e. worse than 3 Å), RNA assignment becomes very difficult. That is why you might have fragmented outputs. There is not much to do in this case other than trying to improve the resolution.

Best, Kiarash.

Appoota commented 9 months ago

Hi @jamaliki ,

Thanks for answering. Yes, that clears up my doubt. And yes, you're right about the resolution. Our final resolution was 3.02 Å, which explains why the RNA is fragmented. Thank you again!