3dem / model-angelo

Automatic atomic model building program for cryo-EM maps
MIT License

error in running model-angelo. #33

Closed jiangq9992003 closed 1 year ago

jiangq9992003 commented 1 year ago

Hi, I encountered an error when running it on a Linux box with miniconda3 installed in my home directory. The error is shown below. Please help me fix it. Thanks.

2023-01-31 at 23:42:26 | ERROR | Error in ModelAngelo
Traceback (most recent call last):

  File "/home/qxjiang/miniconda3/envs/model_angelo/bin/model_angelo", line 33, in <module>
    sys.exit(load_entry_point('model-angelo==0.2.2', 'console_scripts', 'model_angelo')())
      <function importlib_load_entry_point at 0x2b5180891160>
      <module 'sys' (built-in)>

  File "/home/qxjiang/miniconda3/envs/model_angelo/lib/python3.9/site-packages/model_angelo-0.2.2-py3.9.egg/model_angelo/main.py", line 51, in main
    args.func(args)
      Namespace(volume_path='bCHGBmap.mrc', fasta_path='bCHGBseq.fasta', output_dir='output', mask_path=None, device='cuda:0', conf...
      <function main at 0x2b5226dff1f0>
      Namespace(volume_path='bCHGBmap.mrc', fasta_path='bCHGBseq.fasta', output_dir='output', mask_path=None, device='cuda:0', conf...

  File "/home/qxjiang/miniconda3/envs/model_angelo/lib/python3.9/site-packages/model_angelo-0.2.2-py3.9.egg/model_angelo/apps/build.py", line 225, in main
    gnn_output = gnn_infer(gnn_infer_args)
      {'num_rounds': 3, 'crop_length': 200, 'repeat_per_residue': 3, 'esm_model': 'esm1b_t33_650M_UR50S', 'aggressive_pruning': Tru...
      <function infer at 0x2b5226dff160>

  File "/home/qxjiang/miniconda3/envs/model_angelo/lib/python3.9/site-packages/model_angelo-0.2.2-py3.9.egg/model_angelo/gnn/inference.py", line 243, in infer
    protein = get_lm_embeddings_for_protein(lang_model, batch_converter, protein)
      Protein(atom_positions=None, atom14_positions=None, aatype=None, atom_mask=None, atom14_mask=None, residueindex=None, chain...
      <esm.data.BatchConverter object at 0x2b5229630ac0>
      ProteinBertModel(
        (embed_tokens): Embedding(33, 1280, padding_idx=1)
        (layers): ModuleList(
          (0): TransformerLayer(
      ...
      <function get_lm_embeddings_for_protein at 0x2b5225fe7e50>

  File "/home/qxjiang/miniconda3/envs/model_angelo/lib/python3.9/site-packages/model_angelo-0.2.2-py3.9.egg/model_angelo/data/generate_complete_prot_files.py", line 32, in get_lm_embeddings_for_protein
    [result[s]["representations"][33].cpu().numpy() for s in seq_names],
      ['0']
      {}

  File "/home/qxjiang/miniconda3/envs/model_angelo/lib/python3.9/site-packages/model_angelo-0.2.2-py3.9.egg/model_angelo/data/generate_complete_prot_files.py", line 32, in <listcomp>
    [result[s]["representations"][33].cpu().numpy() for s in seq_names],
      '0'
      '0'
      {}

KeyError: '0'

jamaliki commented 1 year ago

Hi,

There seems to be a problem with your FASTA file. It should look like this one: 8D1T (https://www.rcsb.org/fasta/entry/8D1T).

Best, Kiarash.

jiangq9992003 commented 1 year ago

Thanks. My FASTA file was in the same format and was used in Phenix without error. Does it need two identical chains for the homodimer in the map? Qiu-Xing

jiangq9992003 commented 1 year ago

I found the problem. The sequence needs to be in all upper-case characters. Qx
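For anyone hitting the same KeyError, here is a minimal sketch of the fix described above: it upper-cases the sequence lines of a FASTA file and leaves the header lines alone. It is not part of ModelAngelo; the function name `uppercase_fasta` and the output file name `bCHGBseq_upper.fasta` are hypothetical, and the input name is simply the one shown in the traceback.

```python
from pathlib import Path


def uppercase_fasta(text: str) -> str:
    """Return FASTA text with sequence lines upper-cased; header lines are untouched."""
    fixed = []
    for line in text.splitlines():
        if line.startswith(">"):
            fixed.append(line)          # keep the header exactly as written
        else:
            fixed.append(line.upper())  # residues: 'mktay' -> 'MKTAY'
    return "\n".join(fixed) + "\n"


if __name__ == "__main__":
    src = Path("bCHGBseq.fasta")         # input name taken from the traceback
    dst = Path("bCHGBseq_upper.fasta")   # hypothetical output name
    if src.exists():
        dst.write_text(uppercase_fasta(src.read_text()))
```

The converted file can then be passed as the FASTA input in place of the original.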

jamaliki commented 1 year ago

Great! Let me know if there are any other problems. Btw, you don't need to provide identical chains as separate entries.

/Kiarash
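As a rough illustration of the point about identical chains, the sketch below keeps only the first copy of each distinct sequence in a FASTA file, so a homodimer ends up with a single entry. The helper names `parse_fasta` and `drop_duplicate_chains` are hypothetical and are not ModelAngelo functions; the comparison ignores letter case, which is an assumption made for convenience.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line, []
        else:
            chunks.append(line)  # sequences may be wrapped over several lines
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def drop_duplicate_chains(text):
    """Keep the first entry for each distinct sequence (case-insensitive)."""
    seen = set()
    kept = []
    for header, seq in parse_fasta(text):
        key = seq.upper()
        if key in seen:
            continue  # identical chain already present, skip this copy
        seen.add(key)
        kept.append(f"{header}\n{seq}\n")
    return "".join(kept)
```

Calling `drop_duplicate_chains` on a two-entry homodimer file returns a single entry, matching the "one copy" input mentioned in this thread.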

jiangq9992003 commented 1 year ago

Thanks. I got it to run through with one copy. Qiu-Xing

jiangq9992003 commented 1 year ago

Hi Kiarash, in the output/ directory, output_raw.cif contains many more identified residues than output.cif. Do the output.cif results have higher confidence than those in output_raw.cif, or do the two files come from different stages of the analysis? Thanks. Qiu-Xing

jamaliki commented 1 year ago

The output_raw.cif file also contains residues that were not found in your sequence file, whereas the output.cif file only contains residues that could be aligned with the sequence file. If your FASTA file included all the sequences you have, then the residues that appear only in output_raw.cif are lower-confidence ones.