dina-lab3D / CombFold

Apache License 2.0

Could not assemble, exiting #7

Open Anupam-5 opened 4 months ago

Anupam-5 commented 4 months ago

Hi, I hope this finds you well. I am working on a protein complex with 6 distinct subunits and 2 copies of each. I have created the subunits.json file in the prescribed format:

{
  "AE": {
    "name": "AE",
    "chain_names": [ "A", "E" ],
    "start_res": 1,
    "sequence": "KPHRYRPGTVALREIRRYQKSTELLIRKLPFQRLVREIAQDFKTDLRFQSSAVMALQEASEAYLVALFEDTNLCAIHAKRVTIMPKDIQLARRIRGER"
  },
  "BF": {
    "name": "BF",
    "chain_names": [ "B", "F" ],
    "start_res": 1,
    "sequence": "RHRKVLRDNIQGITKPAIRRLARRGGVKRISGLIYEETRGVLKVFLENVIRDAVTYTEHAKRKTVTAMDVVYALKRQGRTLYGFGG"
  },
  "CG": {
    "name": "CG",
    "chain_names": [ "C", "G" ],
    "start_res": 1,
    "sequence": "TRSSRAGLQFPVGRVHRLLRKGNYAERVGAGAPVYLAAVLEYLTAEILELAGNAARDNKKTRIIPRHLQLAVRNDEELNKLLGRVTIAQGGVLPNIQSVLLPKK"
  },
  "DH": {
    "name": "DH",
    "chain_names": [ "D", "H" ],
    "start_res": 1,
    "sequence": "KTRKESYAIYVYKVLKQVHPDTGISSKAMSIMNSFVNDVFERIAGEASRLAHYNKRSTITSREIQTAVRLLLPGELAKHAVSEGTKAVTKYTSAK"
  },
  "KM": {
    "name": "KM",
    "chain_names": [ "K", "M" ],
    "start_res": 1,
    "sequence": "TTRIKITELNPHLMCVLCGGYFIDATTIIECLHSFCKTCIVRYLETSKYCPICDVQVHKTRPLLNIRSDKTLQDIVYKLVPGLFKNEMKRRRYADAA"
  },
  "LN": {
    "name": "LN",
    "chain_names": [ "L", "N" ],
    "start_res": 1,
    "sequence": "KTWELSLYELQRTPQEAITDGLEIVVSPRSLHSELMCPICLDMLKNTMTTKECLHRFCADCIITALRSGNKECPTCRKKLVSKRSLRPDPNFDALISKIYPSGSGSRSALKRINKELSDLARDPPAQCSAGPVGDDMFHWQATIMGPNDSPYQGGVFFLTIHFPTDYPFKPPKVAFTTRIYHPNINSNGSICLDILRSQWSPALTISKVLLSICSLLCDPNPDDPLVPEIARIYKTDRDKYNRISREWTQKYAM"
  }
}
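A quick sanity check of a subunits.json like the one above can rule out simple input mistakes before looking at the assembly itself. The sketch below is not part of CombFold; `check_subunits` is a hypothetical helper, and the rules it applies (key equals `name`, chain names unique across subunits, only the 20 standard residue letters) are assumptions inferred from the file shown here, not a statement of what CombFold itself validates.

```python
# Hypothetical helper, not part of CombFold. The checks are assumptions
# inferred from the subunits.json shown in this thread.
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")


def check_subunits(subunits):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    seen_chains = {}  # chain name -> subunit key that first used it
    for key, sub in subunits.items():
        if sub.get("name") != key:
            problems.append(f"{key}: 'name' does not match its key")
        for chain in sub.get("chain_names", []):
            if chain in seen_chains:
                problems.append(
                    f"{key}: chain {chain} already used by {seen_chains[chain]}"
                )
            else:
                seen_chains[chain] = key
        sequence = sub.get("sequence", "")
        if not sequence:
            problems.append(f"{key}: empty sequence")
        bad = set(sequence) - VALID_RESIDUES
        if bad:
            problems.append(f"{key}: non-standard residues {sorted(bad)}")
    return problems
```

To use it, load the file with `json.load` and pass the resulting dictionary to `check_subunits`; an empty list means none of these particular checks failed.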

I then predicted structures for the paired FASTA sequences using AFM.

However, when I run the run_on_pdbs.py script, it fails with the following error:

--- Finished building unified representation
--- Running combinatorial assembly algorithm, may take a while
--- Finished combinatorial assembly, writing output models
Could not assemble, exiting

Could you please help me resolve this error? What are the possible solutions to get the assembly working?

ben-shor commented 4 months ago

Hi, please supply the file {output_path}/_unified_representation/assembly_output/output.log so that I can see the reason for the failure (and, optionally, the output of run_on_pdbs as well). Also, if you haven't run the optional groups prediction stage, running it may help.

Anupam-5 commented 4 months ago

Thank you for the quick response. Here are the output log file and the output of run_on_pdbs. I have not run the optional groups prediction stage. Thank you again.

output.log

--- Searching for subunits in supplied PDB files
found full BF in BF_CG_a0d77_unrelaxed_rank_002_alphafold2_multimer_v3_model_5_seed_000.pdb chain A
found full CG in BF_CG_a0d77_unrelaxed_rank_002_alphafold2_multimer_v3_model_5_seed_000.pdb chain B
found full AE in AE_DH_2c7eb_unrelaxed_rank_001_alphafold2_multimer_v3_model_5_seed_000.pdb chain A
found full DH in AE_DH_2c7eb_unrelaxed_rank_001_alphafold2_multimer_v3_model_5_seed_000.pdb chain B
found full AE in AE_CG_4e888_unrelaxed_rank_001_alphafold2_multimer_v3_model_4_seed_000.pdb chain A
found full CG in AE_CG_4e888_unrelaxed_rank_001_alphafold2_multimer_v3_model_4_seed_000.pdb chain B
found full DH in DH_LN_b27e6_unrelaxed_rank_002_alphafold2_multimer_v3_model_4_seed_000.pdb chain A
found full LN in DH_LN_b27e6_unrelaxed_rank_002_alphafold2_multimer_v3_model_4_seed_000.pdb chain B
found full CG in CG_CG_b7458_unrelaxed_rank_001_alphafold2_multimer_v3_model_1_seed_000.pdb chain A
found full CG in CG_CG_b7458_unrelaxed_rank_001_alphafold2_multimer_v3_model_1_seed_000.pdb chain B
found full CG in CG_KM_7c34d_unrelaxed_rank_002_alphafold2_multimer_v3_model_2_seed_000.pdb chain A
found full KM in CG_KM_7c34d_unrelaxed_rank_002_alphafold2_multimer_v3_model_2_seed_000.pdb chain B
found full CG in CG_LN_6b177_unrelaxed_rank_002_alphafold2_multimer_v3_model_5_seed_000.pdb chain A
found full LN in CG_LN_6b177_unrelaxed_rank_002_alphafold2_multimer_v3_model_5_seed_000.pdb chain B
found full CG in CG_CG_b7458_unrelaxed_rank_002_alphafold2_multimer_v3_model_2_seed_000.pdb chain A
found full CG in CG_CG_b7458_unrelaxed_rank_002_alphafold2_multimer_v3_model_2_seed_000.pdb chain B
found full KM in KM_LN_2eda9_unrelaxed_rank_002_alphafold2_multimer_v3_model_5_seed_000.pdb chain A
found full LN in KM_LN_2eda9_unrelaxed_rank_002_alphafold2_multimer_v3_model_5_seed_000.pdb chain B
found full CG in CG_DH_47f88_unrelaxed_rank_002_alphafold2_multimer_v3_model_1_seed_000.pdb chain A
found full DH in CG_DH_47f88_unrelaxed_rank_002_alphafold2_multimer_v3_model_1_seed_000.pdb chain B
found full BF in BF_DH_38db5_unrelaxed_rank_002_alphafold2_multimer_v3_model_4_seed_000.pdb chain A
found full DH in BF_DH_38db5_unrelaxed_rank_002_alphafold2_multimer_v3_model_4_seed_000.pdb chain B
found full AE in AE_CG_4e888_unrelaxed_rank_002_alphafold2_multimer_v3_model_5_seed_000.pdb chain A
found full CG in AE_CG_4e888_unrelaxed_rank_002_alphafold2_multimer_v3_model_5_seed_000.pdb chain B
found full BF in BF_CG_a0d77_unrelaxed_rank_001_alphafold2_multimer_v3_model_2_seed_000.pdb chain A
found full CG in BF_CG_a0d77_unrelaxed_rank_001_alphafold2_multimer_v3_model_2_seed_000.pdb chain B
found full DH in DH_KM_653a8_unrelaxed_rank_002_alphafold2_multimer_v3_model_2_seed_000.pdb chain A
found full KM in DH_KM_653a8_unrelaxed_rank_002_alphafold2_multimer_v3_model_2_seed_000.pdb chain B
found full DH in DH_LN_b27e6_unrelaxed_rank_001_alphafold2_multimer_v3_model_5_seed_000.pdb chain A
found full LN in DH_LN_b27e6_unrelaxed_rank_001_alphafold2_multimer_v3_model_5_seed_000.pdb chain B
found full AE in AEAE_63f12_unrelaxed_rank_002_alphafold2_multimer_v3_model_2_seed_000.pdb chain A
found full AE in AEAE_63f12_unrelaxed_rank_002_alphafold2_multimer_v3_model_2_seed_000.pdb chain B
found full BF in BF_KM_0d375_unrelaxed_rank_001_alphafold2_multimer_v3_model_1_seed_000.pdb chain A
found full KM in BF_KM_0d375_unrelaxed_rank_001_alphafold2_multimer_v3_model_1_seed_000.pdb chain B
found full AE in AE_LN_3ec6e_unrelaxed_rank_001_alphafold2_multimer_v3_model_4_seed_000.pdb chain A
found full LN in AE_LN_3ec6e_unrelaxed_rank_001_alphafold2_multimer_v3_model_4_seed_000.pdb chain B
found full LN in LN_LN_d6a2c_unrelaxed_rank_001_alphafold2_multimer_v3_model_1_seed_000.pdb chain A
found full LN in LN_LN_d6a2c_unrelaxed_rank_001_alphafold2_multimer_v3_model_1_seed_000.pdb chain B
found full BF in BF_KM_0d375_unrelaxed_rank_002_alphafold2_multimer_v3_model_3_seed_000.pdb chain A
found full KM in BF_KM_0d375_unrelaxed_rank_002_alphafold2_multimer_v3_model_3_seed_000.pdb chain B
found full AE in AE_DH_2c7eb_unrelaxed_rank_002_alphafold2_multimer_v3_model_2_seed_000.pdb chain A
found full DH in AE_DH_2c7eb_unrelaxed_rank_002_alphafold2_multimer_v3_model_2_seed_000.pdb chain B
found full KM in KM_LN_2eda9_unrelaxed_rank_001_alphafold2_multimer_v3_model_4_seed_000.pdb chain A
found full LN in KM_LN_2eda9_unrelaxed_rank_001_alphafold2_multimer_v3_model_4_seed_000.pdb chain B
found full DH in DH_DH_a5f22_unrelaxed_rank_002_alphafold2_multimer_v3_model_5_seed_000.pdb chain A
found full DH in DH_DH_a5f22_unrelaxed_rank_002_alphafold2_multimer_v3_model_5_seed_000.pdb chain B
found full CG in CG_DH_47f88_unrelaxed_rank_001_alphafold2_multimer_v3_model_5_seed_000.pdb chain A
found full DH in CG_DH_47f88_unrelaxed_rank_001_alphafold2_multimer_v3_model_5_seed_000.pdb chain B
found full CG in CG_KM_7c34d_unrelaxed_rank_001_alphafold2_multimer_v3_model_1_seed_000.pdb chain A
found full KM in CG_KM_7c34d_unrelaxed_rank_001_alphafold2_multimer_v3_model_1_seed_000.pdb chain B
found full BF in BF_LN_f1c85_unrelaxed_rank_002_alphafold2_multimer_v3_model_4_seed_000.pdb chain A
found full LN in BF_LN_f1c85_unrelaxed_rank_002_alphafold2_multimer_v3_model_4_seed_000.pdb chain B
found full BF in BF_BF_4c822_unrelaxed_rank_002_alphafold2_multimer_v3_model_5_seed_000.pdb chain A
found full BF in BF_BF_4c822_unrelaxed_rank_002_alphafold2_multimer_v3_model_5_seed_000.pdb chain B
found full AE in AE_KM_9faec_unrelaxed_rank_002_alphafold2_multimer_v3_model_1_seed_000.pdb chain A
found full KM in AE_KM_9faec_unrelaxed_rank_002_alphafold2_multimer_v3_model_1_seed_000.pdb chain B
found full AE in AE_BF_2dafd_unrelaxed_rank_002_alphafold2_multimer_v3_model_3_seed_000.pdb chain A
found full BF in AE_BF_2dafd_unrelaxed_rank_002_alphafold2_multimer_v3_model_3_seed_000.pdb chain B
found full LN in LN_LN_d6a2c_unrelaxed_rank_002_alphafold2_multimer_v3_model_3_seed_000.pdb chain A
found full LN in LN_LN_d6a2c_unrelaxed_rank_002_alphafold2_multimer_v3_model_3_seed_000.pdb chain B
found full AE in AE_BF_2dafd_unrelaxed_rank_001_alphafold2_multimer_v3_model_2_seed_000.pdb chain A
found full BF in AE_BF_2dafd_unrelaxed_rank_001_alphafold2_multimer_v3_model_2_seed_000.pdb chain B
found full KM in KM_KM_dec28_unrelaxed_rank_002_alphafold2_multimer_v3_model_4_seed_000.pdb chain A
found full KM in KM_KM_dec28_unrelaxed_rank_002_alphafold2_multimer_v3_model_4_seed_000.pdb chain B
found full AE in AE_KM_9faec_unrelaxed_rank_001_alphafold2_multimer_v3_model_3_seed_000.pdb chain A
found full KM in AE_KM_9faec_unrelaxed_rank_001_alphafold2_multimer_v3_model_3_seed_000.pdb chain B
found full BF in BF_DH_38db5_unrelaxed_rank_001_alphafold2_multimer_v3_model_5_seed_000.pdb chain A
found full DH in BF_DH_38db5_unrelaxed_rank_001_alphafold2_multimer_v3_model_5_seed_000.pdb chain B
found full BF in BF_LN_f1c85_unrelaxed_rank_001_alphafold2_multimer_v3_model_5_seed_000.pdb chain A
found full LN in BF_LN_f1c85_unrelaxed_rank_001_alphafold2_multimer_v3_model_5_seed_000.pdb chain B
found full DH in DH_DH_a5f22_unrelaxed_rank_001_alphafold2_multimer_v3_model_1_seed_000.pdb chain A
found full DH in DH_DH_a5f22_unrelaxed_rank_001_alphafold2_multimer_v3_model_1_seed_000.pdb chain B
found full DH in DH_KM_653a8_unrelaxed_rank_001_alphafold2_multimer_v3_model_4_seed_000.pdb chain A
found full KM in DH_KM_653a8_unrelaxed_rank_001_alphafold2_multimer_v3_model_4_seed_000.pdb chain B
found full AE in AEAE_63f12_unrelaxed_rank_001_alphafold2_multimer_v3_model_5_seed_000.pdb chain A
found full AE in AEAE_63f12_unrelaxed_rank_001_alphafold2_multimer_v3_model_5_seed_000.pdb chain B
found full AE in AE_LN_3ec6e_unrelaxed_rank_002_alphafold2_multimer_v3_model_3_seed_000.pdb chain A
found full LN in AE_LN_3ec6e_unrelaxed_rank_002_alphafold2_multimer_v3_model_3_seed_000.pdb chain B
found full BF in BF_BF_4c822_unrelaxed_rank_001_alphafold2_multimer_v3_model_1_seed_000.pdb chain A
found full BF in BF_BF_4c822_unrelaxed_rank_001_alphafold2_multimer_v3_model_1_seed_000.pdb chain B
found full CG in CG_LN_6b177_unrelaxed_rank_001_alphafold2_multimer_v3_model_4_seed_000.pdb chain A
found full LN in CG_LN_6b177_unrelaxed_rank_001_alphafold2_multimer_v3_model_4_seed_000.pdb chain B
found full KM in KM_KM_dec28_unrelaxed_rank_001_alphafold2_multimer_v3_model_2_seed_000.pdb chain A
found full KM in KM_KM_dec28_unrelaxed_rank_001_alphafold2_multimer_v3_model_2_seed_000.pdb chain B
--- Extracting representative subunits (for each subunit, its best scored model in the PDBs folder)
rep BF has plddt score 87.23639534883719
rep CG has plddt score 85.90913461538466
rep AE has plddt score 89.66724489795918
rep DH has plddt score 94.6384210526316
rep LN has plddt score 85.92444881889763
rep KM has plddt score 90.2383505154639
--- Extracting pairwise transformations between subunits (from each PDB file with 2 or more subunits)
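The "found full ... in ... chain ..." entries in the output above can be tallied to see how many models cover each pair of subunits. The sketch below is not part of CombFold; `models_per_pair` is a hypothetical helper, and the regular expression assumes the entry wording is exactly as it appears in this thread. It reads the subunit names from the log rather than from the file names, since names such as `AEAE_63f12_...` do not follow the `X_Y_...` pattern.

```python
import re
from collections import Counter, defaultdict

# Assumed entry format, copied from the output in this thread:
#   found full <subunit> in <file>.pdb chain <chain>
ENTRY = re.compile(r"found full (\S+) in\s+(\S+\.pdb) chain (\S+)")


def models_per_pair(log_text):
    """Count how many PDB models cover each unordered pair of subunits."""
    subunits_in_file = defaultdict(list)
    for subunit, pdb_file, _chain in ENTRY.findall(log_text):
        subunits_in_file[pdb_file].append(subunit)

    counts = Counter()
    for names in subunits_in_file.values():
        if len(names) == 2:  # only pairwise models are tallied here
            counts[tuple(sorted(names))] += 1
    return counts
```

Passing the saved output text to `models_per_pair` gives a `Counter` keyed by sorted name pairs such as `("BF", "CG")`, which makes it easy to see which pairs have only a few models.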

ben-shor commented 4 months ago

It indeed seems that the assembly failed after creating assemblies of size 5, as it was not able to continue adding subunits to them, probably because the correct interactions were not in any of the supplied AFM models. I would suggest trying to supply additional models for each pair (all 5 instead of 2) and even better - the optional stage to generate models for groups. This would provide more interactions for CombFold to use in the assembly.
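The suggestion to supply all 5 models per pair instead of 2 can be turned into a simple checklist. The sketch below is only an illustration and not part of CombFold: `pairs_needing_models` is a hypothetical helper, the per-pair counts are assumed to have been collected by hand or by some other script, and including homo-pairs (such as CG with CG) is an assumption based on each subunit having 2 copies here.

```python
from itertools import combinations_with_replacement


def pairs_needing_models(subunit_names, models_per_pair, wanted=5):
    """Map each unordered pair to the number of models still missing.

    models_per_pair: dict keyed by sorted (name, name) tuples, giving the
    number of models already available for that pair (hypothetical input).
    """
    shortfalls = {}
    # combinations_with_replacement also yields homo-pairs like ("CG", "CG").
    for pair in combinations_with_replacement(sorted(subunit_names), 2):
        have = models_per_pair.get(pair, 0)
        if have < wanted:
            shortfalls[pair] = wanted - have
    return shortfalls
```

With 6 subunit names this enumerates 21 pairs (15 hetero-pairs plus 6 homo-pairs), so any pair missing from the returned dictionary already has at least `wanted` models.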

Also, as this does not seem to be a very large complex (~1500 aa), you can try running the complete complex in AFM and looking at the confidence. It may be simpler. If the confidence is low, or you have another reason to believe it is not the correct structure, you can also provide this full structure as one of the PDBs to CombFold, which may result in better performance.
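The size estimate can be reproduced from subunits.json by summing each sequence length times its number of copies. This is a minimal sketch, assuming that the number of entries in `chain_names` equals the copy number of that subunit; `complex_size` is a hypothetical helper and not part of CombFold.

```python
def complex_size(subunits):
    """Total residue count: sequence length times number of chains, summed.

    Assumes len(chain_names) is the copy number of each subunit.
    """
    return sum(
        len(sub["sequence"]) * len(sub["chain_names"])
        for sub in subunits.values()
    )
```

Loading subunits.json with `json.load` and passing the result to `complex_size` gives the total number of residues, which can then be compared against what a single AFM run can handle on the available hardware.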