Closed genepearl closed 4 months ago
Hi,
I assume you got this error because the assembly process failed. This usually means that the models were not diverse enough, so the assembly algorithm could not combine different interactions to join all 26 subunits.
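Note that the `FileNotFoundError` itself comes from the copy step, not from the assembly: the notebook cell copies an `assembled_results` folder that only exists when the assembly succeeds. A minimal sketch of a guard for that cell, assuming the folder names shown in the traceback (`copy_assembled_results` is a hypothetical helper, not part of CombFold):

```python
import os
import shutil


def copy_assembled_results(tmp_assembled_folder: str, assembled_folder: str) -> bool:
    """Copy assembled models only if the assembly step produced any.

    Returns True when results were copied and False when the source folder
    is missing, which is the situation after "Could not assemble, exiting".
    """
    src = os.path.join(tmp_assembled_folder, "assembled_results")
    if not os.path.isdir(src):
        print(f"No assembled results at {src}; the assembly step produced no models.")
        return False
    # dirs_exist_ok lets the cell be re-run without failing on an existing target.
    shutil.copytree(src, assembled_folder, dirs_exist_ok=True)
    return True
```

This does not make the assembly succeed; it only replaces the confusing traceback with a message that points at the real cause.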
A few questions that will help me better understand:
Could you share the log at
{output_path}/_unified_representation/assembly_output/output.log
so that I can better identify the issue?
Thank you for such a quick response.
This did not work and led to the following error:
--- Searching for subunits in supplied PDB files
found full A0 in A0_A0_98374_unrelaxed_rank_004_alphafold2_ptm_model_5_seed_000.pdb chain A
found full A0 in A0_A0_A0_cb42a_unrelaxed_rank_002_alphafold2_ptm_model_4_seed_000.pdb chain A
found full A0 in A0_A0_A0_cb42a_unrelaxed_rank_003_alphafold2_ptm_model_3_seed_000.pdb chain A
found full A0 in A0_A0_A0_cb42a_unrelaxed_rank_005_alphafold2_ptm_model_5_seed_000.pdb chain A
found full A0 in A0_A0_A0_cb42a_unrelaxed_rank_004_alphafold2_ptm_model_2_seed_000.pdb chain A
found full A0 in A4_be333_unrelaxed_rank_005_alphafold2_ptm_model_2_seed_000.pdb chain A
found full A0 in A0_A0_98374_unrelaxed_rank_003_alphafold2_ptm_model_4_seed_000.pdb chain A
found full A0 in A0_A0_A0_cb42a_unrelaxed_rank_001_alphafold2_ptm_model_1_seed_000.pdb chain A
found full A0 in A0_A0_98374_unrelaxed_rank_001_alphafold2_ptm_model_1_seed_000.pdb chain A
found full A0 in A4_be333_unrelaxed_rank_003_alphafold2_ptm_model_5_seed_000.pdb chain A
found full A0 in A0_A0_98374_unrelaxed_rank_005_alphafold2_ptm_model_2_seed_000.pdb chain A
found full A0 in A4_be333_unrelaxed_rank_002_alphafold2_ptm_model_1_seed_000.pdb chain A
found full A0 in A4_be333_unrelaxed_rank_001_alphafold2_ptm_model_3_seed_000.pdb chain A
found full A0 in A0_A0_98374_unrelaxed_rank_002_alphafold2_ptm_model_3_seed_000.pdb chain A
found full A0 in A4_be333_unrelaxed_rank_004_alphafold2_ptm_model_4_seed_000.pdb chain A
--- Extracting representative subunits (for each subunit, its best scored model in the PDBs folder)
rep A0 has plddt score 61.587373737373746
--- Extracting pairwise transformations between subunits (from each PDB file with 2 or more subunits)
- Extracting pairwise transformations from file /content/CombFold-master/custom/pdbs/A0_A0_98374_unrelaxed_rank_004_alphafold2_ptm_model_5_seed_000.pdb
- Extracting pairwise transformations from file /content/CombFold-master/custom/pdbs/A0_A0_A0_cb42a_unrelaxed_rank_002_alphafold2_ptm_model_4_seed_000.pdb
- Extracting pairwise transformations from file /content/CombFold-master/custom/pdbs/A0_A0_A0_cb42a_unrelaxed_rank_003_alphafold2_ptm_model_3_seed_000.pdb
- Extracting pairwise transformations from file /content/CombFold-master/custom/pdbs/A0_A0_A0_cb42a_unrelaxed_rank_005_alphafold2_ptm_model_5_seed_000.pdb
- Extracting pairwise transformations from file /content/CombFold-master/custom/pdbs/A0_A0_A0_cb42a_unrelaxed_rank_004_alphafold2_ptm_model_2_seed_000.pdb
- Extracting pairwise transformations from file /content/CombFold-master/custom/pdbs/A4_be333_unrelaxed_rank_005_alphafold2_ptm_model_2_seed_000.pdb
- Extracting pairwise transformations from file /content/CombFold-master/custom/pdbs/A0_A0_98374_unrelaxed_rank_003_alphafold2_ptm_model_4_seed_000.pdb
- Extracting pairwise transformations from file /content/CombFold-master/custom/pdbs/A0_A0_A0_cb42a_unrelaxed_rank_001_alphafold2_ptm_model_1_seed_000.pdb
- Extracting pairwise transformations from file /content/CombFold-master/custom/pdbs/A0_A0_98374_unrelaxed_rank_001_alphafold2_ptm_model_1_seed_000.pdb
- Extracting pairwise transformations from file /content/CombFold-master/custom/pdbs/A4_be333_unrelaxed_rank_003_alphafold2_ptm_model_5_seed_000.pdb
- Extracting pairwise transformations from file /content/CombFold-master/custom/pdbs/A0_A0_98374_unrelaxed_rank_005_alphafold2_ptm_model_2_seed_000.pdb
- Extracting pairwise transformations from file /content/CombFold-master/custom/pdbs/A4_be333_unrelaxed_rank_002_alphafold2_ptm_model_1_seed_000.pdb
- Extracting pairwise transformations from file /content/CombFold-master/custom/pdbs/A4_be333_unrelaxed_rank_001_alphafold2_ptm_model_3_seed_000.pdb
- Extracting pairwise transformations from file /content/CombFold-master/custom/pdbs/A0_A0_98374_unrelaxed_rank_002_alphafold2_ptm_model_3_seed_000.pdb
- Extracting pairwise transformations from file /content/CombFold-master/custom/pdbs/A4_be333_unrelaxed_rank_004_alphafold2_ptm_model_4_seed_000.pdb
--- Finished building unified representation
--- Running combinatorial assembly algorithm, may take a while
--- Finished combinatorial assembly, writing output models
Could not assemble, exiting
---------------------------------------------------------------------------
FileNotFoundError Traceback (most recent call last)
<ipython-input-8-c171dffdddcc> in <cell line: 40>()
38 max_results_number=int(max_results_number))
39
---> 40 shutil.copytree(os.path.join(tmp_assembled_folder, "assembled_results"),
41 assembled_folder)
42
/usr/lib/python3.10/shutil.py in copytree(src, dst, symlinks, ignore, copy_function, ignore_dangling_symlinks, dirs_exist_ok)
555 """
556 sys.audit("shutil.copytree", src, dst)
--> 557 with os.scandir(src) as itr:
558 entries = list(itr)
559 return _copytree(entries=entries, src=src, dst=dst, symlinks=symlinks,
FileNotFoundError: [Errno 2] No such file or directory: '/content/tmp_assembled/assembled_results'
Hi,
In your use case, it is much better to keep all subunits identical rather than forcing a heteromeric configuration. The heteromeric setup also will not work unless at least one model contains the complete altered subunit (including the additional G).
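As an illustration of keeping all subunits identical, here is a minimal sketch that builds a single-subunit definition covering 26 chains. It assumes the same JSON fields as the file in the question (`name`, `chain_names`, `start_res`, `sequence`); the helper name and the choice of chain letters A to Z are placeholders.

```python
import json
import string

# Unmodified sequence from the question (no trailing G or GG).
SEQUENCE = (
    "FTEEEIKKIRESLKLSVEALEVTPKDFEKALELLEEVAINLMEIFKDDPM"
    "KALKIAFKFTNAIAKLYVAHESKDVADAMAIMAEVTKYILEILEKVLEE"
)


def build_homomer_subunits(sequence: str, n_copies: int, name: str = "A0") -> dict:
    """Return a one-subunit definition whose chain_names list has n_copies entries."""
    if not 1 <= n_copies <= len(string.ascii_uppercase):
        raise ValueError("this sketch only supports 1 to 26 single-letter chain names")
    return {
        name: {
            "name": name,
            "chain_names": list(string.ascii_uppercase[:n_copies]),
            "start_res": 1,
            "sequence": sequence,
        }
    }


subunits = build_homomer_subunits(SEQUENCE, 26)
print(json.dumps(subunits, indent=2))
```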
First of all, note that the logs you provided are not the same as the log at
{output_path}/_unified_representation/assembly_output/output.log
so please supply it as well.
Another issue I can see in the logs is that subunit A0 appears only once in each model. For example, in the line:
found full A0 in A0_A0_98374_unrelaxed_rank_004_alphafold2_ptm_model_5_seed_000.pdb chain A
I would expect to see another line after that:
found full A0 in A0_A0_98374_unrelaxed_rank_004_alphafold2_ptm_model_5_seed_000.pdb chain B
Is there actually a chain named B (or something else other than A) that has an identical sequence to A0 in that model?
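One way to check this is to read the chain sequences directly from the PDB file. A minimal sketch, assuming a standard fixed-column PDB file with one CA atom per residue; `chain_sequences` and `chains_matching` are hypothetical helpers, not part of CombFold:

```python
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def chain_sequences(pdb_lines):
    """Return {chain_id: one-letter sequence} built from CA atoms of ATOM records."""
    sequences = {}
    seen = set()
    for line in pdb_lines:
        if not line.startswith("ATOM") or line[12:16].strip() != "CA":
            continue
        chain_id = line[21]
        residue_key = (chain_id, line[22:27])  # residue number plus insertion code
        if residue_key in seen:  # skip alternate locations of the same residue
            continue
        seen.add(residue_key)
        letter = THREE_TO_ONE.get(line[17:20], "X")  # X marks non-standard residues
        sequences[chain_id] = sequences.get(chain_id, "") + letter
    return sequences


def chains_matching(pdb_lines, sequence):
    """List the chain IDs whose sequence is exactly `sequence`."""
    return [c for c, s in chain_sequences(pdb_lines).items() if s == sequence]


# Example use (path and sequence are whatever applies to your run):
# with open("model.pdb") as fh:
#     print(chains_matching(fh, a0_sequence))
```

If only chain A is listed for a model that should contain two or three copies, the other chains either are missing or differ in sequence from A0.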
Another 2 tips that may yield better results:
Hi,
I appreciate the guidance you've provided. Following your advice, I've opted to move away from using forced heteromeric configurations in favor of homomeric configurations. Consequently, there's no longer a need to review the log at {output_path}/_unified_representation/assembly_output/output.log
I'm now focusing on exploring the application of your tool for homomeric complexes, specifically with the sequence "FTEEEIKKIRESLKLSVEALEVTPKDFEKALELLEEVAINLMEIFKDDPMKALKIAFKFTNAIAKLYVAHESKDVADAMAIMAEVTKYILEILEKVLEE". I'm interested in understanding the process for predicting structures that consist of 13 (or 26) copies of this sequence. Could you clarify whether using a single subunit is the optimal strategy? Moreover, how can diversity among the models be maintained when using only one type of subunit? A detailed explanation of these aspects would be greatly appreciated.
Well, this subunit is pretty small, so I think a better approach would be to run AFM (AlphaFold-Multimer) directly on either 13 or 26 copies of the subunit; it should be fairly accurate and shouldn't require many resources (you can probably even do this in Colab). For homomers with a small subunit, AFM is unlikely to accurately predict the dimer interaction that forms the symmetry when given only a subcomplex, so CombFold is less likely to work.
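If you run AFM through ColabFold, a multimer query is written as a single sequence line with the chains separated by `:`. A minimal sketch that builds such a query for a homomer, assuming that input convention; the helper names and the job name in the usage comment are placeholders:

```python
def homomer_query(sequence: str, n_copies: int) -> str:
    """Join n_copies of the same chain with ':' as the chain separator."""
    if n_copies < 1:
        raise ValueError("n_copies must be at least 1")
    return ":".join([sequence] * n_copies)


def to_fasta(job_name: str, query: str) -> str:
    """Wrap the query in a one-record FASTA string."""
    return f">{job_name}\n{query}\n"


# Example use: 13 copies of the subunit from the question.
# fasta_text = to_fasta("A0_13mer", homomer_query(subunit_sequence, 13))
```

The same string can be pasted into the query field of the ColabFold notebook instead of being written to a FASTA file.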
If you are still looking for an assembly-based approach, you can use tools like SymDock, which take a single copy of the subunit structure and the number of copies and find possible symmetric structures of that size.
Hope this helps!
Hi,
I hope this message finds you well. I'm currently working on predicting a complex that consists of 26 copies of a single subunit. Unfortunately, I've encountered a
FileNotFoundError: [Errno 2] No such file or directory: '/content/tmp_assembled/assembled_results'
error during the process. As a workaround, I tried to differentiate the subunits by appending an extra G to one and an extra GG to the other, hoping to bypass the issue. My JSON file ended up looking like this:
{
  "A0": {
    "name": "A0",
    "chain_names": ["A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M"],
    "start_res": 1,
    "sequence": "FTEEEIKKIRESLKLSVEALEVTPKDFEKALELLEEVAINLMEIFKDDPMKALKIAFKFTNAIAKLYVAHESKDVADAMAIMAEVTKYILEILEKVLEEG"
  },
  "G0": {
    "name": "G0",
    "chain_names": ["N", "O", "P", "Q", "R", "S", "T", "U", "V", "W", "X", "Y", "Z"],
    "start_res": 1,
    "sequence": "FTEEEIKKIRESLKLSVEALEVTPKDFEKALELLEEVAINLMEIFKDDPMKALKIAFKFTNAIAKLYVAHESKDVADAMAIMAEVTKYILEILEKVLEEGG"
  }
}
I generated PDB files for each of the subunits using AFM. However, this approach resulted in the same error.
Could you please provide guidance on how to resolve this? Any assistance would be greatly appreciated.
Thank you