ajasja / prosculpt

Protein design and sculpting using Rosetta and Deep learning methods (RFDiff and Alphafold2)
Using multiple af_mpnn cycles only keeps one design #20

Open zznidar opened 2 weeks ago

zznidar commented 2 weeks ago

Repro steps

  1. Create multiple designs with rfdiff, e.g. num_designs_rfdiff: 2, with multiple sequences per target, e.g. num_seq_per_target_mpnn: 2
  2. Set cycles to more than two, e.g. af2_mpnn_cycles: 3
  3. Run the script and observe intermediate and final results.

Observed result

During the 0th cycle, you get 2 subfolders in the 3_af2 directory: model_0 and model_1. After that, both models are copied to 2_1_cycle_directory. The copied files correspond to the two rfdiff designs (rf_0 and rf_1), each with two seq_per_target (itr_0 and itr_1).

During the 1st cycle, AF output is still as expected: multiple subfolders in the 3_af2 directory. However, copying the results into the 2_1_cycle_directory already produces unexpected results.

After additional cycles, only the last design is kept (all others are lost). There is only one subfolder in the 3_af2 directory, in this case model_1, and only one design (rf_1) is copied to the 2_1_cycle_directory.

Likewise, there is only one set of final_pdbs (two files starting with 1.1.1.1).

Expected result

Both designs should be kept through every cycle, and all of them should appear in the final_pdbs, just as they do if af2_mpnn_cycles is set to 1.

I talked to @FAOlivieri about it and we think my understanding/expectation is correct (final output should give num_designs_rfdiff * num_seq_per_target_mpnn pdbs, no matter how many af2_mpnn_cycles are executed). I will provide a simple fix on my branch, but before merging, since it causes a drastic change in the output, I'd like another confirmation that not deleting designs when cycling is indeed what Prosculpt is supposed to do.
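To make the expectation concrete, here is a minimal sketch of which designs should survive. It is not Prosculpt code: `expected_designs` is a hypothetical helper, and the `rf_{i}__itr_{j}` label format is only an illustration built from the rf_ and itr_ names mentioned above. The point is that the set of designs depends on num_designs_rfdiff and num_seq_per_target_mpnn only, never on af2_mpnn_cycles.

```python
from itertools import product


def expected_designs(num_designs_rfdiff: int, num_seq_per_target_mpnn: int) -> list[str]:
    """Return one illustrative label per (rfdiff design, MPNN sequence) pair.

    The label format is hypothetical; only the number of pairs matters here.
    """
    return [
        f"rf_{rf}__itr_{itr}"
        for rf, itr in product(range(num_designs_rfdiff), range(num_seq_per_target_mpnn))
    ]


# The expected designs are the same for any number of cycles,
# so af2_mpnn_cycles is deliberately not an argument of the helper.
for af2_mpnn_cycles in (1, 2, 3):
    designs = expected_designs(num_designs_rfdiff=2, num_seq_per_target_mpnn=2)
    print(f"cycles={af2_mpnn_cycles}: {len(designs)} designs -> {designs}")
```

With the repro settings (2 designs, 2 sequences per target) this lists four designs for every cycle count, whereas the observed behaviour keeps only the rf_1 entries once cycling continues.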

ajasja commented 1 week ago

Yes, I agree, doing num_designs_rfdiff: 2 with num_seq_per_target_mpnn: 3 (and a single af2 model) should result in 6 pdb models.
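A quick way to verify a fix could be to count the PDB files in the output directory after a run. The sketch below is an assumption-laden illustration, not part of Prosculpt: `count_pdbs` and `has_expected_count` are hypothetical helpers, the file names in the demo are placeholders, and it assumes a single AF2 model per design, as in the comment above.

```python
import tempfile
from pathlib import Path


def count_pdbs(directory: Path) -> int:
    """Count files with a .pdb suffix directly inside `directory`."""
    return sum(1 for path in directory.iterdir() if path.is_file() and path.suffix == ".pdb")


def has_expected_count(directory: Path, num_designs_rfdiff: int, num_seq_per_target_mpnn: int) -> bool:
    """Check that the directory holds one PDB per (design, sequence) pair.

    Assumes a single AF2 model per design (hypothetical simplification).
    """
    expected = num_designs_rfdiff * num_seq_per_target_mpnn
    return count_pdbs(directory) == expected


# Demo on a throwaway directory with six empty placeholder files;
# a real check would point at the run's final_pdbs directory instead.
with tempfile.TemporaryDirectory() as tmp:
    out_dir = Path(tmp)
    for i in range(6):
        (out_dir / f"placeholder_{i}.pdb").touch()
    print(has_expected_count(out_dir, num_designs_rfdiff=2, num_seq_per_target_mpnn=3))
```

Running the same check with af2_mpnn_cycles set to 1 and to 3 should give the same answer once the designs are no longer dropped between cycles.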