wells-wood-research / timed-design

Protein Sequence Design with Deep Learning and Tooling like Monte Carlo Sampling and Analysis

[Feedback] Redesign Specific Residues only #78

Open universvm opened 8 months ago

universvm commented 8 months ago

I would love to be able to specify which residues to design, or even full regions. For example, if I have a structure, I want to be able to (re)design just a loop.

sunal1996 commented 6 months ago

Working on it at the moment. Managed to implement it for single structures and single chains, and there is a clear path for doing it for multiple structures. However, what if the user wants to specifically fix residues in chain B, C, D etc.? It appears to me that the chain information gets lost when we generate frame datasets, and TIMED puts everything into chain A. If this is correct, I think it gives us two paths:

a) The user has to accept that instead of specifying the 15th residue of chain B, they will have to specify residue X+15, where X is the length of chain A. Or, if they want to change the 3rd residue of chain D, they will have to specify residue X+Y+Z+3, where X, Y and Z are the lengths of chains A, B and C.

b) We will have to figure out a way to not lose this information in frame generation. Later, instead of TIMED outputting the sequence into just one chain (x_A, where x is the protein name and A is chain A), maybe it should give x_A, x_B, x_C separately for each chain.
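To make the offset arithmetic in option (a) concrete, here is a minimal sketch. It is not part of TIMED: `flat_index` and the `chain_lengths` mapping are hypothetical names, and it assumes chains are concatenated in the order given and that positions are 1-based.

```python
from typing import Dict


def flat_index(chain_lengths: Dict[str, int], chain: str, position: int) -> int:
    """Map a 1-based position inside `chain` to a 1-based position in the
    concatenated sequence (hypothetical helper, not a TIMED function).

    `chain_lengths` must list the chains in concatenation order; Python dicts
    preserve insertion order (3.7+), so a plain dict is enough here.
    """
    if chain not in chain_lengths:
        raise KeyError(f"Unknown chain: {chain!r}")
    if not 1 <= position <= chain_lengths[chain]:
        raise ValueError(f"Position {position} is outside chain {chain!r}")

    offset = 0
    for name, length in chain_lengths.items():
        if name == chain:
            return offset + position
        offset += length  # skip over every chain that comes before `chain`
    raise AssertionError("unreachable: chain membership was checked above")


# Example with made-up chain lengths: X=120, Y=80, Z=60, then chain D.
lengths = {"A": 120, "B": 80, "C": 60, "D": 40}
b15 = flat_index(lengths, "B", 15)  # X + 15
d3 = flat_index(lengths, "D", 3)    # X + Y + Z + 3
```

With these lengths, residue 15 of chain B maps to 135 and residue 3 of chain D maps to 263. A helper like this could hide the offset from the user, but it only works if the chain lengths are still known at that point.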

Thoughts? @universvm

universvm commented 6 months ago

@sunal1996 The output should include the chains, since we store them in the frames. This is likely a bug (#79). Once that's fixed, we should be able to use your implementation :)
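As a rough illustration of the per-chain output discussed in this thread (x_A, x_B, x_C), here is a sketch that splits one concatenated sequence back into chains. It is only an assumption about the desired output shape: `split_by_chain` is a hypothetical helper, not TIMED's API, and it assumes the per-chain lengths are available in concatenation order.

```python
from typing import Dict


def split_by_chain(sequence: str, chain_lengths: Dict[str, int], name: str) -> Dict[str, str]:
    """Split a concatenated sequence into one entry per chain, keyed as
    '<name>_<chain>' (hypothetical helper, not a TIMED function)."""
    if len(sequence) != sum(chain_lengths.values()):
        raise ValueError("Sequence length does not match the total chain length")

    per_chain = {}
    start = 0
    for chain, length in chain_lengths.items():
        per_chain[f"{name}_{chain}"] = sequence[start:start + length]
        start += length  # the next chain begins where this one ended
    return per_chain


# Toy example with a made-up 7-residue sequence over three chains.
chains = split_by_chain("MKVGSLE", {"A": 3, "B": 2, "C": 2}, "x")
```

For this toy input the result has the keys `x_A`, `x_B` and `x_C`, holding `MKV`, `GS` and `LE` respectively.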