sokrypton / ColabDesign

Making Protein Design accessible to all via Google Colab!

Minimize sequence length while preserving some amount of structural stability. #120

Open seyonechithrananda opened 1 year ago

seyonechithrananda commented 1 year ago

Hey @sokrypton! Thanks for developing this package; it's been incredibly useful to work with and learn from.

I'm currently working on a project where we aim to minimize sequence length by introducing deletions in specific domains of the protein while preserving structural stability/binding. We have a large amount of enrichment data from large-scale deletion screens, so my original plan was to train a regression model on that data and use it to rank deletion variants (or to use ESM as a zero-shot variant effect predictor). Then I came across a lot of your conditional hallucination work! Do you have any pointers on how to penalize sequence length while preserving specific domains and the overall structure in a conditional hallucination/inpainting task?
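To make the ranking idea concrete, here is a rough sketch of the baseline I had in mind, not anything from ColabDesign itself. It enumerates contiguous deletions inside one domain and ranks them by a score minus a length penalty. All names here (`deletion_variants`, `rank_variants`, `toy_score`, `length_weight`) are hypothetical, and `toy_score` is only a placeholder for whatever scorer ends up being used (a trained regression model, an ESM-based zero-shot score, etc.).

```python
from typing import Callable, List, Tuple


def deletion_variants(seq: str, start: int, end: int,
                      max_len: int) -> List[Tuple[int, int, str]]:
    """Enumerate contiguous deletions of 1..max_len residues within seq[start:end].

    Returns (deletion_start, deletion_length, variant_sequence) tuples.
    """
    variants = []
    for i in range(start, end):
        for k in range(1, max_len + 1):
            if i + k > end:  # deletion would run past the domain boundary
                break
            variants.append((i, k, seq[:i] + seq[i + k:]))
    return variants


def rank_variants(seq: str, start: int, end: int, max_len: int,
                  score_fn: Callable[[str], float],
                  length_weight: float = 0.1) -> List[Tuple[float, int, int, str]]:
    """Rank variants by score_fn(variant) - length_weight * len(variant), best first."""
    scored = [
        (score_fn(v) - length_weight * len(v), i, k, v)
        for i, k, v in deletion_variants(seq, start, end, max_len)
    ]
    return sorted(scored, key=lambda t: t[0], reverse=True)


def toy_score(variant: str) -> float:
    """Placeholder scorer (assumption): fraction of leucines. Swap in a real model."""
    return variant.count("L") / len(variant)


if __name__ == "__main__":
    # Toy sequence; only positions 2..7 (0-based, end-exclusive 8) may be deleted.
    for objective, i, k, variant in rank_variants("MKLLVAGLSE", 2, 8, 3, toy_score)[:3]:
        print(f"{objective:.3f}  del@{i} len={k}  {variant}")
```

The length penalty here is just a linear term on the variant length; the open question is how to express something similar as a loss inside a hallucination/inpainting run.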