sokrypton / ColabDesign

Making Protein Design accessible to all via Google Colab!

Can we see generated sequences other than final one? #123

Open Hyunjoo-ibs opened 1 year ago

Hyunjoo-ibs commented 1 year ago

Hi, thank you for your great work! Is there any way to see the sequences that are generated during each run? I am currently running AFdesign in Colab (link below): https://colab.research.google.com/github/sokrypton/ColabDesign/blob/main/af/examples/afdesign_hotspot_test.ipynb and I found that the final output is not always the best one. Is it possible to get the sequence and PDB for each of the models shown in the image below? model.get_seqs() only gives the final sequence ;(

image

Please help me! Thank you!!

gieses commented 1 year ago

A somewhat clumsy way to get the sequences and a summary DataFrame could be:

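import pandas as pd
from colabdesign.af.alphafold.common import residue_constants

# map residue index -> one-letter amino-acid code
order_aa = {i: aa for aa, i in residue_constants.restype_order.items()}

# processing / model fitting
model = ...  # set up and run the design model here (elided)

# collect results
log_df = pd.DataFrame(model._tmp["log"])
trajectory = pd.DataFrame(model._tmp["traj"])

# decode the first sequence of each saved step (argmax over the amino-acid axis)
seqs_argmax = []
seqs_aa = []
for seq_enc in trajectory["seq"]:
    seqs_argmax.append(seq_enc.argmax(-1)[0])
    seqs_aa.append("".join(order_aa[a] for a in seqs_argmax[-1]))
log_df["sequences"] = seqs_aa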

I don't know how robust this is if num_recycles, num_models, or the hard/soft iteration settings are changed.
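
To check the decoding step on its own, without a ColabDesign run, here is a minimal sketch on toy data. It assumes each saved trajectory entry is an array of shape (num_seqs, length, 20) and that the amino-acid axis follows the AlphaFold restypes order; decode_trajectory and ALPHABET are hypothetical names for illustration, not part of ColabDesign. Unlike the snippet above, it keeps every sequence in a step instead of only the first one.

```python
import numpy as np

# Assumption: 20 standard residues in AlphaFold's residue_constants.restypes order.
ALPHABET = "ARNDCQEGHILKMFPSTWYV"

def decode_trajectory(traj_seqs, alphabet=ALPHABET):
    """Decode per-step arrays of shape (num_seqs, length, 20) into strings.

    Returns one record per (step, sequence) pair, so steps that hold more
    than one sequence are not silently truncated to the first one.
    """
    records = []
    for step, seq_enc in enumerate(traj_seqs):
        # argmax over the last (amino-acid) axis -> (num_seqs, length)
        idx = np.asarray(seq_enc).argmax(-1)
        for n, row in enumerate(idx):
            records.append({
                "step": step,
                "seq_index": n,
                "sequence": "".join(alphabet[i] for i in row),
            })
    return records

# Toy trajectory: 3 steps, 1 sequence per step, 8 residues, random logits.
rng = np.random.default_rng(0)
toy_traj = [rng.normal(size=(1, 8, 20)) for _ in range(3)]
toy_records = decode_trajectory(toy_traj)

# Deterministic check: a one-hot encoding of indices 0, 4, 3 (A, C, D).
one_hot = np.eye(20)[[0, 4, 3]][None]  # shape (1, 3, 20)
check = decode_trajectory([one_hot])
```

The list of records can be passed straight to pd.DataFrame. Whether it lines up row for row with the log depends on how many steps were actually saved, so it is worth comparing the two lengths before merging them.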