generatebio / chroma

A generative model for programmable protein design
Apache License 2.0

De-novo binder generation #14

Closed kushnarang closed 7 months ago

kushnarang commented 7 months ago

The supplement of the original Nature paper suggests that de novo binders could be generated as follows: "Combine (i) substructure conditioning on antigen, (ii) optional scaffold constraint on binder, and (iii) contact constraints on epitope/paratope."

From my current understanding of Chroma, I am unsure how to use an existing antigen as the template for a de novo binder. I've been able to generate redesigns of existing binders using code similar to #9, but de novo design is proving more of a challenge. In practice, which API(s) should one use to ask Chroma to design a brand-new chain in complex with an existing antigen?

As a second question, the "(iii) contact constraints" should ostensibly be implemented via the "Substructure distances conditioner" (p. 80, Supp. Table 6). My understanding of this module is that it would let the user give Chroma a pre-specified binding location, akin to hotspots in RFDiffusion. I recognize all the other conditioners from the source code, but I can't find the substructure distances module. Has it already been implemented and open-sourced, or does it exist only as a mathematical template until it is implemented in Python?

vuhongai commented 7 months ago

Hi, for me it was the opposite, in fact. De novo binder design can be done directly with the substructure conditioner. You just need to modify the PDB file a bit by adding a new chain (a pseudo-chain) with dummy coordinates. Here's the colab example.
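For readers who would rather edit the PDB file itself, the pseudo-chain idea can be sketched as plain text manipulation. This is a minimal sketch, not code from the colab example or from Chroma: `dummy_chain_lines` and `append_dummy_chain` are hypothetical helpers, the placeholder residue name `GLY` and the all-zero coordinates are arbitrary choices, and whether a given loader accepts such a chain is an assumption to verify. It only relies on the fixed-column layout of PDB `ATOM` records.

```python
# Hypothetical helpers (not part of Chroma): append a placeholder chain to PDB text.
# Backbone atom names are padded to the 4-character PDB atom-name field.
BACKBONE_ATOMS = [(" N  ", "N"), (" CA ", "C"), (" C  ", "C"), (" O  ", "O")]


def dummy_chain_lines(chain_id, n_residues, first_serial=1, res_name="GLY"):
    """Return ATOM records for a backbone-only chain with every atom at the origin."""
    x = y = z = 0.0  # dummy coordinates
    lines = []
    serial = first_serial
    for res_seq in range(1, n_residues + 1):
        for atom_name, element in BACKBONE_ATOMS:
            # Fixed-column ATOM record: serial (7-11), atom name (13-16),
            # residue name (18-20), chain (22), residue number (23-26),
            # x/y/z (31-54), occupancy (55-60), B-factor (61-66), element (77-78).
            lines.append(
                f"ATOM  {serial:5d} {atom_name} {res_name:>3s} {chain_id}"
                f"{res_seq:4d}    {x:8.3f}{y:8.3f}{z:8.3f}"
                f"{1.0:6.2f}{0.0:6.2f}          {element:>2s}"
            )
            serial += 1
    lines.append("TER")
    return lines


def append_dummy_chain(pdb_text, chain_id, n_residues):
    """Append a dummy chain after the existing records of a cleaned PDB file."""
    kept = [l for l in pdb_text.splitlines() if l.strip() and l.strip() != "END"]
    # Continue atom numbering after the last existing ATOM/HETATM record.
    last_serial = max(
        (int(l[6:11]) for l in kept if l.startswith(("ATOM", "HETATM"))),
        default=0,
    )
    new_lines = dummy_chain_lines(chain_id, n_residues, last_serial + 1)
    return "\n".join(kept + new_lines + ["END"]) + "\n"
```

Calling `append_dummy_chain(open("6wrw_A.pdb").read(), "B", 100)` would return the receptor records followed by a 100-residue placeholder chain B, which can then be written to a new file.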

kushnarang commented 7 months ago

Hi, thank you so much for the code example. Would you be able to share how you added a pseudo-chain to the PDB, or maybe share an example? I am a little new to the PDB format.

kushnarang commented 7 months ago

Also, have you found a way to specify particular "hotspots" on the receptor? Essentially, how can one tell Chroma which receptor residues it should diffuse the binder near?

Thank you so much!

vuhongai commented 7 months ago

> Would you be able to share how you added a pseudo-chain to the PDB, or maybe share an example? I am a little new to the PDB format.

The code is included in the colab example.

```python
import torch
from chroma import Protein

# input_dir and device are assumed to be defined earlier in the notebook.
# The input PDB should be cleaned beforehand; here it holds only chain A of PDB 6WRW.
protein = Protein(f"{input_dir}/6wrw_A.pdb", device=device)
X, C, S = protein.to_XCS()

len_binder = 100  # length of the binder to design

# Append a pseudo-chain: zeroed backbone coordinates, chain label 2, residue token 0.
X_new = torch.cat([X, torch.zeros(1, len_binder, 4, 3, device=X.device)], dim=1)
C_new = torch.cat([C, torch.full((1, len_binder), 2, device=C.device)], dim=1)
S_new = torch.cat([S, torch.full((1, len_binder), 0, device=S.device)], dim=1)
del X, C, S

protein = Protein(X_new, C_new, S_new, device=device)
X, C, S = protein.to_XCS()
```

Samuel-gwb commented 7 months ago

> Hi, for me it was the opposite, in fact. De novo binder design can be done directly with the substructure conditioner. You just need to modify the PDB file a bit by adding a new chain (a pseudo-chain) with dummy coordinates. Here's the colab example.

Thanks for sharing the binder design code. I ran into one problem when using it: the receptor sequence was changed in the final result. Any idea why? Thanks!

Samuel-gwb commented 7 months ago

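Problem solved. I modified the code as follows, so that each receptor position is restricted to its original residue while every binder position stays free:

```python
# L_complex, L_receptor, mask_aa and S are assumed to be defined by the surrounding colab code.
for i in range(L_complex):
    if i < L_receptor:
        # Receptor position: allow only the original residue type.
        mask_aa[i] = torch.Tensor([0] * 20)
        mask_aa[i][S[0][i].item()] = 1
    else:
        # Binder position: allow all 20 residue types.
        mask_aa[i] = torch.Tensor([1] * 20)
```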
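The masking rule can also be checked on its own, without Chroma or a GPU. The sketch below is a dependency-free restatement of the same logic using plain lists. `build_sequence_mask` and `receptor_tokens` are hypothetical names introduced here for illustration, and the code assumes residue types are encoded as integers from 0 to 19, as the indexing with `S[0][i].item()` implies.

```python
NUM_AA = 20  # number of residue types, matching the 20-wide rows in the thread


def build_sequence_mask(receptor_tokens, len_binder, num_aa=NUM_AA):
    """Build a per-position residue mask (1 = allowed, 0 = forbidden).

    receptor_tokens: integer residue types of the receptor, in chain order
                     (assumed to lie in 0..num_aa-1).
    len_binder:      number of freely designable binder positions appended
                     after the receptor.
    """
    mask = []
    for token in receptor_tokens:
        # Receptor position: one-hot row, so only the original residue is allowed.
        row = [0] * num_aa
        row[token] = 1
        mask.append(row)
    # Binder positions: every residue type is allowed.
    mask.extend([1] * num_aa for _ in range(len_binder))
    return mask
```

The result has one row per position of the complex and can be turned into a tensor with `torch.tensor(mask, dtype=torch.float32)` if a tensor is needed.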

A follow-up question: how can hotspot residues on the receptor be specified for binder design?

wujiewang commented 7 months ago

Thanks for all the discussion and feedback! I will close this issue for now. Feel free to reopen it or post a new issue if you feel the need.