chaidiscovery / chai-lab

Chai-1, SOTA model for biomolecular structure prediction
https://www.chaidiscovery.com

how to use structural templates? #101

Open phbradley opened 1 month ago

phbradley commented 1 month ago

Hi there, I can't seem to find any examples that show how to create template features. We are trying to dock two proteins, one of which has an experimentally determined structure. The default inference script yields a poor prediction of that monomer's internal conformation, and a low-confidence dock. I was thinking that including template information for that monomer might improve the docking. Any little example (doesn't have to be extensive or perfectly documented) would be a big help. Thank you!

jackdent commented 1 month ago

Hi Philip--we haven't got round to releasing code for templates yet. We still need to add some logic to populate the TemplateContext.

phbradley commented 1 month ago

OK, thanks for the info. I will look forward to the release!

ParthBibekar commented 1 month ago

Hi @jackdent is there any update on this?

jackdent commented 1 month ago

Not yet. We haven't got round to it.

arogozhnikov commented 1 week ago

@phbradley examples for restraints were recently merged, and you can adapt those to condition folding on information about monomers

altaetran commented 6 days ago

@arogozhnikov Is this really the case though? The code seems to reference restraints only for inter-chain contacts for training, rather than intra-chain contacts, but maybe I am mistaken.

arogozhnikov commented 6 days ago

Just try with intra-chain restraints; according to external feedback, it works too.

altaetran commented 6 days ago

Thanks, I'll give it a try!

phbradley commented 4 days ago

Hi there, these restraints will definitely be useful, thank you! But I think with just a tiny bit of information I can fill out the TemplateContext myself (and others too). In particular, within TemplateContext,

template_distances[t,i,j] -- is this the Calpha-Calpha distance (or pseudo Cbeta-Cbeta?) between aligned rsds i and j in template t?

template_unit_vector[t,i,j] -- is this the unit vector from rsd i to rsd j? If so, is it defined between Calpha atoms? And is it in the global reference frame or in some kind of local reference frame defined by the residue frame of rsd i (or j)?

The other info seems mostly self-explanatory. Thanks for any info! Happy to share code back here if it works.

@jackdent @arogozhnikov

MattMcPartlon commented 4 days ago

> template_distances[t,i,j] -- is this the Calpha-Calpha distance (or pseudo Cbeta-Cbeta?) between aligned rsds i and j in template t?

pseudo CB

> template_unit_vector[t,i,j] -- is this the unit vector from rsd i to rsd j? If so, is it defined between Calpha atoms? And is it in the global reference frame or in some kind of local reference frame defined by the residue frame of rsd i (or j)?

We use the local reference frame, yes. If you need a quick way to produce these, see here:

rigids = Rigid.make_transform_from_reference(n_pos, ca_pos, c_pos)
template_vecs = rigids[..., None].invert_apply(rigids.get_trans())
template_unit_vecs = template_vecs/torch.norm(eps+template_vecs,dim=-1,keepdim=True)
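
For readers without OpenFold installed, here is a minimal NumPy sketch of the same idea: express the vector from residue i to residue j in the local backbone frame of residue i, then normalize it. It assumes the frame origin sits at CA and builds the axes by Gram-Schmidt on CA->C and CA->N. The function names, the axis convention, and the `eps` handling are illustrative assumptions and may not match OpenFold's `Rigid.make_transform_from_reference` axis-for-axis, so treat it as a way to understand the feature rather than a drop-in replacement.

```python
import numpy as np


def residue_frames(n_pos, ca_pos, c_pos, eps=1e-8):
    """Build one orthonormal frame per residue from backbone N, CA, C.

    Assumed convention (hypothetical, may differ from OpenFold's):
    e1 points along CA->C, e2 is the part of CA->N orthogonal to e1,
    and e3 = e1 x e2. Inputs have shape (L, 3).
    Returns rotations of shape (L, 3, 3) whose columns are e1, e2, e3.
    """
    v1 = c_pos - ca_pos
    v2 = n_pos - ca_pos
    e1 = v1 / (np.linalg.norm(v1, axis=-1, keepdims=True) + eps)
    # Remove the e1 component from v2 (Gram-Schmidt step).
    u2 = v2 - np.sum(e1 * v2, axis=-1, keepdims=True) * e1
    e2 = u2 / (np.linalg.norm(u2, axis=-1, keepdims=True) + eps)
    e3 = np.cross(e1, e2)
    return np.stack([e1, e2, e3], axis=-1)


def local_unit_vectors(n_pos, ca_pos, c_pos, eps=1e-8):
    """Unit vectors from residue i to residue j in the frame of residue i.

    Returns an array of shape (L, L, 3); diagonal entries are zero.
    """
    rots = residue_frames(n_pos, ca_pos, c_pos, eps)   # (L, 3, 3)
    # disp[i, j] = CA_j - CA_i in the global frame.
    disp = ca_pos[None, :, :] - ca_pos[:, None, :]     # (L, L, 3)
    # local[i, j] = R_i^T @ disp[i, j]: project onto residue i's axes.
    local = np.einsum("iab,ija->ijb", rots, disp)
    norm = np.linalg.norm(local, axis=-1, keepdims=True)
    return local / (norm + eps)
```

Because each vector is expressed in a residue-local frame, the result does not change when the whole template is rotated or translated, which is the practical reason to prefer a local frame over the global one.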

phbradley commented 3 days ago

Awesome, thanks so much @MattMcPartlon! One gotcha: there was a bug in that OpenFold function (make_transform_from_reference) until this commit:

https://github.com/aqlaboratory/openfold/commit/4bd1b4d548dcf43dc40c1cc2ee4af0b3678b370a#diff-4bb8b92ccd6ab38684f733efb68c443fc117bfd99bfb0cc4fe8633c53506c308

For anyone else doing this, see Matt's code above for making the template unit vectors. The template_distances are between pseudo-CB atoms (CA for glycine, CB for everything else; see openfold/utils/feats.py:pseudo_beta_fn).
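
The pseudo-CB rule described above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the chai-lab or OpenFold implementation: the function name and argument layout are hypothetical, it handles a single template (no leading `t` dimension), it does no masking of missing atoms, and whether TemplateContext expects raw or binned distances is not stated in this thread.

```python
import numpy as np


def pseudo_beta_distances(ca_pos, cb_pos, is_gly):
    """Pairwise pseudo-CB distances for one template.

    ca_pos, cb_pos: (L, 3) coordinates in angstroms.
    is_gly: (L,) boolean mask, True where the residue is glycine.
    Returns an (L, L) matrix of distances between pseudo-CB atoms.
    """
    # Pseudo-CB: CA for glycine (which has no CB), CB for everything else.
    pseudo_cb = np.where(is_gly[:, None], ca_pos, cb_pos)
    diff = pseudo_cb[:, None, :] - pseudo_cb[None, :, :]
    return np.linalg.norm(diff, axis=-1)
```

For multiple templates, the same function could be applied per template and the results stacked along a new leading axis.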