Closed: avilella closed this issue 2 years ago
That seems like a totally reasonable thing to do; it's just a matter of setting it up right. See the subsection on protein complexes in Hsu et al. 2022 for one way to do this: concatenate the chains with 10 mask tokens between them. Other creative approaches may be possible, so feel free to share and discuss them in the Discussions tab of this repo! Also note that the model would be predicting in a regime it has not been trained in; see the comment in Table 4 of the paper and the associated section.
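As a rough illustration of the concatenation idea, here is a minimal sketch in plain Python. It is not the repo's own code: the `<mask>` token string, the helper names `concat_chains` and `joint_log_likelihood`, and the toy sequences are all assumptions made for the example, and the per-position log-likelihoods would have to come from whatever scoring step you actually run.

```python
from typing import Dict, List, Sequence, Tuple

MASK = "<mask>"  # assumed placeholder token; check the model's actual alphabet
GAP = 10         # number of mask tokens between chains, as suggested above


def concat_chains(
    chains: Dict[str, str], order: List[str], gap: int = GAP
) -> Tuple[List[str], Dict[str, Tuple[int, int]]]:
    """Join chain sequences into one token list with `gap` mask tokens
    between consecutive chains. Also return each chain's (start, end) span."""
    tokens: List[str] = []
    spans: Dict[str, Tuple[int, int]] = {}
    for i, chain_id in enumerate(order):
        if i > 0:
            tokens.extend([MASK] * gap)  # spacer between chains
        start = len(tokens)
        tokens.extend(chains[chain_id])  # one token per residue
        spans[chain_id] = (start, len(tokens))
    return tokens, spans


def joint_log_likelihood(
    per_token_ll: Sequence[float],
    spans: Dict[str, Tuple[int, int]],
    query_chains: List[str],
) -> float:
    """Sum per-position log-likelihoods over the query chains only,
    skipping the spacer positions and any other chains."""
    return sum(
        sum(per_token_ll[start:end])
        for start, end in (spans[c] for c in query_chains)
    )


# Toy sequences (hypothetical, far shorter than real chains).
chains = {"H": "EVQ", "L": "DI", "A": "MKT"}
tokens, spans = concat_chains(chains, order=["H", "L", "A"])
```

With spans recorded this way, the joint score for H + L is just the sum over their positions, so the spacer tokens and chain A never contribute to the total.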
Is it possible to use the log likelihood script to calculate the joint log likelihood of 2 input FASTA chains against the 3rd chain in a PDB file?
E.g., if the PDB file has chains H, L, and A, could we use the script to feed in the FASTA sequences of a query H and a query L, each of the same length as the corresponding chain in the structure, and then get the likelihood against chain A in the PDB file?
If the answer is "not easily", could we somehow rewrite the PDB file to pretend that H and L are the same chain, maybe by renaming both to a single chain "HL" and adding a link between the two in the PDB file, and then run the log likelihood on "HL" against the edited PDB file?
Thanks