BytedProtein / ByProt


Structural adapter query vector questions and decoding #1

Closed simonlevine closed 1 year ago

simonlevine commented 1 year ago

Hi, I have a few questions:

Thank you very much.

zhengzx-nlp commented 1 year ago

Hi Simon,

Sorry, I wasn't notified by GitHub about your comment! I'll answer your questions below:

  1. We designed our framework to be agnostic to the specific structure-encoder parameterization. We therefore used established, open-source SoTA protein structure models such as ProteinMPNN, PiFold, and ESM-IF's GVPTransformerEncoder. In these models, the protein structure/graph representation is built from both node and edge (pair) features, and we simply take the structure encoder's final "node" representations as the structural representation fed to the pLM decoder.
  2. Yes, dimension matching is needed. It is accomplished by the linear projectors for K and V in the adapter's attention module, where the query is the pLM hidden state and the key/value are the structure encodings. The (structural) key/value dimension (e.g., 256) is projected up to match the query dimension of the pLM (e.g., 768); see the sketch after this list.
  3. As mentioned in the answer to Q1, we only use the node representations, which are expected to already encode the protein's structural/spatial hierarchy properly.
  4. Actually, the structural encoding is the key/value, and the pLM hidden state is the query. I will check whether something in the manuscript is confusing on this point. Thank you for bringing this to our attention!
  5. We applied RoPE to the query (from the LM) and the key (from the structural encoding). RoPE is used to explicitly account for the relative positions of residues and thus better capture their spatial relationships; we found this to be helpful.
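
For concreteness, here is a minimal PyTorch sketch of what such a cross-attention adapter layer could look like: pLM hidden states act as queries, the structure encoder's node features (e.g., 256-d) are projected up to the pLM width (e.g., 768-d) as keys/values, and RoPE is applied to queries and keys. The names (`StructuralAdapterAttention`, `apply_rope`), the dimensions, and the RoPE formulation are illustrative assumptions, not the actual ByProt implementation.

```python
import torch
import torch.nn as nn


def apply_rope(x, positions):
    """Apply rotary position embeddings to the last dimension of x.

    x:         (batch, seq, heads, head_dim), head_dim must be even.
    positions: (batch, seq) residue indices.
    """
    half = x.shape[-1] // 2
    freqs = 1.0 / (10000 ** (torch.arange(half, dtype=x.dtype, device=x.device) / half))
    angles = positions[..., None, None].to(x.dtype) * freqs  # (batch, seq, 1, half)
    x1, x2 = x[..., :half], x[..., half:]
    return torch.cat(
        [x1 * torch.cos(angles) - x2 * torch.sin(angles),
         x1 * torch.sin(angles) + x2 * torch.cos(angles)],
        dim=-1,
    )


class StructuralAdapterAttention(nn.Module):
    """Cross-attention: pLM hidden states query structure-encoder node features."""

    def __init__(self, plm_dim=768, struct_dim=256, num_heads=12):
        super().__init__()
        assert plm_dim % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = plm_dim // num_heads
        # Query comes from the pLM; the K/V projectors lift the structural
        # node features (e.g., 256-d) up to the pLM width (e.g., 768-d).
        self.q_proj = nn.Linear(plm_dim, plm_dim)
        self.k_proj = nn.Linear(struct_dim, plm_dim)
        self.v_proj = nn.Linear(struct_dim, plm_dim)
        self.out_proj = nn.Linear(plm_dim, plm_dim)

    def forward(self, plm_hidden, struct_nodes, positions):
        # plm_hidden:   (batch, seq, plm_dim)    pLM decoder hidden states (query)
        # struct_nodes: (batch, seq, struct_dim) final node features from the structure encoder (key/value)
        # positions:    (batch, seq)             residue indices shared by both inputs
        b, n, _ = plm_hidden.shape
        q = self.q_proj(plm_hidden).view(b, n, self.num_heads, self.head_dim)
        k = self.k_proj(struct_nodes).view(b, n, self.num_heads, self.head_dim)
        v = self.v_proj(struct_nodes).view(b, n, self.num_heads, self.head_dim)
        # RoPE on query and key so attention scores reflect relative residue positions.
        q, k = apply_rope(q, positions), apply_rope(k, positions)
        attn = torch.softmax(
            torch.einsum("bqhd,bkhd->bhqk", q, k) / self.head_dim ** 0.5, dim=-1
        )
        out = torch.einsum("bhqk,bkhd->bqhd", attn, v).reshape(b, n, -1)
        return self.out_proj(out)
```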

I hope this helps address your questions! Please feel free to comment if you have any further questions.

Best, Zaixiang

simonlevine commented 1 year ago

Hi @zhengzx-nlp, thanks for your response. That all makes sense. Regarding point 4, there is a section on page 15 of the manuscript that reads "...the structural adapter composes a multihead attention (MULTIHEAD ATTN) that queries structure information from the structure encoder...". This would seem to imply Q ~ structure, rather than the pLM.