facebookresearch / esm

Evolutionary Scale Modeling (esm): Pretrained language models for proteins
MIT License

which layer's attention maps are used as template information? #265

Closed zhenyuhe00 closed 2 years ago

zhenyuhe00 commented 2 years ago

Hi, in your ESMFold paper you write:

> The second change involves the removal of templates. Template information is passed to the model as pairwise distances, input to the residue-pairwise embedding. We simply omit this information, passing instead the attention maps from the language model, as these have been shown to capture structural information well (38).

So you use attention maps in place of template information. I wonder whether they come from the last layer only, or from all the layers?

Thanks in advance!
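For context, one common way to use attention maps from *all* layers is to stack every (layer, head) map along a feature axis and project the result to the pairwise embedding width. The sketch below is a minimal illustration of that stacking, not ESMFold's actual implementation; the function name, the linear projection, and `pair_dim=128` are assumptions for the example. (With the `esm` package itself, calling the model with `need_head_weights=True` returns an `"attentions"` tensor of shape `(batch, layers, heads, tokens, tokens)` that could feed a step like this.)

```python
import torch

def attn_to_pair_features(attentions: torch.Tensor, pair_dim: int = 128) -> torch.Tensor:
    """Fold per-layer, per-head attention maps into a residue-pairwise feature tensor.

    attentions: (num_layers, num_heads, seq_len, seq_len)
    Returns:    (seq_len, seq_len, pair_dim)

    Hypothetical sketch: stacks all layers and heads, then applies a
    (randomly initialized) linear projection to the pairwise width.
    """
    num_layers, num_heads, seq_len, _ = attentions.shape
    # Move the (layer, head) axes to the end and flatten them into one
    # feature dimension per residue pair: (N, N, L * H)
    feats = attentions.permute(2, 3, 0, 1).reshape(seq_len, seq_len, num_layers * num_heads)
    # Project to the pairwise embedding width (assumed linear map)
    proj = torch.nn.Linear(num_layers * num_heads, pair_dim)
    return proj(feats)

# Toy example: 6 layers, 4 heads, 10 residues
attn = torch.softmax(torch.randn(6, 4, 10, 10), dim=-1)
pair = attn_to_pair_features(attn)
print(tuple(pair.shape))  # (10, 10, 128)
```

The key point the question is asking about is whether the first dimension here spans all layers or only the final one.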

zhenyuhe00 commented 2 years ago

Duplicate issue, opened twice due to a network error.