[todo] positional encoding

lucidrains / alphafold2

To eventually become an unofficial PyTorch implementation / replication of AlphaFold2, as details of the architecture get released

MIT License


Closed (lucidrains closed this issue 3 years ago)

lucidrains commented 3 years ago

either use the relative positional encoding (RPE) scheme from the paper, or see if rotary embeddings can be used where possible, with a T5-style relative position bias everywhere else
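
For reference, a minimal sketch of what a paper-style RPE could look like, assuming it is a clipped residue offset, one-hot encoded and linearly projected into the pair representation. The class name `ClippedRelPos`, the parameter names, and the defaults (`dim_pair=128`, `max_rel_dist=32`) are illustrative assumptions, not code from this repository.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ClippedRelPos(nn.Module):
    """Hypothetical sketch: clipped relative-offset embedding for pair features."""

    def __init__(self, dim_pair: int = 128, max_rel_dist: int = 32):
        super().__init__()
        self.max_rel_dist = max_rel_dist
        # one bin per offset in [-max_rel_dist, max_rel_dist]
        self.num_bins = 2 * max_rel_dist + 1
        self.to_pair = nn.Linear(self.num_bins, dim_pair)

    def forward(self, residue_index: torch.Tensor) -> torch.Tensor:
        # residue_index: (batch, n) integer (int64) positions along the chain
        offsets = residue_index[:, :, None] - residue_index[:, None, :]
        # clip long-range offsets, then shift into [0, num_bins - 1]
        bins = offsets.clamp(-self.max_rel_dist, self.max_rel_dist) + self.max_rel_dist
        one_hot = F.one_hot(bins, num_classes=self.num_bins).float()
        # project each one-hot offset to the pair feature dimension
        return self.to_pair(one_hot)  # (batch, n, n, dim_pair)
```

Usage would be along the lines of `ClippedRelPos()(torch.arange(n)[None])`, with the result added to the initial pair representation. Every pair of residues further apart than `max_rel_dist` in the same direction gets the same embedding.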

lucidrains commented 3 years ago

the RPE used in the paper is good, which explains why they use it to bias all the other attention in the network
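
A rough sketch of how that biasing might work, assuming the pair representation (which carries the RPE) is projected to one scalar per head and added to the attention logits before the softmax. `PairBiasAttention` and all of its parameters are hypothetical names for illustration. Details such as layer norm and gating are left out.

```python
import torch
import torch.nn as nn


class PairBiasAttention(nn.Module):
    """Hypothetical sketch: self-attention whose logits are biased by pair features."""

    def __init__(self, dim: int = 64, dim_pair: int = 128, heads: int = 4):
        super().__init__()
        assert dim % heads == 0, "dim must be divisible by heads"
        self.heads = heads
        self.scale = (dim // heads) ** -0.5
        self.to_qkv = nn.Linear(dim, dim * 3, bias=False)
        # one bias scalar per head for every (i, j) pair
        self.to_bias = nn.Linear(dim_pair, heads, bias=False)
        self.to_out = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor, pair: torch.Tensor) -> torch.Tensor:
        # x: (batch, n, dim), pair: (batch, n, n, dim_pair)
        b, n, _ = x.shape
        q, k, v = self.to_qkv(x).chunk(3, dim=-1)
        # split heads: (batch, heads, n, dim_head)
        q, k, v = (t.view(b, n, self.heads, -1).transpose(1, 2) for t in (q, k, v))
        logits = (q @ k.transpose(-1, -2)) * self.scale  # (batch, heads, n, n)
        bias = self.to_bias(pair).permute(0, 3, 1, 2)    # (batch, heads, n, n)
        attn = (logits + bias).softmax(dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(b, n, -1)
        return self.to_out(out)
```

Since the positional signal enters only through `pair`, the sequence tokens `x` need no positional embedding of their own in this setup.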