csjackson0 closed this 1 year ago
Nice work @csjackson0! @pascalnotin, my latest modeling PR (less verbose) would likely produce a merge conflict, since the code is different (I imported GPT2Attention rather than coding up an APTAttention, which @csjackson0 has made changes in).
Perhaps we should merge @csjackson0's PR first, and I'll then revisit my modeling branch to accommodate his changes and include APTAttention and APTBlock in my less verbose version.
How does that sound? @pascalnotin
LGTM @csjackson0 - nice work! And sounds good regarding the suggested plan @talkhanz! Merging this PR into main.
@othertea - fyi
This PR integrates rotary positional encodings into the APT model per issue #21. To test, I set

position_embedding = "rotary"

in the config.py file and ran the train.py script. I will keep the default position_embedding as "grouped_alibi".
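For context, below is a minimal sketch of the general rotary position embedding (RoPE) mechanism this PR adds. It is not the actual code from this PR; the function names (`build_rotary_cache`, `apply_rotary`) and tensor shapes are illustrative assumptions, not APT identifiers.

```python
# Illustrative RoPE sketch, assuming PyTorch and (batch, heads, seq, head_dim)
# query/key tensors; names here are hypothetical, not from the APT codebase.
import torch

def build_rotary_cache(seq_len: int, head_dim: int, base: float = 10000.0):
    # Per-dimension-pair rotation frequencies, as in the RoPE paper.
    inv_freq = 1.0 / (base ** (torch.arange(0, head_dim, 2).float() / head_dim))
    positions = torch.arange(seq_len).float()
    angles = torch.outer(positions, inv_freq)   # (seq_len, head_dim/2)
    emb = torch.cat((angles, angles), dim=-1)   # (seq_len, head_dim)
    return emb.cos(), emb.sin()

def rotate_half(x: torch.Tensor) -> torch.Tensor:
    # Pairwise rotation helper: (x1, x2) -> (-x2, x1).
    x1, x2 = x.chunk(2, dim=-1)
    return torch.cat((-x2, x1), dim=-1)

def apply_rotary(q, k, cos, sin):
    # Rotate queries and keys by position-dependent angles so that
    # attention scores depend only on relative positions.
    q_rot = q * cos + rotate_half(q) * sin
    k_rot = k * cos + rotate_half(k) * sin
    return q_rot, k_rot

# Usage: rotate q/k before computing attention scores.
q = torch.randn(1, 4, 16, 64)
k = torch.randn(1, 4, 16, 64)
cos, sin = build_rotary_cache(seq_len=16, head_dim=64)
q, k = apply_rotary(q, k, cos, sin)
```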