lancopku / label-words-are-anchors

Repository for Label Words are Anchors: An Information Flow Perspective for Understanding In-Context Learning
MIT License

anchor reweighting on GPT-J #10

Open Tincsvsv opened 8 months ago

Tincsvsv commented 8 months ago

Hello, if I apply anchor reweighting on GPT-J, what parts should I modify? What are the hardware requirements? Looking forward to your reply, thank you very much!

leanwang326 commented 8 months ago

Adapting some code in attentioner_for_train.py, i.e. adding gptj_attn and GPTJattentionermanager (like those in attentioner_for_attribution.py), should be sufficient. As for hardware requirements, you might need to load GPT-J in float16 to fit it on a 40 GB GPU. (I have not tried this myself; that is just an estimate.)
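
For a rough sense of why float16 matters here, the back-of-the-envelope sketch below estimates the memory taken by the model weights alone. It is not from the repository: the parameter count (about 6.05 billion for GPT-J-6B) is an approximate figure, and the helper name `weight_memory_gib` is made up for illustration. Activations kept for backpropagation and any other buffers come on top of this, so the real footprint will be higher.

```python
# Rough estimate of the GPU memory needed just to hold the model weights.
# ASSUMPTION: GPT-J-6B has roughly 6.05 billion parameters (approximate figure).
GPTJ_PARAMS = 6.05e9

# Bytes used to store one parameter in each floating-point format.
BYTES_PER_PARAM = {"float32": 4, "float16": 2}


def weight_memory_gib(num_params: float, dtype: str) -> float:
    """Memory in GiB for the weights only (no activations, gradients, or optimizer state)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3


if __name__ == "__main__":
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype}: {weight_memory_gib(GPTJ_PARAMS, dtype):.1f} GiB")
```

Under these assumptions the float16 weights take about half the space of the float32 weights, which leaves considerably more of a 40 GB card free for activations. With Hugging Face transformers, half precision is usually requested by passing `torch_dtype=torch.float16` to `from_pretrained`; whether that is enough for anchor reweighting on GPT-J is untested here as well.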