Tincsvsv opened 8 months ago
Hello, if I want to apply anchor reweighting to GPT-J, which parts should I modify? What are the hardware requirements? Looking forward to your reply, thank you very much!

Adapting some code in attentioner_for_train.py — i.e., adding a gptj_attn function and a GPTJattentionermanager (analogous to those in attentioner_for_attribution.py) — should be sufficient. As for hardware, you may need to load GPT-J in float16 to fit it on a 40GB GPU. (I have not tried this myself, so that is just an estimate.)
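To sanity-check the float16 suggestion, here is a rough back-of-the-envelope memory estimate (this is my own sketch, not from the repo; the ~6B parameter count for GPT-J is approximate, and it only counts raw weights, ignoring activations, optimizer state, and KV cache):

```python
def model_mem_gb(n_params: float, bytes_per_param: int) -> float:
    """Approximate memory footprint of model weights in GiB."""
    return n_params * bytes_per_param / 1024**3

GPTJ_PARAMS = 6e9  # GPT-J has roughly 6 billion parameters (approximate)

fp32_gb = model_mem_gb(GPTJ_PARAMS, 4)  # float32: 4 bytes per parameter
fp16_gb = model_mem_gb(GPTJ_PARAMS, 2)  # float16: 2 bytes per parameter

print(f"fp32 weights: ~{fp32_gb:.1f} GiB")  # ~22 GiB
print(f"fp16 weights: ~{fp16_gb:.1f} GiB")  # ~11 GiB
```

So in float16 the weights alone take roughly 11 GiB, leaving headroom on a 40GB card for activations and intermediate state, whereas float32 weights plus training overhead would be much tighter. If you use Hugging Face transformers to load the model, passing `torch_dtype=torch.float16` to `from_pretrained` is the usual way to get float16 weights.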