nishantsubramani / steering_vectors

Steering Vector Repo from "Extracting Latent Steering Vectors from Pretrained Language Models" - ACL2022 Findings

Steering without modifying GPT code? #4

jbmaxwell closed this issue 1 year ago

jbmaxwell commented 1 year ago

I see in the paper that you're passing x, z_steer, IL, and IT to the model's forward function. So I'm guessing you must have modified your GPT... Is that correct?

jbmaxwell commented 1 year ago

I was able to get something running using forward hooks.
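For anyone landing here later, this is a rough sketch of what the forward-hook approach can look like in PyTorch. It is not the paper's code and not the exact code I ran: the toy model stands in for a stack of GPT blocks, and `make_steering_hook`, `inject_layer`, and `inject_timesteps` are hypothetical names loosely mirroring z_steer, IL, and IT. The only library behavior it relies on is that a hook registered with `register_forward_hook` can replace a module's output by returning a value.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
hidden = 8

# Toy stand-in for a stack of transformer blocks (hypothetical, not a real GPT).
model = nn.Sequential(
    nn.Linear(hidden, hidden),
    nn.Tanh(),
    nn.Linear(hidden, hidden),
)

def make_steering_hook(z_steer, timesteps):
    """Build a forward hook that adds z_steer to the output at the given timesteps."""
    def hook(module, inputs, output):
        steered = output.clone()
        steered[:, timesteps, :] += z_steer
        return steered  # a non-None return value replaces the module's output
    return hook

x = torch.randn(1, 5, hidden)   # (batch, sequence, hidden)
z_steer = torch.ones(hidden)    # assumed steering vector, one value per hidden unit
inject_layer = model[0]         # plays the role of the injection layer
inject_timesteps = [0]          # plays the role of the injection timesteps

with torch.no_grad():
    baseline = model(x)
    handle = inject_layer.register_forward_hook(
        make_steering_hook(z_steer, inject_timesteps)
    )
    steered = model(x)
    handle.remove()             # detach the hook so later calls are unmodified
    restored = model(x)
```

With a real transformer the hooked block may return a tuple rather than a single tensor, in which case the hook would need to modify the hidden-state element and return the rebuilt tuple. The upside of doing it this way is that the model's own forward code stays untouched, and removing the handle restores the original behavior.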