frankaging closed 8 months ago
Update: it's hard to find the raw activation additions, so I will probably compute a model weight diff by loading https://huggingface.co/likenneth/honest_llama2_chat_7B and the original model to get the head diff, and then apply it.
The original implementation uses BauKit to do the intervention. I am hoping to show that we can save the weight diff along with the intervention config, so people can apply the activation diff directly.
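A minimal sketch of the weight-diff step described above, assuming both checkpoints are available as PyTorch state dicts. The `weight_diff` helper and the toy `nn.Linear` stand-ins are hypothetical and only illustrate the idea; they are not taken from honest_llama or BauKit.

```python
import torch
from torch import nn


def weight_diff(base_state, tuned_state, atol=0.0):
    """Return {param_name: tuned - base} for every parameter that differs.

    Hypothetical helper: parameters present only in the tuned model
    (for example, an added bias) are kept as-is.
    """
    diff = {}
    for name, tuned in tuned_state.items():
        base = base_state.get(name)
        if base is None:
            # Parameter exists only in the tuned checkpoint.
            diff[name] = tuned.clone()
        elif not torch.allclose(tuned, base, rtol=0.0, atol=atol):
            diff[name] = tuned - base
    return diff


# Toy stand-ins for the original and the edited checkpoint (assumption:
# the real models would be loaded separately and are far larger).
torch.manual_seed(0)
base = nn.Linear(4, 4, bias=False)
tuned = nn.Linear(4, 4, bias=True)
with torch.no_grad():
    tuned.weight.copy_(base.weight)  # weights left untouched
    tuned.bias.copy_(torch.tensor([0.0, 0.5, 0.0, -0.5]))  # the "edit"

diff = weight_diff(base.state_dict(), tuned.state_dict())
print(sorted(diff))  # names of the parameters that changed
```

With real checkpoints, the two state dicts would presumably come from loading each model and calling `.state_dict()`; the resulting `diff` could then be saved with `torch.save` next to the intervention config.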
Descriptions:
Interventions on activations at inference time to steer model behavior are a good application of this library and fit its ultimate goal well. Ideally, people should be able to easily share their steering mount points, along with the vectors to inject, with others.
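To make the "mount point plus injected vector" idea concrete, here is a rough sketch using a plain PyTorch forward hook. It does not use this library's API or BauKit; the `add_steering_hook` helper, the `config` layout, and the toy model are all assumptions made for illustration.

```python
import torch
from torch import nn


def add_steering_hook(model, mount_point, vector, alpha=1.0):
    """Add alpha * vector to the output of the submodule named mount_point.

    Hypothetical helper built on nn.Module.register_forward_hook.
    Returns the hook handle so the intervention can be removed later.
    """
    module = model.get_submodule(mount_point)

    def hook(_module, _inputs, output):
        # Returning a value from a forward hook replaces the module output.
        return output + alpha * vector

    return module.register_forward_hook(hook)


# Toy model standing in for a transformer (assumption).
torch.manual_seed(0)
model = nn.Sequential(nn.Linear(3, 3), nn.ReLU(), nn.Linear(3, 2))

# A shareable intervention: where to mount, how strongly, and what to add.
config = {"mount_point": "0", "alpha": 2.0}
vector = torch.tensor([1.0, 0.0, -1.0])

x = torch.randn(1, 3)
with torch.no_grad():
    clean_hidden = model[0](x)
    handle = add_steering_hook(
        model, config["mount_point"], vector, config["alpha"]
    )
    steered_hidden = model[0](x)  # shifted by alpha * vector
    handle.remove()
    restored_hidden = model[0](x)  # back to the clean activation
```

The design choice here is that `config` and `vector` are plain data, so they can be serialized and shared independently of the model weights.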
Original GitHub: https://github.com/likenneth/honest_llama