Open Bachstelze opened 3 months ago
Good question: The library builds upon nanoGPT, so instruction-tuned GPT-2 models shouldn't be a problem (probably just a matter of updating the list of available models). Which model are you interested in specifically?
LLAMA-2 series models
LLaMA-2 models are not GPT-2 models, AFAIK, so they are not supported at the moment.
How does the attention look in instruction-tuned GPTs? reference: https://github.com/jessevig/bertviz/issues/128
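Since instruction tuning doesn't change the architecture, attention in an instruction-tuned GPT-2 can be inspected the same way as in the base model. Here is a minimal sketch using Hugging Face `transformers` with a tiny randomly initialised GPT-2 config (no checkpoint download), just to show where the per-layer attention tensors come from; for a real inspection you would load an actual instruction-tuned GPT-2 checkpoint instead.

```python
import torch
from transformers import GPT2Config, GPT2Model

# Tiny random GPT-2 purely to illustrate the attention output shapes;
# replace with a real instruction-tuned GPT-2 checkpoint for actual analysis.
config = GPT2Config(n_layer=2, n_head=2, n_embd=64, vocab_size=100)
model = GPT2Model(config)
model.eval()

input_ids = torch.randint(0, 100, (1, 8))  # batch of 1, sequence length 8
with torch.no_grad():
    out = model(input_ids, output_attentions=True)

# out.attentions is a tuple with one tensor per layer,
# each shaped (batch, num_heads, seq_len, seq_len)
attentions = out.attentions
print(len(attentions), tuple(attentions[0].shape))
```

These per-layer tensors are exactly what tools like bertviz visualise, so the view for an instruction-tuned GPT-2 looks structurally identical to the base model's.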