TransformerLensOrg / TransformerLens

A library for mechanistic interpretability of GPT-style language models
https://transformerlensorg.github.io/TransformerLens/
MIT License

[Proposal] Support BERT Model Series (and/or T5). #258

Closed jbloomAus closed 1 year ago

jbloomAus commented 1 year ago

Proposal

A glaring gap in the models supported by TransformerLens is the lack of support for encoder models like BERT and encoder-decoder models like T5. In order to rectify this, we need to build support for encoder-only and encoder-decoder architectures, cross-attention, and likely a few other details.

Motivation

  • I've been asked a number of times whether we have T5 or BERT support coming.
  • I think this is a reasonable contribution for people skilling up. You can ask for help or collaborate with others working on building it.
  • I (Joseph) and likely others will be able to give feedback on your work and pitch in if needed!

Pitch

In order to complete this card we are going to build a class called HookedEncoderDecoder which can be instantiated with the BERT or T5 weights from Hugging Face (https://huggingface.co/bert-base-uncased) loaded in. This is the same strategy the package uses for other models, so there's nothing fundamentally new, only components we haven't yet built with hooks. Once we have a class with the corresponding weights loaded, we want to ensure all components are "hooked" correctly, enabling methods like .run_with_cache to work. It's not as complicated as it sounds to get this working.

The "best" version of this is possibly tricky, but we're willing to tolerate some code duplication as long as we're confident the implementation is accurate (i.e. tests show the hooked activations are correct and inference is identical to what we get without using TransformerLens).
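For concreteness, here is a rough sketch of what the end state could look like for a user, assuming the new class mirrors HookedTransformer's from_pretrained / run_with_cache interface. The class name, model name, and hook names are illustrative guesses only, not a settled API.

```python
# Illustrative sketch only -- the class and hook names below are hypothetical and
# assume the proposed encoder-decoder class mirrors HookedTransformer's API.
import torch
from transformer_lens import HookedEncoderDecoder  # hypothetical class from this proposal

model = HookedEncoderDecoder.from_pretrained("t5-small")  # load HF weights into hooked modules
tokens = model.to_tokens("translate English to German: The house is wonderful.")

# run_with_cache should return the model output plus a cache of every hooked activation
logits, cache = model.run_with_cache(tokens)

# e.g. inspect a cross-attention pattern in the first decoder block
# (the hook name here is a guess at what the naming convention might look like)
pattern = cache["decoder.0.cross_attn.hook_pattern"]
print(pattern.shape)  # [batch, n_heads, query_pos, key_pos]
```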

Tasks:

  • Make a fork for BERT support (keep it synced if building this takes more than a moment).
  • Write some tests to be used during development; they should cover any novel architecture components. I'd start with two placeholder tests: 1. do you have a model architecture you can load the weights into, and 2. given that architecture, does it produce results identical to the Hugging Face model (a rough sketch of such a test follows this list).
  • Load the BERT model and look at each component. Maybe read the BERT paper if needed for context. Make a list of every nn.Module we don't currently support. Some items that will be on that list: a linear block before the unembed, post-LayerNorm (can't be unfolded, RIP), and sinusoidal positional embeddings. T5 might have other stuff.
  • Keep in mind you might need a new config class to configure the entire architecture. If so, build it.
  • If you can pass the test for the model doing inference correctly, and show that you can use .run_with_cache and all the other methods that work on HookedTransformer (within reason), then we're likely ready to merge.
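As referenced in the list above, a rough sketch of the second placeholder test. The HookedEncoder name and from_pretrained/forward signatures are assumptions about the class this issue proposes; the reference side is standard transformers usage.

```python
# Sketch of an equivalence test against Hugging Face. The HookedEncoder name and
# signatures are assumptions about the proposed class; the HF side is standard usage.
import torch
from transformers import AutoTokenizer, BertForMaskedLM


def test_matches_huggingface():
    name = "bert-base-uncased"
    tokenizer = AutoTokenizer.from_pretrained(name)
    hf_model = BertForMaskedLM.from_pretrained(name).eval()

    from transformer_lens import HookedEncoder  # assumed import path for the new class
    tl_model = HookedEncoder.from_pretrained(name)

    tokens = tokenizer("The [MASK] sat on the mat.", return_tensors="pt")["input_ids"]
    with torch.no_grad():
        hf_logits = hf_model(tokens).logits
        tl_logits = tl_model(tokens)  # assumed to return MLM logits of the same shape

    # Inference should be numerically identical (up to float tolerance)
    torch.testing.assert_close(tl_logits, hf_logits, rtol=1e-4, atol=1e-4)
```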

Some things to note:

  • You should refine this to-do list and add detail as you go.
  • Writing tests should make your life easier and create feedback cycles. It will also make it easier if multiple people want to collaborate on this.
  • A previous PR attempting BERT support is here: https://github.com/neelnanda-io/TransformerLens/pull/82. Unfortunately it drifted too far from main to be merged. To avoid this, I recommend either syncing regularly or handing the PR to someone else to finish if you can't.
  • Possibly quite useful: a link to the original ARENA materials where we built BERT (without the hooks) during the bootcamp: https://github.com/callummcdougall/arena-v1/blob/main/w2d2/solutions_build_bert.py

Additional context

Checklist

neelnanda-io commented 1 year ago

Some caveats: BERT is encoder-only, not encoder-decoder. It should be a HookedBERT or HookedEncoder class, in my opinion.

Note that post-LayerNorm means "LayerNorm applied after the layer". Confusingly, the LayerNormPre class means "LayerNorm with its weights folded into the subsequent layer". Pre-LayerNorm, when used in papers, means LayerNorm at the start of a layer. *shrug* Naming conventions are hard, man.
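To illustrate the "folded into the subsequent layer" part: LayerNorm's affine parameters are just an elementwise scale and shift, so they can be absorbed into the next linear layer's weights and bias. A minimal numerical check in plain PyTorch (not TransformerLens internals), just to show the algebra the folding relies on:

```python
# Minimal check that LayerNorm's affine params (gamma, beta) fold into the next linear
# layer: W' = W * gamma (columnwise), b' = b + W @ beta. Plain PyTorch, for illustration.
import torch

d_model, d_out = 16, 8
x = torch.randn(4, d_model)

ln = torch.nn.LayerNorm(d_model)
with torch.no_grad():                 # give the LayerNorm non-trivial gamma/beta
    ln.weight.normal_()
    ln.bias.normal_()
lin = torch.nn.Linear(d_model, d_out)

# Reference: LayerNorm (with affine params) followed by the linear layer.
ref = lin(ln(x))

# Folded: normalise only, then use the adjusted weights and bias.
normed = (x - x.mean(-1, keepdim=True)) / torch.sqrt(
    x.var(-1, keepdim=True, unbiased=False) + ln.eps
)
W_folded = lin.weight * ln.weight              # scale each input column by gamma
b_folded = lin.bias + lin.weight @ ln.bias     # absorb beta into the bias
folded = normed @ W_folded.T + b_folded

print(torch.allclose(ref, folded, atol=1e-5))  # True
```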


rusheb commented 1 year ago

I'd like to work on this! I'll start looking into it tomorrow.

alan-cooney commented 1 year ago

@rusheb nice! It would be a good idea to create a simple PR first that splits the components from https://github.com/neelnanda-io/TransformerLens/blob/main/transformer_lens/components.py into different files

neelnanda-io commented 1 year ago

Why? To me that seems worse - having all components in the same place, in a file where you can easily Ctrl+F and jump to definitions, seems great. And adding some BERTBlocks or CrossAttention components to that file seems totally fine to me.


jbloomAus commented 1 year ago

If you're developing and end up with the same file open multiple times, that feels like a tell to me, but if that's not happening it's probably fine. Seems like personal preference?

rusheb commented 1 year ago

Haha, uh oh, I just spent a couple of hours on this.

I see both sides of the argument. I think it's pythonic to have all the components in one file, and I'm also having some issues getting the imports right when moving them to multiple files. OTOH it is a super long file, and splitting components into separate files might make it easier to factor out duplication.

In my implementation I was planning to put all components in a components module so they would still be easy to find/search through. And obviously IDEs make searching pretty easy in either scenario.
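For the import issue, one option (just a sketch of the layout being discussed, not an agreed convention for this repo) is to re-export everything from the package's `__init__.py`, so existing `from transformer_lens.components import ...` statements keep working. The component class names below exist in the current components.py; the per-file split is illustrative.

```python
# transformer_lens/components/__init__.py  (sketch of the layout under discussion)
# Re-exporting keeps `from transformer_lens.components import Attention, MLP` working
# even after each component moves into its own file. File names are illustrative.
from .attention import Attention
from .mlp import MLP
from .layer_norm import LayerNorm, LayerNormPre

__all__ = ["Attention", "MLP", "LayerNorm", "LayerNormPre"]
```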

I'll probably pause on this until we get to a resolution. Branch here if anyone wants to look: https://github.com/rusheb/TransformerLens/tree/rusheb-components

rusheb commented 1 year ago

I think it's minor. If Neel doesn't want to move them then maybe best to just keep everything in the one file and focus on the important parts of this ticket?

alan-cooney commented 1 year ago

> Why? To me that seems worse - having all components in the same place, in a file where you can easily Ctrl+F and jump to definitions, seems great. And adding some BERTBlocks or CrossAttention components to that file seems totally fine to me.

Yeah I get it's not crazy yet, although 800 lines is quite a bit for newcomers (as you have to skim a bit to understand what is there). It was really just a suggestion based on the direction of travel with the repo - as the methods start getting split up for unit tests, it's going to get into the thousands.

Anyway fine for leaving as is and sorry about the extra work @rusheb!

alan-cooney commented 1 year ago

For what it's worth if we keep the split we can probably just drop in all these tests - https://github.com/skyhookadventure/transformer-from-scratch/tree/main/transformer_from_scratch/components/tests

rusheb commented 1 year ago

Sneak preview!!

rusheb commented 1 year ago

Based on various offline discussions, the MVP of this will be BERT for masked language modelling -- basically the same architecture as Hugging Face's BertForMaskedLM: the core BERT module plus a language modelling head.

Other tasks/outputs (e.g. embeddings, next sentence prediction, causal language modelling) are considered out of scope for the first iteration.
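For reference, this is the behaviour the MVP is targeting, shown with the standard Hugging Face model (a quick fill-in-the-mask check; nothing TransformerLens-specific here, just the baseline the hooked implementation should match).

```python
# The reference behaviour the MVP targets: BERT plus an MLM head filling a [MASK].
# Standard Hugging Face usage, shown only as the baseline to match.
import torch
from transformers import AutoTokenizer, BertForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased").eval()

inputs = tokenizer("The capital of France is [MASK].", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits                  # [batch, seq, vocab]

mask_pos = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero()[0, 1]
top_token = logits[0, mask_pos].argmax().item()
print(tokenizer.decode([top_token]))                 # most likely "paris"
```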

fzyzcjy commented 6 months ago

Hi, are there any updates? I do hope T5 can be supported. Thanks!

neelnanda-io commented 6 months ago

BERT support was added, but T5 was not, I believe (see HookedEncoder). We'd welcome a PR adding T5 if you're interested! I'm not sure there's the capacity to add it ourselves. What's the use case?
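For anyone landing here, a minimal usage sketch of the HookedEncoder class mentioned above. Exact forward signature, return values, and supported model names may differ between TransformerLens versions, so treat this as a rough guide rather than the canonical API.

```python
# Rough usage sketch for the shipped HookedEncoder (BERT); details such as the exact
# forward signature and supported model names may vary between TransformerLens versions.
import torch
from transformer_lens import HookedEncoder

model = HookedEncoder.from_pretrained("bert-base-cased")
tokens = model.tokenizer(
    "TransformerLens now supports [MASK].", return_tensors="pt"
)["input_ids"]

with torch.no_grad():
    logits, cache = model.run_with_cache(tokens)

# The cache exposes the usual hook names, e.g. attention patterns in block 0
# (assuming the encoder reuses the standard Attention component's hooks).
print(cache["blocks.0.attn.hook_pattern"].shape)
```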


fzyzcjy commented 6 months ago

@neelnanda-io Thank you! I am interested and hope to squeeze some time trying that (but cannot guarantee).

What's the use case?

You know, many tasks, such as machine translation, use an encoder-decoder architecture instead of a decoder-only one.

neelnanda-io commented 6 months ago

Makes sense! Note that the nnsight library works out of the box for any PyTorch model. It's accordingly more lightweight than TransformerLens, but likely still useful.
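In the meantime, if you just need activations out of a T5 model, plain PyTorch forward hooks on the Hugging Face implementation already go a long way. This is ordinary torch/transformers usage shown as an illustration of the lightweight approach; it is not the nnsight API or TransformerLens code.

```python
# Capturing T5 encoder activations with ordinary PyTorch forward hooks.
# Plain torch/transformers usage, shown only as the "lightweight" alternative.
import torch
from transformers import AutoTokenizer, T5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small").eval()

activations = {}

def save_hook(name):
    def hook(module, inputs, output):
        # T5 block modules return tuples; keep the hidden-states tensor
        activations[name] = output[0] if isinstance(output, tuple) else output
    return hook

# Register a hook on each encoder block
for i, block in enumerate(model.encoder.block):
    block.register_forward_hook(save_hook(f"encoder.block.{i}"))

inputs = tokenizer(
    "translate English to German: The house is wonderful.", return_tensors="pt"
)
with torch.no_grad():
    model.generate(**inputs, max_new_tokens=10)

print({k: v.shape for k, v in activations.items()})
```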


fzyzcjy commented 6 months ago

I see. Thank you!