TransformerLensOrg / TransformerLens

A library for mechanistic interpretability of GPT-style language models
https://transformerlensorg.github.io/TransformerLens/
MIT License
1.17k stars 241 forks source link

Add tests for ActivationCache #643

Closed FlyingPumba closed 1 week ago

FlyingPumba commented 2 weeks ago

Description

I've added more tests for ActivationCache, increasing coverage from 64% to 95%. In the process I also fixed some minor bugs arising in border cases.

Type of change

Please delete options that are not relevant.

Screenshots

image

Checklist:

FlyingPumba commented 2 weeks ago

There are some checks failing (Compatibility) in the PR, but they seem unrelated to these changes.

bryce13950 commented 2 weeks ago

The failures do appear to be related to the changes unfortunately. Specifically, it has something to do with your typing. I would check your use of the pipe (|) on different versions of python. On python 3.8, the support is basically non-existent. For 3.9, it's not quite as dramatic, but there are still certain uses that were not supported. We need to at least support 3.9, but I do not mind dropping support for 3.8. See if you can get it to run on 3.9, and I will take a look into dropping support for 3.8.

FlyingPumba commented 2 weeks ago

This was indeed my mistake. It didn't occur to me that the union typing using pipes (e.g., A | B) would have limited support in Python 3.9. I have trivially changed them to use Union (as the rest of the project does). The compatibility checks should now work for both Python 3.8 and 3.9

bryce13950 commented 1 week ago

Great! A very important question for you before this gets merged. I see you are changing the typing for the shape of the tensor returned from accumulated_bias. What was the reason for that? If people are writing code expecting the current shape returned from that function, then changing it could break their code, thus requiring a major release. I see that you didn't actually change the internals of the function. Is this a case of that return type being incorrect before you revised the test?

FlyingPumba commented 1 week ago

The typing was indeed incorrect for the accumulated_bias method. You can see here that the initial shape for the returning tensor is d_model, and then the method proceeds to just sum the biases, which are all d_model size. Also, the "Returns" in the docstring for the method says that the output is a Tensor of shape [d_model]. I found this incorrect type when I used accumulated_bias in one of the new tests.