Open HannesThurnherr opened 2 months ago
Can you add a link to the models on HuggingFace, and a link to the source code? Most likely you will be able to reuse the majority of the existing components, but some new components will need to be created.
My personal inclination would be to just make this into another repo that builds on TransformerLens. What's the case for making this part of the core repo?
> Can you add a link to the models on HuggingFace, and a link to the source code? Most likely you will be able to reuse the majority of the existing components, but some new components will need to be created.
I've added the links to the issue. The code is behind the "another repo" link, which points to our TracrBench repo.
> My personal inclination would be to just make this into another repo that builds on TransformerLens. What's the case for making this part of the core repo?
We are happy to make it its own repo. The case for making it part of TransformerLens is that the point of the dataset and the paper is to make using Tracr for evaluating interp methods as easy as possible, and integrating this directly into TransformerLens would really help with that. If we make it its own repo, maybe the project could be mentioned in the docs somewhere?
Proposal
Add support for TracrBench transformers
Motivation
@JeremyAlain and I recently wrote a paper in which we introduced a dataset of 121 Tracr transformers. Tracr transformers are meant to be used as test beds or "sanity checks" in the development of novel interpretability methods. To make them as accessible as possible, we converted them from the DeepMind-internal Haiku framework to HookedTransformers (following this template made by Neel). We would like, and have been asked by multiple people, to make these toy models available from within TransformerLens.
Pitch
We have all the models uploaded to Hugging Face, and I have code to load them. It's a little different from the code used to load typical LLMs: since each model requires input and output encoders, we wrap the HookedTransformer class in another simple class called "TracrModel".
My question is whether this is possible and, if so, where to put this code (the tracr_models.py file).
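To make the shape of the proposal concrete, here is a minimal sketch of what such a wrapper could look like. This is an illustration only: the class name `TracrModel` and the idea of input/output encoders come from the issue text, but the encoder internals and the `SimpleEncoder` helper here are hypothetical stand-ins, not the actual TracrBench code (which lives in the linked repo).

```python
# Hypothetical sketch of the "TracrModel" wrapper described in the Pitch.
# In practice `model` would be a transformer_lens.HookedTransformer; here
# it is any callable mapping a list of token ids to per-position logits.

class SimpleEncoder:
    """Illustrative encoder mapping between Tracr vocabulary tokens and ids."""

    def __init__(self, vocab):
        self.token_to_id = {tok: i for i, tok in enumerate(vocab)}
        self.id_to_token = {i: tok for tok, i in self.token_to_id.items()}

    def encode(self, tokens):
        return [self.token_to_id[t] for t in tokens]

    def decode(self, ids):
        return [self.id_to_token[i] for i in ids]


class TracrModel:
    """Wraps a model together with its input/output encoders, so callers
    can pass RASP-style token sequences directly instead of raw ids."""

    def __init__(self, model, input_encoder, output_encoder):
        self.model = model
        self.input_encoder = input_encoder
        self.output_encoder = output_encoder

    def __call__(self, tokens):
        ids = self.input_encoder.encode(tokens)
        logits = self.model(ids)  # forward pass through the wrapped model
        # Greedy-decode each position's logits back into output tokens.
        predicted = [max(range(len(row)), key=row.__getitem__) for row in logits]
        return self.output_encoder.decode(predicted)
```

If something along these lines went into the core library, the wrapper and a loader function could plausibly sit together in the proposed tracr_models.py, analogous to how other model-loading utilities are organized.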
Alternatives
An alternative would be to put the code for downloading the Tracr models for use within TransformerLens in another repo.