TransformerLensOrg / TransformerLens

A library for mechanistic interpretability of GPT-style language models
https://transformerlensorg.github.io/TransformerLens/
MIT License

Move out pretrained weight conversion functions #633

Closed. richardkronick closed this 1 week ago

richardkronick commented 3 weeks ago

Description

This PR moves all of the weight conversion functions out of loading_from_pretrained.py and into their own individual files to improve manageability. It also includes one unit test to ensure that the functions are still accessible. More unit tests will be added in the future.
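As a rough illustration of the kind of accessibility check described above, here is a minimal sketch. It is not the test included in this PR. The stub package, the helper `missing_converters`, and the name `convert_gpt2_weights` are hypothetical and used only for illustration; `convert_t5_weights` is the one function name mentioned in this thread. A real test would import the actual package instead of building a stub.

```python
import types


def build_stub_package():
    """Build a stand-in for a package that re-exports one conversion function per file.

    Hypothetical: this only mimics the layout so the sketch runs on its own.
    """
    def convert_gpt2_weights(model, cfg):  # hypothetical name, placeholder body
        return {}

    def convert_t5_weights(model, cfg):  # name from this thread, placeholder body
        return {}

    pkg = types.ModuleType("weight_conversions_stub")
    pkg.convert_gpt2_weights = convert_gpt2_weights
    pkg.convert_t5_weights = convert_t5_weights
    return pkg


def missing_converters(package, expected_names):
    """Return the expected names that are absent from the package or not callable."""
    return [
        name
        for name in expected_names
        if not callable(getattr(package, name, None))
    ]


EXPECTED = ["convert_gpt2_weights", "convert_t5_weights"]
stub = build_stub_package()

# The check passes when every expected function is reachable from the package.
assert missing_converters(stub, EXPECTED) == []
```

Checking names against an explicit list means a function that is moved but not re-exported shows up as a named failure rather than as an import error somewhere else.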

bryce13950 commented 3 weeks ago

One more conversion function needs to be moved out; it was added after work on this PR started. It is the one for t5, and you will see it once you pull the most recent changes into your branch.

richardkronick commented 3 weeks ago

convert_t5_weights has been moved to its own file.