This PR ports features from my transformerlens-model-table repo to TransformerLens, implementing many of the features requested in #97. I still need some feedback on this, and presumably building docs will fail for one reason or another once I make the PR.
Features:
The static table gains a few more fields, but the primary focus is the interactive table, which provides:
- information on parallel attn/MLPs, positional embeddings, and other config elements
- filtering and searching on any column (e.g. sort by parameter count and show only models with standard positional embeddings)
- links back to the Hugging Face model page, where applicable (extracted from the "official model name")
- tokenizer information, including a vocab hash (feedback welcome on whether there is a better way to do this)
- the full config, shown as title text on hover or in a new window
- an organized view of the dimensions of all tensors in the state dict and activation cache (obtained by setting the device to `meta`, so the models never actually have to be loaded)
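Since the vocab hash is the part I am least sure about, here is a minimal sketch of one way such a hash could be computed. This is not necessarily what the PR does. `vocab_hash` is a hypothetical helper name, and the choice of SHA-256 truncated to 16 hex characters is an assumption for illustration only.

```python
import hashlib
import json


def vocab_hash(vocab: dict[str, int], length: int = 16) -> str:
    """Hypothetical helper: stable short hash of a token -> id mapping."""
    # Sort by token id (then token) so dict insertion order cannot change the hash.
    items = sorted(vocab.items(), key=lambda kv: (kv[1], kv[0]))
    # Canonical JSON: fixed separators and escaped non-ASCII give reproducible bytes.
    payload = json.dumps(items, ensure_ascii=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:length]


# Toy vocab for illustration; a real one could come from a tokenizer's get_vocab().
toy = {"hello": 0, "world": 1, "<pad>": 2}
reordered = {"<pad>": 2, "world": 1, "hello": 0}
print(vocab_hash(toy) == vocab_hash(reordered))  # True
```

Hashing a canonical serialization rather than the raw dict means two models that share a tokenizer get the same value regardless of how the vocab was built, which would make "same tokenizer" filterable in the table.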
[WIP]
Description
Adds dependencies under group `docs`:
- `tiktoken` for dealing with certain tokenizers
- `muutils` for pretty-printed data on tensor shapes
Type of change
Screenshots
Before:
Original model properties table
After (static):
You can see what the generated data looks like here
After (interactive):
See demo
Checklist:
(currently draft PR, testing incomplete)