AlignmentResearch / tuned-lens

Tools for understanding how transformer predictions are built layer-by-layer
https://tuned-lens.readthedocs.io/en/latest/
MIT License

Refactor model surgery to no longer use heuristics #59

Closed levmckinney closed 1 year ago

levmckinney commented 1 year ago

The way model_surgery.py is currently implemented appears to be causing a lot of issues (#57, #56). It should be rewritten as a class that can be configured with explicit rules for looking up each model component in each model class, and it should fail early with useful errors.

Furthermore, it should be extensible so that, ideally, we can eventually integrate the lens with non-Hugging Face transformers like the built-in PyTorch transformer. Basically, it should act as an interface between the tuned-lens code and the transformer.
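A rough sketch of what such a rule-based registry could look like. All names here (`SurgeryRules`, `ModelSurgery`, the attribute paths, and the dummy model in the usage note) are hypothetical illustrations of the proposed design, not the actual tuned-lens API:

```python
# Hypothetical sketch: replace heuristic component lookup with explicit,
# per-model-class rules that fail loudly when a model is unsupported.

from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class SurgeryRules:
    """Explicit attribute paths to the components of one model class."""
    layers: str       # e.g. "transformer.h" for GPT-2-style models
    final_norm: str   # e.g. "transformer.ln_f"
    unembed: str      # e.g. "lm_head"


class ModelSurgery:
    """Interface between the lens code and a transformer implementation."""
    _registry: dict = {}

    @classmethod
    def register(cls, model_cls, rules: SurgeryRules) -> None:
        # Each supported model class gets explicit rules, so new backends
        # (including non-Hugging Face ones) can be added without heuristics.
        cls._registry[model_cls] = rules

    @classmethod
    def get(cls, model, component: str):
        for model_cls, rules in cls._registry.items():
            if isinstance(model, model_cls):
                path = getattr(rules, component)
                # Walk the dotted attribute path, e.g. "transformer.h".
                return reduce(getattr, path.split("."), model)
        raise ValueError(
            f"No surgery rules registered for {type(model).__name__}; "
            "register explicit lookup rules instead of relying on heuristics."
        )
```

Usage would then be something like registering `SurgeryRules(layers="transformer.h", final_norm="transformer.ln_f", unembed="lm_head")` for a GPT-2-style model class, after which `ModelSurgery.get(model, "layers")` resolves the component or raises a clear error for unregistered models.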

levmckinney commented 1 year ago

One obvious solution would be to have the tuned-lens project depend on the TransformerLens repo and simply use its HookedTransformer class.