The current implementation of `model_surgery.py` appears to be causing a lot of issues (#57, #56). It should be rewritten as a class that can be configured with explicit rules for looking up each model component in each model class. It should fail loudly and with useful errors.
Furthermore, it should be extensible, so that ideally we can eventually integrate the lens with non-Hugging Face transformers such as the built-in PyTorch transformer. Basically, it should act as an interface between the tuned-lens code and the transformer.
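
To make the proposal concrete, here is a rough sketch of what such an interface could look like. Everything in it is hypothetical: `ModelInterface`, `ComponentRules`, `ModelSurgeryError`, the component names, and the `ToyModel` stand-in are illustrative assumptions, not existing tuned-lens API. It uses plain Python objects so the idea can be read without any model dependencies.

```python
from dataclasses import dataclass, fields
from types import SimpleNamespace
from typing import Any, Dict, Type


class ModelSurgeryError(LookupError):
    """Raised when a component cannot be located on a model (hypothetical)."""


@dataclass(frozen=True)
class ComponentRules:
    """Explicit dotted attribute paths for one model class (hypothetical)."""
    layers: str
    final_norm: str
    unembedding: str


class ModelInterface:
    """Maps model classes to explicit lookup rules instead of guessing."""

    def __init__(self) -> None:
        self._rules: Dict[Type, ComponentRules] = {}

    def register(self, model_cls: Type, rules: ComponentRules) -> None:
        self._rules[model_cls] = rules

    def _rules_for(self, model: Any) -> ComponentRules:
        # Walk the MRO so subclasses inherit the rules of a registered parent.
        for cls in type(model).__mro__:
            if cls in self._rules:
                return self._rules[cls]
        known = sorted(c.__name__ for c in self._rules) or ["<none>"]
        raise ModelSurgeryError(
            f"No rules registered for {type(model).__name__}. "
            f"Registered classes: {', '.join(known)}."
        )

    def get(self, model: Any, component: str) -> Any:
        rules = self._rules_for(model)
        valid = [f.name for f in fields(rules)]
        if component not in valid:
            raise ModelSurgeryError(
                f"Unknown component {component!r}; expected one of {valid}."
            )
        path = getattr(rules, component)
        obj, walked = model, []
        for name in path.split("."):
            if not hasattr(obj, name):
                raise ModelSurgeryError(
                    f"{type(model).__name__}: rule for {component!r} is {path!r}, "
                    f"but {'.'.join(walked) or 'the model'} has no attribute {name!r}."
                )
            obj = getattr(obj, name)
            walked.append(name)
        return obj


class ToyModel:
    """Stand-in for a real transformer; the attribute layout is made up."""

    def __init__(self) -> None:
        self.transformer = SimpleNamespace(h=["block0", "block1"], ln_f="norm")
        self.lm_head = "unembed"


interface = ModelInterface()
interface.register(
    ToyModel,
    ComponentRules(
        layers="transformer.h",
        final_norm="transformer.ln_f",
        unembedding="lm_head",
    ),
)
layers = interface.get(ToyModel(), "layers")
```

Supporting a new architecture, including a non-Hugging Face one, would then mean registering one more set of rules rather than editing the lookup logic, and an unsupported model or a stale rule raises an error that names the model class and the path that failed.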