Description:
Currently, every model must have a config, and that config must inherit from the transformers library's config class. The model config is only used to get the components' dimensions.
We should, however, allow config-less models, where the dimensions are either read directly from a config dict or figured out dynamically by helper functions.
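A minimal sketch of what the config-less path could look like, assuming the model's settings arrive as a plain dict. The helper name `get_dimension`, the key aliases, and the `fallback` callback are all hypothetical and are not part of pyvene's API.

```python
from typing import Callable, Mapping, Optional, Sequence

# Hypothetical key aliases; real models may name the hidden size differently.
HIDDEN_SIZE_KEYS = ("hidden_size", "n_embd", "d_model")


def get_dimension(
    config: Optional[Mapping[str, int]],
    keys: Sequence[str] = HIDDEN_SIZE_KEYS,
    fallback: Optional[Callable[[], int]] = None,
) -> int:
    """Read a dimension from a plain config dict, else ask a helper function."""
    if config is not None:
        for key in keys:
            if key in config:
                return int(config[key])
    if fallback is not None:
        # No usable config entry: let a helper work the size out dynamically.
        return int(fallback())
    raise KeyError(f"none of {tuple(keys)} found and no fallback given")


# Usage: a dict-backed model first, then a config-less model with a helper.
print(get_dimension({"n_embd": 768}))
print(get_dimension(None, fallback=lambda: 512))
```

Keeping the lookup behind one function means callers would not need to care whether the size came from a dict or from a dynamic probe.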
Case 2:
pyvene dynamically figures out the input and output dimensions of all the modules (check out torch.compile).
Meanwhile, the intervening component can accept an arbitrary model component string, e.g. model.h[2].attn.c_proj.output, and we can resolve the component dynamically.
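One way the component string could be resolved is sketched below, under a few assumptions: a leading `model` refers to the root object, a trailing `input` or `output` selects which side of the module to intervene on, and indices are plain integers. The function `resolve_component` is hypothetical, and the toy objects only stand in for real modules.

```python
import re
from types import SimpleNamespace

# One path segment: an attribute name followed by zero or more [int] indices.
_SEGMENT = re.compile(r"^([A-Za-z_]\w*)((?:\[\d+\])*)$")
_IO_SUFFIXES = ("input", "output")


def resolve_component(root, path):
    """Walk `path` from `root` and return (module, "input" or "output")."""
    parts = path.split(".")
    # A trailing "input"/"output" names the side of the module; default to output.
    unit = parts.pop() if parts[-1] in _IO_SUFFIXES else "output"
    # Assumption: a leading "model" refers to the root object itself.
    if parts and parts[0] == "model":
        parts = parts[1:]
    node = root
    for part in parts:
        match = _SEGMENT.match(part)
        if match is None:
            raise ValueError(f"malformed segment: {part!r}")
        node = getattr(node, match.group(1))
        for index in re.findall(r"\[(\d+)\]", match.group(2)):
            node = node[int(index)]
    return node, unit


# Toy stand-in for a model with three blocks, each holding attn.c_proj.
blocks = [
    SimpleNamespace(attn=SimpleNamespace(c_proj=SimpleNamespace(name=f"c_proj_{i}")))
    for i in range(3)
]
toy = SimpleNamespace(h=blocks)

module, unit = resolve_component(toy, "model.h[2].attn.c_proj.output")
print(module.name, unit)
```

Because the walk only uses `getattr` and integer indexing, the same approach should carry over to `torch.nn.Module` trees, where `nn.ModuleList` supports integer indexing. A submodule that is itself named `input` or `output` at the end of the path would be ambiguous under this convention.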