TransformerLensOrg / TransformerLens

A library for mechanistic interpretability of GPT-style language models
https://transformerlensorg.github.io/TransformerLens/
MIT License

Refactor the utilities file into utilities folder #628

Open starship006 opened 4 weeks ago

starship006 commented 4 weeks ago

Description

Addresses #612

As of this comment, this PR only refactors utils; no extra tests have been added.

starship006 commented 4 weeks ago

Okay, I'm currently a bit confused about something. There is odd behavior around USE_DEFAULT_VALUE = None in the utilities: in some parts of the code it is treated as an optional boolean, and in other parts as an optional string.

Right now, we are facing this error:

transformer_lens/utilities/exploratory_utils.py:23: error: Cannot determine type of "USE_DEFAULT_VALUE" [has-type]

But whenever I add the annotation USE_DEFAULT_VALUE: Optional[bool] = None to it, type checks downstream fail, such as:

transformer_lens/HookedTransformer.py:293: error: Argument 3 to "get_attention_mask" has incompatible type "bool | None"; expected "bool"  [arg-type]
transformer_lens/HookedTransformer.py:366: error: Incompatible default for argument "padding_side" (default has type "bool | None", argument has type "Literal['left', 'right'] | None")
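
To make the conflict concrete, here is a minimal sketch of one sentinel being used as the default for both a boolean and a string-valued argument. The function names resolve_prepend_bos and resolve_padding_side are hypothetical and are not TransformerLens code; only USE_DEFAULT_VALUE, prepend_bos, and padding_side come from the thread. The code runs fine because Python ignores annotations at runtime, but no single annotation on the sentinel matches both parameter types.

```python
from typing import Literal, Optional

# Shared sentinel, left unannotated as in the original utilities file.
# Annotating it as Optional[bool] would conflict with the Literal-typed
# default below; annotating it as Optional[str] would conflict with the bool one.
USE_DEFAULT_VALUE = None


def resolve_prepend_bos(
    prepend_bos: Optional[bool] = USE_DEFAULT_VALUE,
    default: bool = True,
) -> bool:
    """Hypothetical helper: the sentinel stands in for an optional boolean."""
    if prepend_bos is USE_DEFAULT_VALUE:
        return default
    return prepend_bos


def resolve_padding_side(
    padding_side: Optional[Literal["left", "right"]] = USE_DEFAULT_VALUE,
    default: Literal["left", "right"] = "right",
) -> Literal["left", "right"]:
    """Hypothetical helper: the same sentinel stands in for an optional string."""
    if padding_side is USE_DEFAULT_VALUE:
        return default
    return padding_side


print(resolve_prepend_bos())          # falls back to the boolean default
print(resolve_padding_side("left"))   # an explicit value is passed through
```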

@bryce13950, what is the best way to proceed? Is there something obvious here I'm missing?

bryce13950 commented 3 weeks ago

Sorry for not getting back to you earlier. Honestly, I think the whole USE_DEFAULT_VALUE is not adding value to the code, and it mostly hurts readability. if prepend_bos is None: is a lot more readable than if prepend_bos is USE_DEFAULT_VALUE:. If you come across something like = USE_DEFAULT_VALUE, then I would just replace it with = None. I would like to remove it and replace it with None for all type hinting, but that is a bit beyond the scope of this task. We can pretty safely do it for booleans though.
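
As a rough sketch of what that replacement could look like for a boolean argument: the function to_tokens_sketch, its default_prepend_bos parameter, and the "<bos>" marker are hypothetical stand-ins for illustration, not the library's API. Only the prepend_bos name and the is None check come from the comment above.

```python
from typing import List, Optional


def to_tokens_sketch(
    text: str,
    prepend_bos: Optional[bool] = None,  # plain None instead of USE_DEFAULT_VALUE
    default_prepend_bos: bool = True,    # hypothetical stand-in for a config default
) -> List[str]:
    """Split text on whitespace, optionally prepending a BOS marker."""
    if prepend_bos is None:
        # No explicit choice from the caller: fall back to the default.
        prepend_bos = default_prepend_bos
    tokens = text.split()
    return ["<bos>"] + tokens if prepend_bos else tokens


print(to_tokens_sketch("hello world"))
print(to_tokens_sketch("hello world", prepend_bos=False))
```

With the default written as None and the parameter typed Optional[bool], the signature and the check agree with each other, and callers can still pass True or False explicitly.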