Closed by norabelrose 3 months ago
This PR is a minor refactor of the training code to support training SAEs on the outputs of arbitrary modules, not just the residual stream output.
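One common way to get at the output of an arbitrary module in PyTorch is a forward hook. The sketch below is only an illustration of that general technique, not the code in this PR: the helper name `capture_module_outputs` and the toy `nn.Sequential` model are hypothetical, and it assumes the modules of interest can be addressed by their dotted names via `get_submodule`.

```python
import torch
from torch import nn


def capture_module_outputs(
    model: nn.Module, module_names: list[str], inputs: torch.Tensor
) -> dict[str, torch.Tensor]:
    """Run one forward pass and return the output of each named submodule.

    Hypothetical helper for illustration; not part of the PR.
    """
    captured: dict[str, torch.Tensor] = {}
    handles = []

    def make_hook(name: str):
        def hook(module, args, output):
            # Some modules return tuples; keep only the first element then.
            if isinstance(output, tuple):
                output = output[0]
            captured[name] = output.detach()

        return hook

    # Attach one forward hook per requested module, looked up by dotted name.
    for name in module_names:
        submodule = model.get_submodule(name)
        handles.append(submodule.register_forward_hook(make_hook(name)))

    try:
        with torch.no_grad():
            model(inputs)
    finally:
        # Always detach the hooks so later forward passes are unaffected.
        for handle in handles:
            handle.remove()

    return captured


# Toy stand-in for a real model; submodules of nn.Sequential are named "0", "1", ...
toy = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))
acts = capture_module_outputs(toy, ["0", "1"], torch.randn(2, 8))
```

The captured tensors could then serve as SAE training inputs in place of residual stream activations; how the training code in this PR actually selects and hooks modules may differ.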
Other minor changes:
- `attn_implementation`: removed entirely, since HuggingFace seems to do the right thing by default
- `jaxtyping`