dtch1997 / tms-kit

Toy models of superposition

TMS

Codebase for quickly implementing experiments with toy models of superposition.

Quickstart

To set up the environment, follow the setup docs.

To reproduce figures from Anthropic's Toy Models of Superposition, see the sample script.
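For orientation, the model studied in Toy Models of Superposition can be sketched in a few lines of NumPy. This is not the tms-kit API: the function names, shapes, and hyperparameters below are hypothetical and chosen only for illustration. The sketch assumes the paper's setup of sparse features in [0, 1), a tied-weight reconstruction `ReLU(W^T W x + b)`, and an importance-weighted squared error.

```python
import numpy as np


def sample_features(rng, batch_size, n_features, sparsity):
    """Each feature is 0 with probability `sparsity`, else uniform in [0, 1)."""
    values = rng.uniform(0.0, 1.0, size=(batch_size, n_features))
    active = rng.uniform(0.0, 1.0, size=(batch_size, n_features)) >= sparsity
    return values * active


def forward(W, b, x):
    """Compress to the hidden space and reconstruct: ReLU(W^T W x + b)."""
    hidden = x @ W.T                        # (batch, n_hidden)
    return np.maximum(hidden @ W + b, 0.0)  # (batch, n_features)


def weighted_mse(x, x_hat, importance):
    """Importance-weighted squared error, summed over features, averaged over the batch."""
    return float(np.mean(np.sum(importance * (x - x_hat) ** 2, axis=1)))


# Illustrative (assumed) sizes: 5 features squeezed into 2 hidden dimensions.
rng = np.random.default_rng(0)
n_features, n_hidden = 5, 2
W = rng.normal(scale=0.1, size=(n_hidden, n_features))
b = np.zeros(n_features)
importance = 0.9 ** np.arange(n_features)  # assumed decay; earlier features matter more

x = sample_features(rng, batch_size=1024, n_features=n_features, sparsity=0.9)
loss = weighted_mse(x, forward(W, b, x), importance)
print(f"untrained loss: {loss:.4f}")
```

Training `W` and `b` to minimize this loss at different sparsity levels is what gives rise to the superposition patterns shown in the paper's figures; the sketch above only evaluates an untrained model.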

Acknowledgements

This codebase is heavily adapted from the ARENA 3.0 codebase, designed and maintained by Callum McDougall. Many thanks to Callum and the ARENA team!