whitebox-research / excursions

How do you train a sleeper agent model? #3

Closed by MostDeadDeveloper 1 month ago

MostDeadDeveloper commented 2 months ago

https://github.com/whitebox-research/sae-alignment-interp/issues/14

MostDeadDeveloper commented 1 month ago

Solved by this issue and PR: https://github.com/whitebox-research/sae-alignment-interp/issues/16