FailSpy / abliterator

Simple Python library/structure to ablate features in LLMs which are supported by TransformerLens
MIT License
215 stars, 21 forks

Automation of orthogonalization #2

Open FailSpy opened 1 month ago

FailSpy commented 1 month ago

A lot of the measuring tools are already there; they just need to be applied in a clever way to automate the orthogonalization.
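As a rough illustration of what "applying the measuring tools" automatically could look like, here is a minimal NumPy sketch. It is not the library's API: `orthogonalize`, `pick_direction`, and `score_fn` are hypothetical names, the data is random toy data, and the score (mean absolute projection of some activations onto a candidate direction) is only a stand-in for whatever real measurement would be used. The idea is to score several candidate directions, keep the best one, and project it out of a weight matrix.

```python
import numpy as np


def orthogonalize(weight, direction):
    """Remove the component along `direction` from every row of `weight`.

    Assumes `weight` has shape (d_in, d_model) and `direction` has shape (d_model,).
    """
    unit = direction / np.linalg.norm(direction)
    return weight - np.outer(weight @ unit, unit)


def pick_direction(candidates, score_fn):
    """Score every candidate and return the one with the highest score."""
    scores = [score_fn(c) for c in candidates]
    best_index = int(np.argmax(scores))
    return candidates[best_index], scores


# Toy data standing in for real model weights and cached activations.
rng = np.random.default_rng(0)
d_model = 8
weight = rng.normal(size=(4, d_model))
acts = rng.normal(size=(16, d_model))
candidates = [rng.normal(size=d_model) for _ in range(3)]


def score_fn(direction):
    # Hypothetical measurement: how strongly the activations project onto
    # the candidate direction. A real metric would replace this.
    unit = direction / np.linalg.norm(direction)
    return float(np.abs(acts @ unit).mean())


best, scores = pick_direction(candidates, score_fn)
new_weight = orthogonalize(weight, best)
```

After this runs, `new_weight` writes nothing along the chosen direction, so the "automation" reduces to choosing a good `score_fn` and a good set of candidates.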