EleutherAI / concept-erasure

Erasing concepts from neural representations with provable guarantees
MIT License
209 stars 15 forks

Oracle LEACE implementation #2

Closed norabelrose closed 1 year ago

norabelrose commented 1 year ago

New least-squares concept erasure method for the common case when ground-truth concept labels are available at inference time. Implemented as `OracleFitter` and `OracleEraser`.
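For readers who want a feel for what least-squares erasure with known labels could look like, here is a minimal NumPy sketch. It is not the code from this PR: it assumes the method amounts to subtracting the ordinary least-squares prediction of the representation `X` from the centered labels `Z`, and the helper names `fit_oracle_eraser` and `erase` are hypothetical, not part of the `OracleFitter` / `OracleEraser` API.

```python
import numpy as np


def fit_oracle_eraser(X, Z):
    """Estimate the least-squares map from centered labels Z to X.

    Hypothetical helper for illustration only. Returns the (d, k)
    coefficient matrix Sigma_xz @ pinv(Sigma_zz) and the label mean.
    """
    n = X.shape[0]
    mu_z = Z.mean(axis=0)
    Xc = X - X.mean(axis=0)
    Zc = Z - mu_z
    sigma_xz = Xc.T @ Zc / n  # cross-covariance, shape (d, k)
    sigma_zz = Zc.T @ Zc / n  # label covariance, shape (k, k)
    coef = sigma_xz @ np.linalg.pinv(sigma_zz)
    return coef, mu_z


def erase(X, Z, coef, mu_z):
    """Subtract the part of X that is linearly predictable from Z."""
    return X - (Z - mu_z) @ coef.T


# Synthetic data: 8-dim representations that linearly encode a 2-dim concept.
rng = np.random.default_rng(0)
Z = rng.normal(size=(500, 2))
X = Z @ rng.normal(size=(2, 8)) + 0.1 * rng.normal(size=(500, 8))

coef, mu_z = fit_oracle_eraser(X, Z)
X_erased = erase(X, Z, coef, mu_z)

# Cross-covariance between the erased representations and the labels.
cross_cov = (X_erased - X_erased.mean(axis=0)).T @ (Z - Z.mean(axis=0)) / len(X)
```

Under these assumptions, `cross_cov` is zero up to floating-point error, so no linear predictor can recover `Z` from `X_erased` better than a constant. Unlike an eraser that only sees `X`, this variant needs the label `Z` for every input it is applied to, which is why it fits the case where labels are available at inference time.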