EleutherAI / concept-erasure

Erasing concepts from neural representations with provable guarantees
MIT License
209 stars, 15 forks

Applying this during decoding time #12

Open rahulseetharaman opened 8 months ago

rahulseetharaman commented 8 months ago

Hi, thanks for the repository and the paper. Is it possible to apply this to generation tasks in language models, and not just to classification? I am very interested in this aspect. Also, just to confirm: the scrubber is applied during inference and doesn't modify the model parameters, right? It only modifies the hidden representations?
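
As a concrete reading of the question, the toy sketch below shows what "modifying hidden representations without touching parameters" could look like. Everything in it is an assumption for illustration: the two-layer `forward` function, the `params` dictionary, and `concept_direction` are hypothetical, and the orthogonal projection is a simple stand-in, not this library's scrubber or its API.

```python
# Toy illustration only: a hypothetical two-layer linear "model" in plain Python.
# The projection is a simple stand-in for an erasure step, not the library's method.

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def project_out(h, direction):
    """Remove the component of h that lies along `direction`."""
    norm_sq = sum(d * d for d in direction)
    coeff = sum(hi * di for hi, di in zip(h, direction)) / norm_sq
    return [hi - coeff * di for hi, di in zip(h, direction)]

def forward(params, x, edit=None):
    """Run the model; optionally edit the hidden state between the two layers."""
    h = matvec(params["W1"], x)
    if edit is not None:
        h = edit(h)  # only the activation changes, never params
    return matvec(params["W2"], h)

params = {
    "W1": [[1.0, 0.0], [0.0, 1.0]],
    "W2": [[1.0, 1.0]],
}
concept_direction = [1.0, 0.0]  # hypothetical direction encoding the concept

x = [3.0, 2.0]
plain = forward(params, x)
edited = forward(params, x, edit=lambda h: project_out(h, concept_direction))
```

In a generation setting, the same kind of edit would presumably have to run at every decoding step, since each new token produces fresh hidden states. Whether the scrubber supports that is exactly what is being asked here.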