peterljq / Parsimonious-Concept-Engineering

Parsimonious Concept Engineering (PaCE) uses sparse coding on a large-scale concept dictionary to effectively improve the trustworthiness of Large Language Models by precisely controlling and modifying their neural activations.

experiment code #1

Closed: wjw136 closed this issue 2 months ago

wjw136 commented 3 months ago

Thank you for your inspiring work and for releasing your dataset. However, I couldn't find the code for the PaCE experiments. Could you please release the experimental code as well?

peterljq commented 3 months ago

Hi William,

Many thanks for your interest in our work! We have released the concept dataset in advance, and in recent weeks we have been actively structuring the code and scrutinizing the LLM's responses. The complete pipeline will be released soon, after ethical evaluation. If you have any questions in the meantime, please feel free to contact me by email or WeChat, whichever you prefer. Thanks again!

Regards,

Jinqi