andyzoujm / representation-engineering

Representation Engineering: A Top-Down Approach to AI Transparency
https://www.ai-transparency.org/
MIT License
716 stars, 86 forks

How to automate the [threshold] parameter in the Honest example? #24

Closed by poa010101 9 months ago

poa010101 commented 11 months ago

We realized that we have to manually adjust the [threshold] parameter in the Honest example to get acceptable results. Is there a plan to automate the adjustment of the [threshold] parameter? Tuning it by hand is neither user-friendly nor practical.

andyzoujm commented 11 months ago

Perhaps this could help: https://github.com/andyzoujm/representation-engineering/issues/22#issue-2004070656

poa010101 commented 11 months ago

Very helpful, thanks Andy