EleutherAI/sae-auto-interp
https://blog.eleuther.ai/autointerp/
Apache License 2.0
[Experiments] - Do human recall and rubric scoring
#10 (Closed)
SrGonao closed 3 months ago
SrGonao commented 4 months ago
[ ] Make an "interface" that shows us the same thing the model sees when it makes its decision.
[ ] Score (50?) artificial explanations
[ ] Score (50?) human explanations
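
A rough sketch of how the recall part of these scores might be computed once judgments are collected, whether they come from a human using the interface or from the model. Everything here is an assumption made for illustration: the `Judgment` record, its field names, and the `recall` helper are hypothetical and are not taken from the sae-auto-interp codebase.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Judgment:
    """One scored example (hypothetical structure, not from the repo)."""
    text: str        # the example shown to the scorer
    activates: bool  # ground truth: does the latent fire on this text?
    predicted: bool  # scorer's guess, given only the explanation


def recall(judgments: List[Judgment]) -> float:
    """Fraction of truly activating examples that the scorer flagged."""
    positives = [j for j in judgments if j.activates]
    if not positives:
        return 0.0  # no activating examples, so recall is undefined; report 0
    hits = sum(1 for j in positives if j.predicted)
    return hits / len(positives)


# Toy data: three activating examples (two caught) and one non-activating.
example_judgments = [
    Judgment("the cat sat on the mat", activates=True, predicted=True),
    Judgment("a dog barked loudly", activates=True, predicted=True),
    Judgment("kittens are playful", activates=True, predicted=False),
    Judgment("stock prices fell", activates=False, predicted=False),
]
example_recall = recall(example_judgments)
```

Running the same function over the human and the artificial explanation sets would give directly comparable numbers, provided both scorers saw the same examples.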