DLiquor closed this issue 10 months ago
Hello. The "generation accuracy" is measured on your specific truthfulness benchmark. I guess you are measuring it with your own benchmark, since TruthfulQA doesn't work for the Chinese language? The "internal accuracy" comes from linear-probing every attention head and taking the maximum validation-set accuracy (4:1 split).
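To make the probing step above concrete, here is a minimal sketch and not the code used in this repository. It assumes one logistic-regression probe per (layer, head), a 4:1 split read as train/validation, and activations that have already been extracted. All function names and the toy data are hypothetical; in practice `head_acts` would hold your model's per-head activations and `labels` the true/false labels from your benchmark.

```python
import math
import random


def train_probe(X, y, lr=0.5, epochs=100):
    """Fit a logistic-regression probe with plain batch gradient descent."""
    dim, n = len(X[0]), len(X)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, label in zip(X, y):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() stable
            err = 1.0 / (1.0 + math.exp(-z)) - label
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def probe_accuracy(w, b, X, y):
    """Fraction of samples whose predicted class matches the label."""
    correct = 0
    for x, label in zip(X, y):
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        correct += int((score > 0) == bool(label))
    return correct / len(X)


def internal_accuracy(head_acts, labels, train_frac=0.8, seed=0):
    """Probe every head; return the best validation accuracy and its head.

    head_acts maps (layer, head) to a list of per-sample activation vectors.
    train_frac=0.8 corresponds to a 4:1 train/validation split (assumption).
    """
    idx = list(range(len(labels)))
    random.Random(seed).shuffle(idx)
    cut = int(train_frac * len(idx))
    train, val = idx[:cut], idx[cut:]
    accs = {}
    for key, acts in head_acts.items():
        w, b = train_probe([acts[i] for i in train], [labels[i] for i in train])
        accs[key] = probe_accuracy(
            w, b, [acts[i] for i in val], [labels[i] for i in val]
        )
    best = max(accs, key=accs.get)
    return accs[best], best, accs


def generation_accuracy(judged_true):
    """Fraction of generated answers judged true (1) rather than false (0)."""
    return sum(judged_true) / len(judged_true)


def make_toy_activations(n_layers=3, n_heads=4, dim=4, n_samples=200,
                         informative=(1, 2), seed=0):
    """Synthetic stand-in for real activations: only one head carries signal."""
    rng = random.Random(seed)
    labels = [rng.randint(0, 1) for _ in range(n_samples)]
    head_acts = {}
    for layer in range(n_layers):
        for head in range(n_heads):
            acts = []
            for label in labels:
                vec = [rng.gauss(0.0, 1.0) for _ in range(dim)]
                if (layer, head) == informative:
                    vec[0] += 3.0 if label else -3.0
                acts.append(vec)
            head_acts[(layer, head)] = acts
    return head_acts, labels


if __name__ == "__main__":
    head_acts, labels = make_toy_activations()
    best_acc, best_head, _ = internal_accuracy(head_acts, labels)
    print("internal accuracy:", best_acc, "at (layer, head)", best_head)
```

For a 32-layer, 32-head model this would mean training 32*32 = 1024 separate probes, one per head, and reporting the best validation accuracy among them. The gap is then that number minus the generation accuracy measured on the same benchmark.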
Thanks for your reply!
Hi, thanks for your work! I have tried ITI with Baichuan on my Chinese dataset. However, it does not noticeably improve truthfulness. Thus, I want to know whether there is a gap in my scenario. You mentioned a gap of 40% between generation accuracy and probe accuracy. How are the two calculated? Does the generation accuracy mean the true% of the baseline? For the probe accuracy, is a separate binary classifier trained for each attention head, e.g. 32*32 for LLaMA? Hoping for your reply!