Closed duytai closed 3 months ago
Locality calculates whether the output for irrelevant questions changes before and after the edit, so the correctness of the answer is not important.
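To make this concrete, here is a minimal sketch of that idea, not the library's actual implementation. The function name `locality_score` and the token-level agreement measure are assumptions made for illustration. The point is that only the pre-edit and post-edit outputs are compared; the dataset's ground-truth answer never enters the computation.

```python
from typing import Sequence


def locality_score(
    pre_outputs: Sequence[Sequence[int]],
    post_outputs: Sequence[Sequence[int]],
) -> float:
    """Fraction of output tokens that are unchanged by the edit.

    Hypothetical helper: pre_outputs[i] and post_outputs[i] are the token ids
    the model produced for the i-th unrelated (locality) prompt before and
    after the edit. No ground-truth answer is involved.
    """
    if len(pre_outputs) != len(post_outputs):
        raise ValueError("need one pre-edit and one post-edit output per prompt")

    matches = 0
    total = 0
    for pre, post in zip(pre_outputs, post_outputs):
        if len(pre) != len(post):
            raise ValueError("outputs for the same prompt must have equal length")
        # Count positions where the edited model still emits the same token.
        matches += sum(p == q for p, q in zip(pre, post))
        total += len(pre)

    if total == 0:
        raise ValueError("no tokens to compare")
    return matches / total


# The edit left the output untouched: perfect locality.
print(locality_score([[11, 22, 33, 44]], [[11, 22, 33, 44]]))  # 1.0
# One of four tokens changed after the edit.
print(locality_score([[11, 22, 33, 44]], [[11, 22, 99, 44]]))  # 0.75
```

Under this reading, a model that answered the unrelated question incorrectly before the edit still scores 1.0 as long as it gives the same incorrect answer afterwards.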
Hi, do you have any further issues?
@XeeKee Thanks for your answer.
@zxlzr No, thank you so much.
Thank you for your efforts in developing an excellent tool. I understand how to compute Reliability and Generalization, but I am confused about how Locality is calculated.
The ground truth is provided with the ZsRE dataset. However, the expected answer should vary across different models (such as GPT-2 XL and LLaMA 2).
I have tried a record of ZsRE:
However, "Downhill" is not the correct ground-truth answer for GPT2-XL to the question "nq question: types of skiing in the winter olympics 2018".
Could you please clarify how Locality is computed and address my confusion regarding the ground truth?