Adamliu1 / SNLP_GCW

Check length of responses+whitespace ratio+repeating question for Retain and Unlearn sets #123

Closed Willmish closed 3 months ago

Willmish commented 3 months ago

Previously, we only checked response length and whitespace ratio on the test split of the unlearning set (BeaverTails). We should also run these checks on test sets for the retain set (SQuAD test set, TruthfulQA). Additionally, check whether the LLM is repeating the questions (e.g. substring match...)

Hypotheses:

  1. Responses are shorter for the unlearn set than for the retain set. (Using test set splits)
  2. The whitespace ratio of responses is higher for the unlearn set than for the retain set. (Using test set splits)
  3. The LLM repeats the question in its generation more often for the unlearn set than for the retain set. (Using test set splits)
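
A minimal sketch of how the three metrics could be computed per (question, response) pair and then averaged per set. The definitions here are assumptions, not necessarily what the repo uses: length is counted in characters, whitespace ratio is whitespace characters divided by total characters (0.0 for an empty response), and "repeating the question" is a case-insensitive, whitespace-normalized substring match. The names `response_stats` and `summarize` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ResponseStats:
    length: int              # response length in characters (assumed unit)
    whitespace_ratio: float  # whitespace chars / total chars
    repeats_question: bool   # question appears verbatim in the response


def _normalize(text: str) -> str:
    # Lowercase and collapse whitespace runs so formatting differences
    # do not hide a repeated question.
    return " ".join(text.lower().split())


def response_stats(question: str, response: str) -> ResponseStats:
    length = len(response)
    whitespace = sum(ch.isspace() for ch in response)
    # Assumption: an empty response gets a ratio of 0.0 rather than an error.
    ratio = whitespace / length if length else 0.0
    q = _normalize(question)
    repeats = bool(q) and q in _normalize(response)
    return ResponseStats(length, ratio, repeats)


def summarize(pairs):
    """Average the per-response metrics over an iterable of (question, response)."""
    stats = [response_stats(q, r) for q, r in pairs]
    n = len(stats)
    if n == 0:
        return {"mean_length": 0.0, "mean_whitespace_ratio": 0.0, "repeat_rate": 0.0}
    return {
        "mean_length": sum(s.length for s in stats) / n,
        "mean_whitespace_ratio": sum(s.whitespace_ratio for s in stats) / n,
        "repeat_rate": sum(s.repeats_question for s in stats) / n,
    }
```

Running `summarize` once on the unlearn test split and once on each retain test split would give the three numbers needed to compare against the hypotheses above. A plain substring match will miss paraphrased or partially repeated questions, so it is probably best treated as a lower bound on repetition.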