Open Yuda-Jin opened 2 months ago
Which dataset config was used for the leaderboard? Should I use forget10_perturbed, plain forget10, or retain90? If I use the forget10 dataset, how should I set perturbed_answer_key and eval_task?
More specifically, which config was used for the baselines on the leaderboard?