Open noowad93 opened 2 months ago
ccing @NathanHB @clefourrier for discretion over changing the official leaderboard IFEval task definition!
`google/ifeval` for our `ifeval` non-leaderboard task--perhaps that should be carried over to the leaderboard variant as well, regardless of whether that fixes this issue?
https://github.com/EleutherAI/lm-evaluation-harness/blob/8138fd52437dcd8c76ac87bdc9d684840e794c42/lm_eval/tasks/leaderboard/ifeval/instructions.py#L1384
The updated IFEval dataset (https://www.oxen.ai/wis-k/instruction-following-eval/file/main/instruction-following-eval_train.parquet) now includes letters like "!" and "#", for which the condition at the line linked above evaluates to true. As a result, the letter may be replaced with a random character.
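To make the behavior concrete, here is a minimal sketch of how such a check could end up discarding "!" or "#". It assumes the condition at the linked line treats anything that is not a single ASCII letter as invalid and falls back to a random letter; the names `is_invalid_letter` and `resolve_letter` are hypothetical and are not taken from the harness.

```python
import random
import string


def is_invalid_letter(letter):
    """Hypothetical helper: True when `letter` is not a single ASCII letter."""
    return (
        not letter
        or len(letter) > 1
        or ord(letter.lower()) < 97   # code point below "a"
        or ord(letter.lower()) > 122  # code point above "z"
    )


def resolve_letter(letter, rng=random):
    """Return the letter to count, or a random ASCII letter if it is invalid."""
    if is_invalid_letter(letter):
        # Assumed fallback: the requested character is silently replaced.
        return rng.choice(string.ascii_letters)
    return letter.strip()


for candidate in ["a", "Z", "!", "#"]:
    print(repr(candidate), "invalid:", is_invalid_letter(candidate))
```

Under these assumptions, a dataset row that asks for "!" would be scored against a randomly chosen letter instead, which would explain the mismatch described in this issue.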