aogara-ds opened this issue 2 years ago
I think you should leave fine-tuning as a last resort; try some other prompting methods first, maybe even shuffling the answer order.
Is it possible that it's just choosing the first "kill" option? It's clearly stated that killing is its objective, and option 4 is the first of four options that are all equivalent.
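One cheap way to test that hypothesis is to shuffle the answer order across several prompts and check whether the model keeps picking the same position rather than the same action. A minimal sketch, where `query_model` and the option texts are placeholders rather than anything from this repo:

```python
import random
from collections import Counter

def positional_bias_check(question, options, query_model, n_trials=20):
    """Shuffle the answer order across trials and count which option *text*
    (not which position) the model picks. If the same position keeps winning
    regardless of order, the preference is positional, not semantic."""
    picks = Counter()
    for _ in range(n_trials):
        order = list(range(len(options)))
        random.shuffle(order)
        prompt = question + "\n" + "\n".join(
            f"{i + 1}. {options[j]}" for i, j in enumerate(order)
        )
        # query_model is a stand-in for however the agent calls GPT-3;
        # assume it returns the 1-based index of the selected option.
        chosen = query_model(prompt)
        picks[options[order[chosen - 1]]] += 1
    return picks
```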
Pretrained language models are known to prefer certain kinds of answers for spurious reasons. Zhao et al. (2021) show that GPT-3 prefers certain answer choices in multiple-choice questions even when the input is content-free (e.g. "N/A"). Correcting this miscalibration can substantially improve multiple-choice QA performance.
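For reference, the contextual calibration from Zhao et al. (2021) is an inference-time correction, no fine-tuning required: measure the model's bias on a content-free input like "N/A", then rescale the answer probabilities by the inverse of that bias. A rough sketch, assuming we already have per-option probabilities from the model (the numbers below are made up):

```python
import numpy as np

def contextual_calibration(p_answer, p_content_free):
    """Contextual calibration in the spirit of Zhao et al. 2021: divide each
    answer probability by the probability the model assigns that answer on a
    content-free input (e.g. "N/A"), then renormalize. This cancels the
    model's prior bias toward particular answer positions or strings."""
    p_answer = np.asarray(p_answer, dtype=float)
    p_cf = np.asarray(p_content_free, dtype=float)
    calibrated = p_answer / p_cf          # equivalent to W = diag(p_cf)^-1, b = 0
    return calibrated / calibrated.sum()  # renormalize to a distribution

# Toy example: the raw model strongly prefers option 4, but so does the
# content-free baseline, so calibration flattens the spurious preference.
raw = [0.05, 0.05, 0.05, 0.60, 0.05, 0.10, 0.10]
cf  = [0.05, 0.05, 0.05, 0.55, 0.05, 0.10, 0.15]
print(contextual_calibration(raw, cf))
```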
Our GPT-3 action agent suffers from the same problem. When the killer is given seven answer choices, the last four of which involve killing someone, it chooses answer number 4 with unreasonably high probability.
To solve this, we could calibrate GPT-3's output probabilities with the technique proposed in Zhao et al. (2021), or we could just run RL training.