cpldcpu / MisguidedAttention

A collection of prompts to challenge the reasoning abilities of large language models in the presence of misguiding information
Creative Commons Zero v1.0 Universal

This is likely not misguided attention #2

Closed DrChristophFH closed 4 months ago

DrChristophFH commented 4 months ago

I tested the prompts a bit and I get much better accuracy by simply adding this prefix:

Solve exactly this riddle:

which suggests that the models don't struggle with actually answering the riddle. Rather, very likely due to instruction fine-tuning and human error in the training data (humans don't always cite problems correctly, especially when only partially remembering something and asking others for clarification, which presumably makes up most of the training data: the various Stack Exchanges, Reddit, etc.), they are tuned to recognize the more likely and common intent of a prompt instead of following the given prompt word for word.

To me this seems more likely to be a side effect of instruction fine-tuning and of adapting to human error in prompts (and, by extension, in the training data) than some deeper issue with the models' reasoning capabilities.

EDIT: At least this seems to be the case for some of the prompts. More interesting are prompts where the response does not reference any known problem, but is just a regular answer.

I found this to happen with the prompt "There is a man, a sheep and a boat with space for one human and one animal on one side of a river. How do the man and sheep get to the other side of the river in as few trips as possible?", which has a very obvious one-trip solution, yet GPT-4o fails to provide any correct solution zero-shot, instead giving answers with trip counts varying from 2 to 7.
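
For anyone who wants to reproduce the comparison, here is a minimal sketch of how the raw prompt could be tested against the prefixed version. It assumes the OpenAI Python SDK with an `OPENAI_API_KEY` in the environment; the `looks_correct` heuristic and the sample count of 5 are only illustrative, not how the original test was run:

```python
# Compare the raw riddle prompt against the same prompt with the
# "Solve exactly this riddle:" prefix, counting how often the answer
# mentions the intended one-trip solution (crude substring check).
from openai import OpenAI

client = OpenAI()

RIDDLE = (
    "There is a man, a sheep and a boat with space for one human and one "
    "animal on one side of a river. How do the man and sheep get to the "
    "other side of the river in as few trips as possible?"
)
PREFIX = "Solve exactly this riddle:\n\n"

def ask(prompt: str) -> str:
    """Send a single zero-shot prompt and return the model's reply."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

def looks_correct(answer: str) -> bool:
    # Illustrative heuristic: the intended solution is a single trip.
    text = answer.lower()
    return "one trip" in text or "single trip" in text

for label, prompt in [("raw", RIDDLE), ("prefixed", PREFIX + RIDDLE)]:
    hits = sum(looks_correct(ask(prompt)) for _ in range(5))
    print(f"{label}: {hits}/5 answers mention a one-trip solution")
```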

cpldcpu commented 4 months ago

You are probably right; it would make a lot of sense if this were caused by the instruction fine-tuning. The given problems are presented in a way very similar to training samples, so the LLM is strongly biased (maybe "overfitted") towards the answer of the sample. Adding additional text/instructions removes that bias.

Since all of these are well-known problems, they probably occur multiple times in the training data. The river problem in particular could be something that is given as a basic example of logical problem solving.

cpldcpu commented 4 months ago

There is a man, a sheep and a boat with space for one human and one animal on one side of a river. How do the man and sheep get to the other side of the river?

It even works without asking to optimize for the number of trips.

DrChristophFH commented 4 months ago

They took overfitting seriously 😂

DrChristophFH commented 4 months ago
A man and a thing on one side of the river and a boat with space for both. How can he cross the river with the thing?

This also works 💀 lul