gregorbachmann / Next-Token-Failures


Trying it on GPT4 #1

Open stefanmohl opened 4 months ago

stefanmohl commented 4 months ago

I didn't find a forum in this repo, so I put this in an issue instead; I hope that is OK.

Just for fun, I tried this on GPT4. I copied your example into a Graphviz node list and just fed it into the OpenAI web interface. As expected, going from a "tip" to the "middle" was easy, but surprisingly GPT4 immediately guessed the right path each time when going from the middle to a tip. When asked to reason step by step, it clearly gave a fabricated post-hoc explanation, e.g. "the node we were going to had a low number so I took the lowest number [exit from the middle]", the same explanation for a high-numbered exit from the middle, and when the tip node number didn't match the exit node number, it just said something along the lines of "best exit". Along with the youtuber "Tunadorable", we figured out that the issue was that my Graphviz node list was written in order: each "arm" consisted of an ordered list of node links, with each arm following the next, enabling a sort of "Clever Hans cheat" during inference.
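To make the cheat concrete, here is a rough sketch of what my input looked like and how shuffling removes the cue (the node numbers and arm lengths are made up for illustration, not my actual graph):

```python
import random

# Hypothetical star graph: middle node 0 with three arms of length 3
# (node numbers made up for illustration).
arms = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
middle = 0

# In-order edge list: each arm's edges appear consecutively and in path order,
# so the next node on any middle-to-tip path can be read off by simply
# scanning forward in the prompt -- the "Clever Hans cheat".
edges = []
for arm in arms:
    prev = middle
    for node in arm:
        edges.append(f"{prev} -- {node};")
        prev = node
print("\n".join(edges))

# Shuffling the lines (the equivalent of piping the file through `shuf`)
# destroys that ordering cue.
random.shuffle(edges)
print("\n".join(edges))
```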

I put the list of nodes through shuf and GPT4 still managed to solve the problem! This time it had to work harder (I forbade it from using tools): it executed a breadth-first search manually, starting from the middle node and working outwards. Clearly, next-token prediction is a hard limit, but by writing out the operations of the BFS, the context became temporary storage that let it keep perfect track of its progress through the search and carry out the procedure even while predicting only one token at a time.
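For concreteness, this is a minimal sketch of the procedure it wrote out, translated into Python (the function and the edge format are my own illustration; the model of course worked purely in text):

```python
from collections import deque

def bfs_path(edges, start, goal):
    """Breadth-first search over an undirected edge list, returning a start-to-goal path."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, []).append(b)
        adjacency.setdefault(b, []).append(a)

    parent = {start: None}        # visited set + back-pointers, i.e. the "temporary storage"
    frontier = deque([start])
    while frontier:
        node = frontier.popleft()
        if node == goal:
            # Reconstruct the path by walking the parent pointers back to the start.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for neighbour in adjacency.get(node, []):
            if neighbour not in parent:
                parent[neighbour] = node
                frontier.append(neighbour)
    return None

# e.g. bfs_path([(0, 1), (1, 2), (0, 3)], 0, 2) -> [0, 1, 2]
```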

An interesting question is: What would a human have done, confronted with a randomised list of nodes and no image of the graph?

gregorbachmann commented 4 months ago

Hi!

Thanks for reading the paper in so much detail and trying out the examples! Very interesting findings!

It's funny, I made the same mistake at first as well and forgot to randomize the adjacency list, which also allowed the model to figure out the problem easily!

It's interesting that GPT4 can solve this, and I guess it's kind of tough to tell why it is able to, since so little is known about its pre-training data as well as the post-training "refinements" (e.g. RLHF) and wrappers. I could for instance imagine that the model picks up useful planning biases along the way (especially during the refinement stage); it might even be explicitly trained to solve such planning tasks. Pretty cool finding that it implements a BFS :) In our case, the models were also not allowed to output extra tokens that could be used as storage; I guess GPT4 is more flexible in this way. Still, quite cool!

Regarding your last point, very interesting question. I personally would probably first follow a random path, check whether it leads to the goal, and backtrack if not (i.e. a DFS).
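A rough sketch of that strategy, just to contrast it with the BFS above (my own illustration, nothing from the paper):

```python
def dfs_path(adjacency, start, goal, path=None):
    """Depth-first search: follow one path outward and backtrack on dead ends."""
    if path is None:
        path = [start]
    if start == goal:
        return path
    for neighbour in adjacency.get(start, []):
        if neighbour not in path:  # don't walk back the way we came
            result = dfs_path(adjacency, neighbour, goal, path + [neighbour])
            if result is not None:
                return result
    return None  # dead end, backtrack

# e.g. dfs_path({0: [1, 3], 1: [0, 2], 2: [1], 3: [0]}, 0, 2) -> [0, 1, 2]
```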