Open HanClinto opened 7 months ago
Sorry for the late reply. We have discussed some of the emerging strategies on Twitter. Beyond that discussion, I just came across an interesting case where the attacker directly says "I know the word! It is (wrong word)", which can mislead the defender into treating the wrong word as the target... However, we have banned this trick and treat it as a rule-breaking case during the outcome judgment...
Your suggestion about more experimental analysis is quite constructive. We are still working on more experimental results and will consider your suggestions. Thank you so much!
Excellent paper, thank you so much for publishing this!
Reading through the examples at the end of the paper, it feels almost like the Taboo game can be meta-gamed a bit. I.e., if the defender never says anything other than "say more, please?" -- then eventually they will gather all they need. Thinking of the "panda" example here.
Did you notice that playthroughs of the game resulted in "cheap" strategies like this emerging, or did the opportunity for deceiving the defender (and getting them to guess wrong) eventually win out?
Feels like the next optimal play would be for the attacker to give extremely strong hints that are completely disconnected from the target word, in an effort to get the defender to consistently guess the wrong word.
Once that happens, the attacker is just outputting noise, and it feels like the optimal next step in emergent strategies would be for the defender to say as little as possible. Without a penalty for dragging the game out for a long time, it feels like eventually the games would stall out into the realm of "the only winning move is not to play".
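To make the incentive argument concrete, here is a toy sketch of how a per-turn penalty could change the defender's best move. Everything in it is an assumption for illustration: the +1/-1/0 payoffs, the `turn_penalty` parameter, and the function names are hypothetical and are not taken from the paper or its code.

```python
def shaped_reward(outcome: str, turns: int, turn_penalty: float = 0.0) -> float:
    """Hypothetical defender reward: +1 win, -1 loss, 0 tie, minus a cost per turn."""
    base = {"win": 1.0, "loss": -1.0, "tie": 0.0}[outcome]
    return base - turn_penalty * turns


def best_defender_move(p_correct: float, guess_turn: int,
                       max_turns: int, turn_penalty: float) -> str:
    """Compare guessing at `guess_turn` against stalling until the game ties."""
    # Expected reward of committing to a guess that is right with p_correct.
    guess = (p_correct * shaped_reward("win", guess_turn, turn_penalty)
             + (1.0 - p_correct) * shaped_reward("loss", guess_turn, turn_penalty))
    # Reward of saying as little as possible until the turn limit (assumed tie).
    stall = shaped_reward("tie", max_turns, turn_penalty)
    return "guess" if guess > stall else "stall"


# The attacker's hints are mostly noise, so the defender is right only 40% of the time.
print(best_defender_move(0.4, guess_turn=3, max_turns=10, turn_penalty=0.0))
print(best_defender_move(0.4, guess_turn=3, max_turns=10, turn_penalty=0.05))
```

Under these assumed payoffs, stalling beats an unreliable guess when turns are free, while even a small per-turn cost makes the early guess the better option, which is the kind of pressure that might keep games from stalling out.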
Did you happen to see any stages of strategies emerging like this? I would have loved to see some exposition given towards analyzing the progression of strategies used by the LLMs (along with win rate progressions for defender vs. attacker and relative lengths of conversations, etc), but if there was a section in the paper that talked about this, I missed it. Any chance you would release something about this in the future? Obviously not expecting anything quite as involved as the OpenAI Hide and Seek video, but any data about that adversarial evolution could be really fascinating.
That said, the results are REALLY promising, especially the ones on the reasoning benchmarks!
I wonder what other games could be implemented in this way...? Ideally there wouldn't be "cheap" strategies available, and I wonder if a co-op game would work for this -- perhaps something like Codenames: Duet?