In Large Language Models (LLMs), there have been consistent advancements in task-specific performance, largely influenced by effective prompt design. While recent research on prompting has enhanced the reasoning capabilities of LLMs, a gap remains in further improving their understanding abilities. In this study, we introduce metacognitive prompting (MP), a strategy inspired by human introspective reasoning processes. Using MP, LLMs undergo a systematic series of structured, self-aware evaluations, drawing on both their vast inherent knowledge and new insights. Our experiments involve five prevalent LLMs: Llama2, Vicuna, PaLM, GPT-3.5, and GPT-4, all of which span various general natural language understanding (NLU) tasks from the GLUE and SuperGLUE benchmarks. Results indicate that, although GPT-4 consistently excels in most tasks, PaLM, when equipped with MP, approaches its performance level. Furthermore, across models and datasets, MP consistently outperforms existing prompting methods, including standard and chain-of-thought prompting. This study underscores the potential to amplify the understanding abilities of LLMs and highlights the benefits of mirroring human introspective reasoning in NLU tasks.
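To make the idea of "structured, self-aware evaluations" concrete, the following is a minimal illustrative sketch in Python of how an NLU input might be wrapped in a metacognitive-style prompt before being sent to an LLM. The specific stage wording and the helper function name are assumptions for illustration only; the abstract does not spell out the exact steps used in the paper.

# Illustrative sketch of a metacognitive prompting (MP) style wrapper.
# The staged wording below is an assumption; the abstract only states that
# the model performs a structured series of self-aware evaluations.

def build_mp_prompt(task_description: str, task_input: str) -> str:
    """Wrap an NLU task input in a self-evaluative, staged prompt."""
    stages = [
        "1. Restate the input in your own words to confirm understanding.",
        "2. Form a preliminary judgment for the task.",
        "3. Critically evaluate that judgment against the input and your own knowledge.",
        "4. State your final answer and briefly explain the reasoning behind it.",
        "5. Report how confident you are in the final answer.",
    ]
    return (
        f"Task: {task_description}\n"
        f"Input: {task_input}\n\n"
        "Work through the following self-evaluation steps before answering:\n"
        + "\n".join(stages)
    )

if __name__ == "__main__":
    # Example: a sentence-pair entailment task in the style of GLUE/SuperGLUE.
    prompt = build_mp_prompt(
        task_description="Decide whether sentence B is entailed by sentence A.",
        task_input="A: The committee approved the budget. B: The budget was approved.",
    )
    print(prompt)  # This prompt would then be passed to an LLM API of choice.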