irthomasthomas / undecidability

awesome-llm-planning-reasoning/README.md at main · samkhur006/awesome-llm-planning-reasoning #942

Closed ShellLM closed 3 days ago

ShellLM commented 3 days ago

awesome-llm-planning-reasoning

About

Welcome to the Awesome LLMs Planning Reasoning repository! This collection is dedicated to exploring the rapidly evolving field of Large Language Models (LLMs) and their capabilities in planning and reasoning.

Overview

As LLMs continue to demonstrate remarkable success in Natural Language Understanding (NLU) and Natural Language Generation (NLG), researchers are increasingly interested in assessing their abilities beyond traditional NLP tasks. One of the most promising and challenging areas of study is understanding how well LLMs can perform tasks that require planning and reasoning. These capabilities are essential for leveraging LLMs in more complex, real-world scenarios, such as autonomous decision-making, problem-solving, and strategic thinking. However, recent research suggests that LLMs often struggle with reasoning tasks that are relatively simple for most humans, highlighting the limitations of these models in this critical area.

This repository is a curated list of research papers, code repositories, and benchmarks that focus on the intersection of LLMs with planning and reasoning tasks. The entries are grouped by topic, beginning with Techniques, Reasoning Limitations, and Benchmarks.

Whether you're a researcher, developer, or enthusiast, this repository serves as a comprehensive resource for staying updated on the latest advancements and understanding the current challenges in the domain of LLMs' planning and reasoning abilities. Dive in and explore the fascinating world where language models meet high-level cognitive tasks!

Techniques

| Paper | Link | Code | Venue | Date | Other |
| --- | --- | --- | --- | --- | --- |
| Chain-of-Thought Prompting Elicits Reasoning in Large Language Models | arXiv | -- | NeurIPS 22 | 28 Jan 2022 | Video |
| Self-Consistency Improves Chain of Thought Reasoning in Language Models | arXiv | -- | ICLR 23 | 7 Mar 2023 | Video |
| ReAct: Synergizing Reasoning and Acting in Language Models | arXiv | GitHub | ICLR 23 | 10 Mar 2023 | Project, Video |
| LLM-Planner: Few-Shot Grounded Planning for Embodied Agents with Large Language Models | arXiv | GitHub | ICCV 23 | 30 Mar 2023 | Project |
| Least-To-Most Prompting Enables Complex Reasoning In Large Language Models | arXiv | -- | ICLR 23 | 16 Apr 2023 | |
| Chain-of-Symbol Prompting Elicits Planning in Large Language Models | arXiv | GitHub | ICLR 24 | 17 May 2023 | |
| PlaSma: Procedural Knowledge Models for Language based Planning and Re-Planning | arXiv | GitHub | ICLR 24 | 26 Jul 2023 | |
| Better Zero-Shot Reasoning with Role-Play Prompting | arXiv | GitHub | NAACL 24 | 15 Aug 2023 | |
| LLM+P: Empowering Large Language Models with Optimal Planning Proficiency | arXiv | GitHub | arXiv | 27 Sep 2023 | |
| Reasoning with Language Model is Planning with World Model | arXiv | GitHub | EMNLP 23 | 23 Oct 2023 | |
| Large Language Models as Commonsense Knowledge for Large-Scale Task Planning | arXiv | GitHub | NeurIPS 23 | 30 Oct 2023 | Project |
| PromptAgent: Strategic Planning with Language Models Enables Expert-level Prompt Optimization | arXiv | GitHub | ICLR 24 | 7 Dec 2023 | |
| Tree of Thoughts: Deliberate Problem Solving with Large Language Models | arXiv | GitHub | NeurIPS 23 | 3 Dec 2023 | Video |
| Learning adaptive planning representations with natural language guidance | arXiv | -- | arXiv | 13 Dec 2023 | |
| The Truth is in There: Improving Reasoning in Language Models with Layer-Selective Rank Reduction | arXiv | GitHub | ICLR 24 | 21 Dec 2023 | |
| Large Language Models can Learn Rules | arXiv | -- | arXiv | 24 Apr 2024 | |
| What's the Plan? Evaluating and Developing Planning-Aware Techniques for Language Models | arXiv | -- | arXiv | 22 May 2024 | |
| Language Agent Tree Search Unifies Reasoning, Acting, and Planning in Language Models | arXiv | GitHub | arXiv | 6 Jun 2024 | |
| Large Language Models Can Learn Temporal Reasoning | arXiv | GitHub | ACL 24 | 11 Jun 2024 | |
| Flow of Reasoning: Efficient Training of LLM Policy with Divergent Thinking | arXiv | GitHub | arXiv | 24 Jun 2024 | |
| Tree Search for Language Model Agents | arXiv | GitHub | arXiv | 1 Jul 2024 | Project |
| Tree-Planner: Efficient Close-loop Task Planning with Large Language Models | arXiv | GitHub | ICLR 24 | 24 Jul 2024 | Project |
| RELIEF: Reinforcement Learning Empowered Graph Feature Prompt Tuning | arXiv | -- | arXiv | 6 Aug 2024 | |
| Automating Thought of Search: A Journey Towards Soundness and Completeness | arXiv | -- | arXiv | 21 Aug 2024 | |

Reasoning Limitations

| Paper | Link | Code | Venue | Date | Other |
| --- | --- | --- | --- | --- | --- |
| Understanding the Capabilities of Large Language Models for Automated Planning | arXiv | -- | arXiv | 25 May 2023 | |
| Are Large Language Models Really Good Logical Reasoners? A Comprehensive Evaluation and Beyond | arXiv | GitHub | arXiv | 8 Aug 2023 | |
| Evaluating Cognitive Maps and Planning in Large Language Models with CogEval | arXiv | GitHub | NeurIPS 23 | 2 Nov 2023 | |
| On the Planning Abilities of Large Language Models : A Critical Investigation | arXiv | GitHub | NeurIPS 23 | 6 Nov 2023 | |
| Large Language Models Cannot Self-Correct Reasoning Yet | arXiv | -- | ICLR 24 | 14 Mar 2024 | |
| Dissociating language and thought in large language models | arXiv | -- | Trends in Cognitive Sciences | 23 Mar 2024 | |
| Reasoning or Reciting? Exploring the Capabilities and Limitations of Language Models Through Counterfactual Tasks | arXiv | GitHub | NAACL 24 | 28 Mar 2024 | |
| Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations? | arXiv | -- | arXiv | 13 May 2024 | Video |
| On the Brittle Foundations of ReAct Prompting for Agentic Large Language Models | arXiv | -- | arXiv | 22 May 2024 | |
| Clever Hans or Neural Theory of Mind? Stress Testing Social Reasoning in Large Language Models | arXiv | GitHub | EACL 24 | 24 May 2023 | |
| Can Graph Learning Improve Task Planning? | arXiv | GitHub | arXiv | 29 May 2024 | |
| Graph-enhanced Large Language Models in Asynchronous Plan Reasoning | arXiv | GitHub | ICML 24 | 3 Jun 2024 | |
| When is Tree Search Useful for LLM Planning? It Depends on the Discriminator | arXiv | GitHub | ACL 24 | 6 Jun 2024 | |
| Chain of Thoughtlessness? An Analysis of CoT in Planning | arXiv | -- | arXiv | 6 Jun 2024 | |
| Can LLMs Learn from Previous Mistakes? Investigating LLMs' Errors to Boost for Reasoning | arXiv | GitHub | ACL 24 | 7 Jun 2024 | |
| Can Language Models Serve as Text-Based World Simulators? | arXiv | GitHub | ACL 24 | 10 Jun 2024 | |
| LLMs Can't Plan, But Can Help Planning in LLM-Modulo Frameworks | arXiv | -- | ICML 24 | 12 Jun 2024 | Video |
| Alice in Wonderland: Simple Tasks Showing Complete Reasoning Breakdown in State-Of-the-Art Large Language Models | arXiv | GitHub | arXiv | 13 Jul 2024 | |
| On the Self-Verification Limitations of Large Language Models on Reasoning and Planning Tasks | arXiv | -- | arXiv | 3 Aug 2024 | |
| Does Reasoning Emerge? Examining the Probabilities of Causation in Large Language Models | arXiv | -- | arXiv | 15 Aug 2024 | |

Benchmarks

| Paper | Link | Code | Venue | Date | Other |
| --- | --- | --- | --- | --- | --- |
| Benchmarks for Automated Commonsense Reasoning: A Survey | arXiv | -- | arXiv | 22 Feb 2023 | |
| BioPlanner: Automatic Evaluation of LLMs on Protocol Planning in Biology | arXiv | GitHub | EMN

Suggested labels

None