iggyray / llms-planning

A benchmark for evaluating large language models in planning

LLMs and Planning

This repo uses the Kambhampati group's plan-bench to evaluate LLMs on planning tasks. More information about plan-bench is available in their paper, "PlanBench: An Extensible Benchmark for Evaluating Large Language Models on Planning and Reasoning about Change".

Setup

  1. Install the plan-bench dependencies. More here.
  2. Clone and set up Fast Downward (the downward repository) in ./planner_tools
  3. Clone and set up VAL (the plan validator) in ./planner_tools
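
Once the steps above are done, a quick check can confirm that both tools ended up where expected. The following is a minimal sketch, not part of this repo: the helper name `missing_planner_tools` is hypothetical, and the directory names `downward` and `VAL` are assumptions based on what `git clone` creates by default for repositories with those names. It only checks that the directories exist, not that the tools were built correctly.

```python
from pathlib import Path

# Assumed directory names: what `git clone` creates by default for
# repositories named "downward" and "VAL". Adjust if you cloned them
# under different names.
EXPECTED_TOOLS = ("downward", "VAL")


def missing_planner_tools(root="./planner_tools", expected=EXPECTED_TOOLS):
    """Return the expected tool directories that are absent under `root`."""
    root_path = Path(root)
    return [name for name in expected if not (root_path / name).is_dir()]


if __name__ == "__main__":
    missing = missing_planner_tools()
    if missing:
        print("Missing under ./planner_tools:", ", ".join(missing))
    else:
        print("Both planner tool directories are in place.")
```

Running the script from the repo root lists any directory that still needs to be cloned.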