namin / llm-verified-with-monte-carlo-tree-search

LLM verified with Monte Carlo Tree Search
https://arxiv.org/abs/2402.08147
MIT License
209 stars 25 forks source link

Good Work! Any docs for code? #81

Open hsz0403 opened 1 week ago

hsz0403 commented 1 week ago

I see your codebase has some functions not mentioned in your paper, such as supporting Lean4 or DPO and PPO, do you have docs for Lean4 and all the scripts in the root file?

shenniger commented 1 week ago

Hi! Lean install instructions are in the Readme. In terms of usage, it shouldn't differ from the others. As for the other run_*.py scripts, the most important ones – including DPO/PPO – are documented here: https://github.com/namin/llm-verified-with-monte-carlo-tree-search?tab=readme-ov-file#execution

Happy to help or answer more questions!

hsz0403 commented 1 week ago

Thanks! I find an error that cmd doesn't output lean code when I run:python run_user.py --model_host openai --language Lean4 --problem_name problem_food

one of its output is:

"cmd"` : "import Mathlib\n\n\n\nThe provided Lean4 code is almost correct, but there are a few mistakes and missing pieces. Here is the corrected version:\n\n\n\n\nThis code does the following:\n\n1. It defines a datatype `Topping` with six constructors: `tomato`, `cheese`, `olive`, `broccoli`, `mushroom`, `pepper`.\n2. It defines a datatype `Food` with two constructors: `pasta` and `pizza`. Each of these constructors takes a list of `Topping`s as an argument.\n3. It defines a predicate `accepted` that takes a `Food` and returns a `Prop`. This predicate returns true if a `pasta` has two or fewer toppings, and a `pizza` has five or fewer toppings.\n4. It defines a lemma `accepted_pizza_of_more_than_three_toppings` that proves that if a food is `accepted` and has more than three toppings, it must be a pizza. This is done by case analysis on `f`. If `f` is a `pasta`, we get a contradiction because a `pasta` cannot have more than three toppings. If `f` is a `pizza`, we return the list of toppings, proving that `f` is a `pizza`.

and then it outputs:

{"messages":
 [{"severity": "error",
   "pos": {"line": 5, "column": 0},
   "endPos": {"line": 5, "column": 4},
   "data": "unexpected identifier; expected command"},
  {"severity": "error",
   "pos": {"line": 5, "column": 40},
   "endPos": {"line": 5, "column": 46},
   "data":
   "invalid 'import' command, it must be used in the beginning of the file"}],
 "env": 0}
SCORE
-1.0