namin / llm-verified-with-monte-carlo-tree-search

LLM verified with Monte Carlo Tree Search
https://arxiv.org/abs/2402.08147
MIT License
215 stars 26 forks

Add support for Lean #5

Closed johnjyang closed 8 months ago

johnjyang commented 8 months ago

Added support for Lean 4.

namin commented 8 months ago

Thanks so much for this initiative. This is exciting :)

I get the following error when trying this in Lean 4.2.0.

$ python run.py
/home/namin/mambaforge/envs/trl/lib/python3.10/site-packages/trl/trainer/ppo_config.py:141: UserWarning: The `optimize_cuda_cache` arguement will be deprecated soon, please use `optimize_device_cache` instead.
  warnings.warn(
/home/namin/mambaforge/envs/trl/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py:472: FutureWarning: The `use_auth_token` argument is deprecated and will be removed in v5 of Transformers. Please use `token` instead.
  warnings.warn(
Loading checkpoint shards: 100%|██████████████████| 7/7 [00:48<00:00,  6.86s/it]
/home/namin/mambaforge/envs/trl/lib/python3.10/site-packages/transformers/utils/hub.py:374: FutureWarning: The `use_auth_token` argument is deprecated and will be removed in v5 of Transformers. Please use `token` instead.
  warnings.warn(
You are using the default legacy behaviour of the <class 'transformers.models.llama.tokenization_llama.LlamaTokenizer'>. This is expected, and simply means that the `legacy` (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set `legacy=False`. This should only be set if you understand what it means, and thouroughly read the reason why this was added as explained in https://github.com/huggingface/transformers/pull/24565
TEXT
### Spec: In Lean4, write a factorial function and prove that the factorial is always strictly positive.
### Lean4 example: ```lean
open Nat (add_assoc add_comm)

theorem hello_world (a b c : Nat)
  : a + b + c = a + c + b := by
  rw [add_assoc, add_comm b, ←add_assoc]

theorem foo (a : Nat) : a + 1 = Nat.succ a := by rfl

### Important: Do not import any external packages. You may only use the Lean 4 standard library.
### Lean4: Factorial function

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
    - Avoid using `tokenizers` before the fork if possible
    - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
{ "cmd" : "open Nat (add_assoc add_comm)\n\ntheorem hello_world (a b c : Nat)\n  : a + b + c = a + c + b := by\n  rw [add_assoc, add_comm b, ←add_assoc]\n\ntheorem foo (a : Nat) : a + 1 = Nat.succ a := by rfl" }
Traceback (most recent call last):
  File "/home/namin/ai-tree-search/llm-verified-with-monte-carlo-tree-search/run.py", line 41, in <module>
    montecarlo.simulate(expansion_count)
  File "/home/namin/ai-tree-search/llm-verified-with-monte-carlo-tree-search/montecarlo/montecarlo.py", line 51, in simulate
    self.expand(current_node)
  File "/home/namin/ai-tree-search/llm-verified-with-monte-carlo-tree-search/montecarlo/montecarlo.py", line 54, in expand
    self.child_finder(node, self)
  File "/home/namin/ai-tree-search/llm-verified-with-monte-carlo-tree-search/run.py", line 26, in child_finder
    text = generate_complete(node.state, montecarlo)
  File "/home/namin/ai-tree-search/llm-verified-with-monte-carlo-tree-search/run.py", line 14, in generate_complete
    score = score_func(text)
  File "/home/namin/ai-tree-search/llm-verified-with-monte-carlo-tree-search/lean.py", line 47, in score_func
    score = calculateScore(sentence)
  File "/home/namin/ai-tree-search/llm-verified-with-monte-carlo-tree-search/lean.py", line 27, in calculateScore
    score, _ = calculateScoreHelper(msg)
  File "/home/namin/ai-tree-search/llm-verified-with-monte-carlo-tree-search/lean.py", line 35, in calculateScoreHelper
    r = checkLean(v)
  File "/home/namin/ai-tree-search/llm-verified-with-monte-carlo-tree-search/lean.py", line 61, in checkLean
    out = proofsearch.run_code(lean_code_block.strip(), verbose=True)
  File "/home/namin/ai-tree-search/llm-verified-with-monte-carlo-tree-search/pySagredo/proofsearch.py", line 36, in run_code
    index = self.proc.expect('env": \d+\}', timeout=20)
  File "/home/namin/mambaforge/envs/trl/lib/python3.10/site-packages/pexpect/spawnbase.py", line 343, in expect
    return self.expect_list(compiled_pattern_list,
  File "/home/namin/mambaforge/envs/trl/lib/python3.10/site-packages/pexpect/spawnbase.py", line 372, in expect_list
    return exp.expect_loop(timeout)
  File "/home/namin/mambaforge/envs/trl/lib/python3.10/site-packages/pexpect/expect.py", line 179, in expect_loop
    return self.eof(e)
  File "/home/namin/mambaforge/envs/trl/lib/python3.10/site-packages/pexpect/expect.py", line 122, in eof
    raise exc
pexpect.exceptions.EOF: End Of File (EOF). Exception style platform.
<pexpect.pty_spawn.spawn object at 0x7fc13d2e1720>
command: /home/namin/local/lean/bin/lake
args: [b'/home/namin/local/lean/bin/lake', b'env', b'lean', b'--run', b'REPL/Main.lean']
buffer (last 100 chars): ''
before (last 100 chars): "270:2: error: unknown constant 'sorryAx'\r\nREPL/Main.lean:269:47: error: unknown constant 'sorryAx'\r\n"
after: <class 'pexpect.exceptions.EOF'>
match: None
match_index: None
exitstatus: None
flag_eof: True
pid: 3646976
child_fd: 73
closed: False
timeout: 30
delimiter: <class 'pexpect.exceptions.EOF'>
logfile: None
logfile_read: None
logfile_send: None
maxread: 2000
ignorecase: False
searchwindowsize: None
delaybeforesend: 0.05
delayafterclose: 0.1
delayafterterminate: 0.1
searcher: searcher_re:
    0: re.compile('env": \\d+\\}')

Would you know what the problem is?

I had to add pexpect in requirements.txt.

Thanks, again :)

namin commented 8 months ago

Also, are you doing custom modifications to the libraries you've added? If not, would it make sense to have them as git submodules instead so that they are easier to keep maintained? If yes, it would be helpful to understand what the customizations are.

(For example, for the montecarlo library, I made this clear in the history.)

Thanks so much!

johnjyang commented 8 months ago

Thanks for checking the code so quickly!

I didn't commit the /build directory in repl that running lake build produces. repl is now added as a git submodule, and I updated the README with instructions for setting it up. Apologies for the oversight!

Could you try running lake build in /repl and see if it works now?

johnjyang commented 8 months ago

I just added /repl as a git submodule. For pySagredo, I kept only the proofsearch.py file, which is used to interact with /repl from Python.

Would you recommend clarifying that in the README?

Thank you!

namin commented 8 months ago

Thanks for the quick reaction.

It looks like the setup now works!

I don't think the submodule is set up correctly. You need to delete the repl directory and do:

git submodule add REPL-URL repl

Then, in the Setup section of the README, you can give instructions for that by adding:

git submodule init
git submodule update

This line in the README can be changed: "The repl library is adapted from leanprover-community/repl." to "The leanprover-community/repl library is used as a submodule."
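Taken together, the migration namin describes might look like the sketch below (REPL-URL stands in for the actual repository URL, as in the comment above; this is not verified against this repo's layout):

```shell
# Remove the vendored copy so the path is free for the submodule
git rm -r repl

# Note the argument order: repository URL first, then the local path
git submodule add REPL-URL repl
git commit -m "Track repl as a git submodule"

# Fresh clones would then initialize it with:
git submodule init
git submodule update
```

The key detail is that `git submodule add` takes the URL before the path, which is why the command above reverses the order given earlier in the thread.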

Finally, as I try it out, I am wondering about your customizations in prompts.py. They seem to make the initial code a complete solution all by itself for the factorial problem; see below. Do you want to add a proof marker?

CHOSEN SOLUTION
### Spec: In Lean4, write a factorial function and prove that the factorial is always strictly positive.
### Lean4 example: ```lean
open Nat (add_assoc add_comm)

theorem hello_world (a b c : Nat)
  : a + b + c = a + c + b := by
  rw [add_assoc, add_comm b, ←add_assoc]

theorem foo (a : Nat) : a + 1 = Nat.succ a := by rfl

### Important: Do not import any external packages. You may only use the Lean 4 standard library.
### Lean4: Factorial Function and Positivity Proof

Thanks, again.

johnjyang commented 8 months ago

Great! Thank you for the help. I can confirm that repl is now a submodule, and I also updated the README.

For the prompt customization, I was trying to include an example so that the model is more likely to write Lean 4 than Lean 3, which was an issue for GPT-4. I have removed the example for now; we can always add it back later if necessary.

namin commented 8 months ago

Cool, I'll check out your latest changes. A quick thought: do you want to add a proof marker to make sure the Lean 4 solution contains a proof?

johnjyang commented 8 months ago

I believe Lean 4 does not have a specialized marker for denoting the end of a proof. We could potentially edit the prompt to enforce adding a "done" on the last line (but I wouldn't trust the model to do it every time)? See this thread
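For reference, a minimal solution matching the factorial spec might look like the sketch below (assuming the core library provides lemmas named Nat.mul_pos and Nat.succ_pos; not verified against this repo's setup):

```lean
-- Sketch: factorial with a strict-positivity proof, standard library only
def factorial : Nat → Nat
  | 0 => 1
  | n + 1 => (n + 1) * factorial n

theorem factorial_pos (n : Nat) : 0 < factorial n := by
  induction n with
  | zero => decide  -- factorial 0 reduces to 1
  | succ n ih => exact Nat.mul_pos (Nat.succ_pos n) ih
```

A tactic proof like this also contains a by, which could double as the kind of presence marker discussed in this thread.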

namin commented 8 months ago

Maybe the marker can be by? It doesn't have to be at the end, just present.

I tried the system just now, and the open models are really struggling. It could be a Lean 3 vs Lean 4 thing. I guess we can tweak the prompts after merging.

Let me take a final look. Thanks.

namin commented 8 months ago

Thank you, again :)