lean-dojo / LeanDojo

Tool for data extraction and interacting with Lean programmatically.
https://leandojo.org
MIT License
532 stars 77 forks source link

Leandojo didn't trace newly created .lean file #91

Closed chenyang-an closed 9 months ago

chenyang-an commented 9 months ago

I'm using the trace function of Leandojo. The code is as follows:

from lean_dojo import LeanGitRepo, trace

repo = LeanGitRepo("https://github.com/yangky11/lean4-example", "7d711f6da4584ecb7d4f057715e1f72ba175c910")
trace(repo, dst_dir="traced_lean4-example")

This works just fine.

However, then I created a new .lean file in the directory of lean4-example and then uploaded everything to Github. Leandojo didn't trace the new .lean file for me. Locally on my own machine, every theorems in the newly created .lean can be compiled by the Lean4 compiler.

I'm not sure why is this problem happening. I do notice that in build/ir directory there is no .trace or olean file for my newly created .lean file.

Do you know why this is happening? Thanks!

yangky11 commented 9 months ago

Hi,

Did you push the new file to a GitHub repo? Can you point me to the repo?

chenyang-an commented 9 months ago

The GitHub link is https://github.com/chenyang-an/lean4_example_test with commit number 0454bbe5855cfbebf840db2868f8d10cab0d6a2e.

I directly use the Lean 4 toy example code and one more .lean file to it.

Here is the tracing code: from lean_dojo import LeanGitRepo, trace

repo = LeanGitRepo("https://github.com/chenyang-an/lean4_example_test","0454bbe5855cfbebf840db2868f8d10cab0d6a2e")
traced_repo = trace(repo)
data = []

print(traced_repo.num_traced_files)
print(traced_repo.get_traced_file('Lean4Example.lean').get_traced_theorems())

print(traced_repo.get_traced_file('Lean4Example_1.lean').get_traced_theorems())

Here is the error

python load_data.py
、2023-11-04 16:13:54.894 | INFO     | lean_dojo.data_extraction.trace:trace:174 - Loading the traced repo from /home/chenyang/.cache/lean_dojo/chenyang-an-lean4_example_test-0454bbe5855cfbebf840db2868f8d10cab0d6a2e/lean4_example_test
2023-11-04 16:13:57,076 INFO worker.py:1633 -- Started a local Ray instance. View the dashboard at 127.0.0.1:8265
 30%|████████████▏                            | 184/620 [00:08<00:13, 31.41it/s](raylet) [2023-11-04 16:14:06,983 E 1959119 1959134] (raylet) file_system_monitor.cc:111: /tmp/ray/session_2023-11-04_16-13-55_357538_1958868 is over 95% full, available space: 337906065408; capacity: 15228501098496. Object creation will fail if spilling is required.
 75%|██████████████████████████████▌          | 462/620 [00:16<00:02, 57.96it/s](raylet) [2023-11-04 16:14:16,988 E 1959119 1959134] (raylet) file_system_monitor.cc:111: /tmp/ray/session_2023-11-04_16-13-55_357538_1958868 is over 95% full, available space: 337906036736; capacity: 15228501098496. Object creation will fail if spilling is required.
100%|█████████████████████████████████████████| 620/620 [00:23<00:00, 25.95it/s]
620
[TracedTheorem(theorem=Theorem(repo=LeanGitRepo(url='https://github.com/chenyang-an/lean4_example_test', commit='0454bbe5855cfbebf840db2868f8d10cab0d6a2e'), file_path=PosixPath('Lean4Example.lean'), full_name='hello_world')), TracedTheorem(theorem=Theorem(repo=LeanGitRepo(url='https://github.com/chenyang-an/lean4_example_test', commit='0454bbe5855cfbebf840db2868f8d10cab0d6a2e'), file_path=PosixPath('Lean4Example.lean'), full_name='foo'))]
Traceback (most recent call last):
  File "/home/chenyang/leandojo_project/proplogic/load_data.py", line 11, in <module>
    print(traced_repo.get_traced_file('Lean4Example_1.lean').get_traced_theorems())
  File "/data/chenyang/anaconda3/envs/atpself/lib/python3.10/site-packages/lean_dojo/data_extraction/traced_data.py", line 1474, in get_traced_file
    return self.traced_files_graph.nodes[str(path)]["traced_file"]
  File "/data/chenyang/anaconda3/envs/atpself/lib/python3.10/site-packages/networkx/classes/reportviews.py", line 194, in __getitem__
    return self._nodes[n]
KeyError: 'Lean4Example_1.lean'

Thanks!

yangky11 commented 9 months ago

Although Lean4Example_1.lean is in the repo, it is not actually used by the project. For example, if you run lake build, you won't find it in the "./build" directory.

chenyang-an commented 9 months ago

I see. But why is the file Lean4Example_1.lean not able to be traced by leandojo? ./build is not uploaded so I guess it's not the problem. I ask because I have a very large amount of theorems, and if I put them all in one .lean file, the trace operation (the building process) would also fail. So I have to split them into different files. But these files cannot be traced, hence the question.

yangky11 commented 9 months ago

LeanDojo only traces files that are actually used

chenyang-an commented 9 months ago

I see. Thanks!