lean-dojo / LeanDojo

Tool for data extraction and interacting with Lean programmatically.
https://leandojo.org
MIT License
536 stars 80 forks source link

Unable to trace Lean 4 repos #51

Closed AG161 closed 1 year ago

AG161 commented 1 year ago

Description I've updated to version 1.2.3 of LeanDojo, and I am now not able to trace two Lean 4 repos which I was able to trace in earlier versions, and I get different errors for each. The first repo is the Lean 4 example from the docs, the second one is a fork of the first.

Logs in Debug Mode

(dj) agittis@Andreass-MBP Dojo % echo $VERBOSE
1
(dj) agittis@Andreass-MBP Dojo % python
Python 3.9.6 (default, May  7 2023, 23:32:44) 
[Clang 14.0.3 (clang-1403.0.22.14.1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from lean_dojo import LeanGitRepo, trace
2023-09-05 12:30:42.361 | DEBUG    | lean_dojo.constants:<module>:63 - Using GitHub without authentication. Don't be surprised if you hit the API rate limit.
>>> repo = LeanGitRepo("https://github.com/yangky11/lean4-example", "7d711f6da4584ecb7d4f057715e1f72ba175c910")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 5, in __init__
  File "/Users/agittis/Lean/Dojo/dj/lib/python3.9/site-packages/lean_dojo/data_extraction/lean.py", line 401, in __post_init__
    lean_version = get_lean4_commit_from_config(config)
  File "/Users/agittis/Lean/Dojo/dj/lib/python3.9/site-packages/lean_dojo/data_extraction/lean.py", line 285, in get_lean4_commit_from_config
    latest_tag = next(LEAN4_NIGHTLY_REPO.get_tags())
TypeError: 'PaginatedList' object is not an iterator
>>> repo = LeanGitRepo("https://github.com/AG161/lean4-example", "48825b3")
2023-09-05 12:31:42.015 | DEBUG    | lean_dojo.data_extraction.lean:__post_init__:364 - Querying the commit hash for lean4-example 48825b3
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 5, in __init__
  File "/Users/agittis/Lean/Dojo/dj/lib/python3.9/site-packages/lean_dojo/data_extraction/lean.py", line 365, in __post_init__
    commit = _to_commit_hash(self.repo, self.commit)
  File "/Users/agittis/Lean/Dojo/dj/lib/python3.9/site-packages/lean_dojo/data_extraction/lean.py", line 38, in _to_commit_hash
    raise ValueError(f"Invalid tag or branch: `{label}` for {repo}")
ValueError: Invalid tag or branch: `48825b3` for Repository(full_name="AG161/lean4-example")
>>> 

The second error indicates the branch of the repo is invalid, I double-checked and it does seem to be the correct commit. Also the same input resulted in a trace in a previous version of LeanDojo.

Platform Information

yangky11 commented 1 year ago

Hi,

Thanks for reporting them! The first one was a minor bug, and I just fixed it in 1.2.4. The second one is because we no longer support short commit hashes such as 48825b3. You can simply use its full commit hash.

AG161 commented 1 year ago

Thanks @yangky11. The first one worked, but the second one got stuck in the same place both times when I tried to run it. Here are the logs:

(dj) agittis@Andreass-MBP Dojo % echo $VERBOSE
1
(dj) agittis@Andreass-MBP Dojo % python
Python 3.9.6 (default, May  7 2023, 23:32:44) 
[Clang 14.0.3 (clang-1403.0.22.14.1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from lean_dojo import LeanGitRepo, trace
2023-09-06 16:07:39.175 | DEBUG    | lean_dojo.constants:<module>:70 - Using GitHub without authentication. Don't be surprised if you hit the API rate limit.
>>> repo2 = LeanGitRepo("https://github.com/AG161/lean4-example", "48825b33a1b8a61ad8da20852f53b621f76925d8")
>>> trace(repo2, dst_dir="traced_ast-example")
2023-09-06 16:08:22.701 | INFO     | lean_dojo.data_extraction.trace:get_traced_repo_path:131 - Tracing LeanGitRepo(url='https://github.com/AG161/lean4-example', commit='48825b33a1b8a61ad8da20852f53b621f76925d8')
2023-09-06 16:08:22.706 | DEBUG    | lean_dojo.data_extraction.trace:get_traced_repo_path:133 - Working in the temporary directory /var/folders/wt/sdk361b95ps7x042zc_95zv40000gn/T/tmp8bmt1m2z
2023-09-06 16:08:23.321 | DEBUG    | lean_dojo.data_extraction.lean:clone_and_checkout:441 - Cloning LeanGitRepo(url='https://github.com/AG161/lean4-example', commit='48825b33a1b8a61ad8da20852f53b621f76925d8')
2023-09-06 16:08:23.710 | DEBUG    | lean_dojo.data_extraction.trace:_trace_lean4:97 - Tracing LeanGitRepo(url='https://github.com/AG161/lean4-example', commit='48825b33a1b8a61ad8da20852f53b621f76925d8')
2023-09-06 16:08:23.710 | DEBUG    | lean_dojo.container:run:305 - docker run --cidfile c4w8o1ga.cid --rm -u 501 --mount type=bind,src="/private/var/folders/wt/sdk361b95ps7x042zc_95zv40000gn/T/tmp8bmt1m2z/lean4-example",target="/workspace/lean4-example" --mount type=bind,src="/Users/agittis/Lean/Dojo/dj/lib/python3.9/site-packages/lean_dojo/data_extraction/build_lean4_repo.py",target="/workspace/build_lean4_repo.py" --mount type=bind,src="/Users/agittis/Lean/Dojo/dj/lib/python3.9/site-packages/lean_dojo/data_extraction/ExtractData.lean",target="/workspace/lean4-example/ExtractData.lean" --env NUM_PROCS=8 --workdir /workspace yangky11/lean-dojo python3 build_lean4_repo.py lean4-example
2023-09-06 20:08:24.428 | INFO     | __main__:main:121 - Building lean4-example
info: syncing channel updates for 'nightly'
info: latest update on nightly, lean version nightly-2023-09-05
info: downloading component 'lean'
info: installing component 'lean'
warning: improperly formatted manifest: incompatible manifest version `4`
[0/4] Building Lean4Example
[1/4] Compiling Lean4Example
[1/4] Building Main
[2/4] Compiling Main
[4/4] Linking lean4-example
2023-09-06 20:09:48.092 | INFO     | __main__:main:154 - Tracing lean4-example
 86%|████████▌ | 620/719 [1:49:53<51:11, 31.03s/it] 

I'm not sure what the improperly formatted manifest refers to.

Also sorry for the delay, it takes a few hours each time I run it.

yangky11 commented 1 year ago

I tried on both Linux and Mac but wasn't able to reproduce this error.

My output on Linux:

In [1]: from lean_dojo import LeanGitRepo, trace
2023-09-06 15:59:04.661 | DEBUG    | lean_dojo.constants:<module>:66 - Using GitHub personal access token for authentication

In [2]:  repo2 = LeanGitRepo("https://github.com/AG161/lean4-example", "48825b33
   ...: a1b8a61ad8da20852f53b621f76925d8")

In [3]: trace(repo2, dst_dir="traced_ast-example")
2023-09-06 15:59:43.323 | INFO     | lean_dojo.data_extraction.trace:get_traced_repo_path:131 - Tracing LeanGitRepo(url='https://github.com/AG161/lean4-example', commit='48825b33a1b8a61ad8da20852f53b621f76925d8')
2023-09-06 15:59:43.324 | DEBUG    | lean_dojo.data_extraction.trace:get_traced_repo_path:133 - Working in the temporary directory /raid/kaiyu/tmp/tmpwx4v37yi
2023-09-06 15:59:43.819 | DEBUG    | lean_dojo.data_extraction.lean:clone_and_checkout:441 - Cloning LeanGitRepo(url='https://github.com/AG161/lean4-example', commit='48825b33a1b8a61ad8da20852f53b621f76925d8')
2023-09-06 15:59:44.235 | DEBUG    | lean_dojo.data_extraction.trace:_trace_lean4:97 - Tracing LeanGitRepo(url='https://github.com/AG161/lean4-example', commit='48825b33a1b8a61ad8da20852f53b621f76925d8')
2023-09-06 15:59:44.236 | DEBUG    | lean_dojo.container:run:305 - docker run --cidfile qw0e5edu.cid --rm -u 1013 --mount type=bind,src="/raid/kaiyu/tmp/tmpwx4v37yi/lean4-example",target="/workspace/lean4-example" --mount type=bind,src="/home/kaiyu/miniconda3/envs/lean-dojo/lib/python3.10/site-packages/lean_dojo/data_extraction/build_lean4_repo.py",target="/workspace/build_lean4_repo.py" --mount type=bind,src="/home/kaiyu/miniconda3/envs/lean-dojo/lib/python3.10/site-packages/lean_dojo/data_extraction/ExtractData.lean",target="/workspace/lean4-example/ExtractData.lean" --env NUM_PROCS=32 --workdir /workspace yangky11/lean-dojo python3 build_lean4_repo.py lean4-example
2023-09-06 22:59:44.685 | INFO     | __main__:main:121 - Building lean4-example
info: syncing channel updates for 'nightly'
info: latest update on nightly, lean version nightly-2023-09-05
info: downloading component 'lean'
info: installing component 'lean'
warning: improperly formatted manifest: incompatible manifest version `4`
[0/4] Building Lean4Example
[1/4] Compiling Lean4Example
[1/4] Building Main
[3/4] Compiling Main
[4/4] Linking lean4-example
2023-09-06 22:59:53.584 | INFO     | __main__:main:154 - Tracing lean4-example
 86%|████████▌ | 620/719 [03:05<02:10,  1.32s/it]2023-09-06 16:03:05.005 | DEBUG    | lean_dojo.data_extraction.traced_data:from_traced_files:1388 - Parsing 621 *.ast.json files in /raid/kaiyu/tmp/tmpwx4v37yi/lean4-example with 31 workers
2023-09-06 16:03:07,050 INFO worker.py:1612 -- Started a local Ray instance. View the dashboard at 127.0.0.1:8265 
  0%|                                                   | 0/621 [00:00<?, ?it/s](pid=3107909) 2023-09-06 16:03:09.319 | DEBUG    | lean_dojo.constants:<module>:66 - Using GitHub personal access token for authentication
100%|█████████████████████████████████████████| 621/621 [00:24<00:00, 25.63it/s]
(pid=3107926) 2023-09-06 16:03:09.456 | DEBUG    | lean_dojo.constants:<module>:66 - Using GitHub personal access token for authentication [repeated 30x across cluster] (Ray deduplicates logs by default. Set RAY_DEDUP_LOGS=0 to disable log deduplication, or see https://docs.ray.io/en/master/ray-observability/ray-logging.html#log-deduplication for more options.)
2023-09-06 16:03:35.196 | DEBUG    | lean_dojo.data_extraction.lean:_get_lean4_dependencies:522 - Querying the dependencies of LeanGitRepo(url='https://github.com/AG161/lean4-example', commit='48825b33a1b8a61ad8da20852f53b621f76925d8')
2023-09-06 16:03:37.410 | DEBUG    | lean_dojo.data_extraction.traced_data:save_to_disk:1435 - Saving 621 traced XML files to /raid/kaiyu/tmp/tmpwx4v37yi/lean4-example with 31 workers
2023-09-06 16:03:39,390 INFO worker.py:1612 -- Started a local Ray instance. View the dashboard at 127.0.0.1:8265 
  0%|                                                   | 0/621 [00:00<?, ?it/s](pid=3121748) 2023-09-06 16:03:41.399 | DEBUG    | lean_dojo.constants:<module>:66 - Using GitHub personal access token for authentication
100%|█████████████████████████████████████████| 621/621 [00:10<00:00, 56.48it/s]
(pid=3121758) 2023-09-06 16:03:41.616 | DEBUG    | lean_dojo.constants:<module>:66 - Using GitHub personal access token for authentication [repeated 30x across cluster]
2023-09-06 16:03:55.870 | INFO     | lean_dojo.data_extraction.trace:trace:163 - Loading the traced repo from /home/kaiyu/.cache/lean_dojo/AG161-lean4-example-48825b33a1b8a61ad8da20852f53b621f76925d8/lean4-example
2023-09-06 16:03:56.112 | DEBUG    | lean_dojo.data_extraction.traced_data:load_from_disk:1462 - Loading 621 traced XML files from /raid/kaiyu/.cache/lean_dojo/AG161-lean4-example-48825b33a1b8a61ad8da20852f53b621f76925d8/lean4-example with 31 workers
2023-09-06 16:03:58,100 INFO worker.py:1612 -- Started a local Ray instance. View the dashboard at 127.0.0.1:8265 
  0%|                                                   | 0/621 [00:00<?, ?it/s](pid=3133966) 2023-09-06 16:04:00.163 | DEBUG    | lean_dojo.constants:<module>:66 - Using GitHub personal access token for authentication
100%|█████████████████████████████████████████| 621/621 [00:34<00:00, 18.08it/s]
(pid=3133949) 2023-09-06 16:04:00.455 | DEBUG    | lean_dojo.constants:<module>:66 - Using GitHub personal access token for authentication [repeated 30x across cluster]
2023-09-06 16:04:35.748 | DEBUG    | lean_dojo.data_extraction.lean:_get_lean4_dependencies:522 - Querying the dependencies of LeanGitRepo(url='https://github.com/AG161/lean4-example', commit='48825b33a1b8a61ad8da20852f53b621f76925d8')
2023-09-06 16:04:38.835 | DEBUG    | lean_dojo.data_extraction.traced_data:check_sanity:1326 - Checking the sanity of TracedRepo(repo=LeanGitRepo(url='https://github.com/AG161/lean4-example', commit='48825b33a1b8a61ad8da20852f53b621f76925d8'), dependencies={'lean4': LeanGitRepo(url='https://github.com/leanprover/lean4', commit='84bf315ac89be67b13913fdb7a42c9e859a7638b')}, root_dir=PosixPath('/raid/kaiyu/.cache/lean_dojo/AG161-lean4-example-48825b33a1b8a61ad8da20852f53b621f76925d8/lean4-example'))
Out[3]: TracedRepo(repo=LeanGitRepo(url='https://github.com/AG161/lean4-example', commit='48825b33a1b8a61ad8da20852f53b621f76925d8'), dependencies={'lean4': LeanGitRepo(url='https://github.com/leanprover/lean4', commit='84bf315ac89be67b13913fdb7a42c9e859a7638b')}, root_dir=PosixPath('/raid/kaiyu/.cache/lean_dojo/AG161-lean4-example-48825b33a1b8a61ad8da20852f53b621f76925d8/lean4-example'))

My output on Mac:

In [5]: trace(repo2, dst_dir="traced_ast-example")
2023-09-06 20:24:53.306 | INFO     | lean_dojo.data_extraction.trace:get_traced_repo_path:131 - Tracing LeanGitRepo(url='https://github.com/AG161/lean4-example', commit='48825b33a1b8a61ad8da20852f53b621f76925d8')
2023-09-06 20:24:53.309 | DEBUG    | lean_dojo.data_extraction.trace:get_traced_repo_path:133 - Working in the temporary directory /var/folders/kx/4n7dwsnx2gj9sxmyn5h4c93m0000gn/T/tmpt9ol6acs
2023-09-06 20:24:53.864 | DEBUG    | lean_dojo.data_extraction.lean:clone_and_checkout:441 - Cloning LeanGitRepo(url='https://github.com/AG161/lean4-example', commit='48825b33a1b8a61ad8da20852f53b621f76925d8')
2023-09-06 20:24:54.919 | DEBUG    | lean_dojo.data_extraction.trace:_trace_lean4:97 - Tracing LeanGitRepo(url='https://github.com/AG161/lean4-example', commit='48825b33a1b8a61ad8da20852f53b621f76925d8')
2023-09-06 20:24:54.937 | DEBUG    | lean_dojo.container:run:183 - NUM_PROCS=8 python3 build_lean4_repo.py lean4-example
2023-09-06 20:24:55.809 | INFO     | __main__:main:121 - Building lean4-example
Building Lean4Example
Compiling Lean4Example
Building Main
Compiling Main
Linking lean4-example
2023-09-06 20:25:07.009 | INFO     | __main__:main:154 - Tracing lean4-example
 86%|███████████████████████████████████▎     | 619/719 [32:06<09:36,  5.76s/it]/Users/kaiyuy/miniconda3/envs/lean-dojo/lib/python3.9/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '
2023-09-06 20:57:30.604 | DEBUG    | lean_dojo.data_extraction.traced_data:from_traced_files:1388 - Parsing 620 *.ast.json files in /private/var/folders/kx/4n7dwsnx2gj9sxmyn5h4c93m0000gn/T/tmpt9ol6acs/lean4-example with 7 workers
2023-09-06 20:57:42,778 INFO worker.py:1612 -- Started a local Ray instance. View the dashboard at 127.0.0.1:8265 
  0%|                                                   | 0/620 [00:00<?, ?it/s](pid=66219) 2023-09-06 20:57:56.977 | DEBUG    | lean_dojo.constants:<module>:70 - Using GitHub without authentication. Don't be surprised if you hit the API rate limit.
100%|█████████████████████████████████████████| 620/620 [00:43<00:00, 14.11it/s]
(pid=66221) 2023-09-06 20:57:56.985 | DEBUG    | lean_dojo.constants:<module>:70 - Using GitHub without authentication. Don't be surprised if you hit the API rate limit. [repeated 6x across cluster] (Ray deduplicates logs by default. Set RAY_DEDUP_LOGS=0 to disable log deduplication, or see https://docs.ray.io/en/master/ray-observability/ray-logging.html#log-deduplication for more options.)
2023-09-06 20:58:40.853 | DEBUG    | lean_dojo.data_extraction.lean:_get_lean4_dependencies:522 - Querying the dependencies of LeanGitRepo(url='https://github.com/AG161/lean4-example', commit='48825b33a1b8a61ad8da20852f53b621f76925d8')
2023-09-06 20:58:44.251 | DEBUG    | lean_dojo.data_extraction.traced_data:save_to_disk:1435 - Saving 620 traced XML files to /private/var/folders/kx/4n7dwsnx2gj9sxmyn5h4c93m0000gn/T/tmpt9ol6acs/lean4-example with 7 workers
2023-09-06 20:58:47,473 INFO worker.py:1612 -- Started a local Ray instance. View the dashboard at 127.0.0.1:8265 
  0%|                                                   | 0/620 [00:00<?, ?it/s](pid=66289) 2023-09-06 20:58:51.308 | DEBUG    | lean_dojo.constants:<module>:70 - Using GitHub without authentication. Don't be surprised if you hit the API rate limit.
100%|█████████████████████████████████████████| 620/620 [00:16<00:00, 37.86it/s]
(pid=66290) 2023-09-06 20:58:51.489 | DEBUG    | lean_dojo.constants:<module>:70 - Using GitHub without authentication. Don't be surprised if you hit the API rate limit. [repeated 6x across cluster]
2023-09-06 20:59:15.899 | INFO     | lean_dojo.data_extraction.trace:trace:163 - Loading the traced repo from /Users/kaiyuy/.cache/lean_dojo/AG161-lean4-example-48825b33a1b8a61ad8da20852f53b621f76925d8/lean4-example
2023-09-06 20:59:16.170 | DEBUG    | lean_dojo.data_extraction.traced_data:load_from_disk:1462 - Loading 620 traced XML files from /Users/kaiyuy/.cache/lean_dojo/AG161-lean4-example-48825b33a1b8a61ad8da20852f53b621f76925d8/lean4-example with 7 workers
2023-09-06 20:59:21,194 INFO worker.py:1612 -- Started a local Ray instance. View the dashboard at 127.0.0.1:8265 
  0%|                                                   | 0/620 [00:00<?, ?it/s](pid=66362) 2023-09-06 20:59:25.267 | DEBUG    | lean_dojo.constants:<module>:70 - Using GitHub without authentication. Don't be surprised if you hit the API rate limit.
100%|█████████████████████████████████████████| 620/620 [00:49<00:00, 12.63it/s]
(pid=66360) 2023-09-06 20:59:25.329 | DEBUG    | lean_dojo.constants:<module>:70 - Using GitHub without authentication. Don't be surprised if you hit the API rate limit. [repeated 6x across cluster]
2023-09-06 21:00:14.104 | DEBUG    | lean_dojo.data_extraction.lean:_get_lean4_dependencies:522 - Querying the dependencies of LeanGitRepo(url='https://github.com/AG161/lean4-example', commit='48825b33a1b8a61ad8da20852f53b621f76925d8')
2023-09-06 21:00:16.178 | DEBUG    | lean_dojo.data_extraction.traced_data:check_sanity:1326 - Checking the sanity of TracedRepo(repo=LeanGitRepo(url='https://github.com/AG161/lean4-example', commit='48825b33a1b8a61ad8da20852f53b621f76925d8'), dependencies={'lean4': LeanGitRepo(url='https://github.com/leanprover/lean4', commit='84bf315ac89be67b13913fdb7a42c9e859a7638b')}, root_dir=PosixPath('/Users/kaiyuy/.cache/lean_dojo/AG161-lean4-example-48825b33a1b8a61ad8da20852f53b621f76925d8/lean4-example'))
Out[5]: TracedRepo(repo=LeanGitRepo(url='https://github.com/AG161/lean4-example', commit='48825b33a1b8a61ad8da20852f53b621f76925d8'), dependencies={'lean4': LeanGitRepo(url='https://github.com/leanprover/lean4', commit='84bf315ac89be67b13913fdb7a42c9e859a7638b')}, root_dir=PosixPath('/Users/kaiyuy/.cache/lean_dojo/AG161-lean4-example-48825b33a1b8a61ad8da20852f53b621f76925d8/lean4-example'))

Note that 620/719 is not sticking in the middle but actually the last step. 719 is the total number of .lean files. 620 is the number of files processed. Not all .lean files have to be processed.

Are you able to locate the exact location (in the source code) where the program is stuck? E.g., if you use Ctrl+C to stop the program, does it print a KeyboardInterrupt with stack trace?

AG161 commented 1 year ago

Here are logs with the keyboard interrupt:

(dj) agittis@Andreass-MBP Dojo % python
Python 3.9.6 (default, May  7 2023, 23:32:44) 
[Clang 14.0.3 (clang-1403.0.22.14.1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from lean_dojo import LeanGitRepo, trace
2023-09-07 11:31:33.831 | DEBUG    | lean_dojo.constants:<module>:70 - Using GitHub without authentication. Don't be surprised if you hit the API rate limit.
>>> repo2 = LeanGitRepo("https://github.com/AG161/lean4-example", "48825b33a1b8a61ad8da20852f53b621f76925d8")
>>> trace(repo2, dst_dir="traced_ast-example")
2023-09-07 11:31:51.021 | INFO     | lean_dojo.data_extraction.trace:get_traced_repo_path:131 - Tracing LeanGitRepo(url='https://github.com/AG161/lean4-example', commit='48825b33a1b8a61ad8da20852f53b621f76925d8')
2023-09-07 11:31:51.025 | DEBUG    | lean_dojo.data_extraction.trace:get_traced_repo_path:133 - Working in the temporary directory /var/folders/wt/sdk361b95ps7x042zc_95zv40000gn/T/tmpqax5uou3
2023-09-07 11:31:51.533 | DEBUG    | lean_dojo.data_extraction.lean:clone_and_checkout:441 - Cloning LeanGitRepo(url='https://github.com/AG161/lean4-example', commit='48825b33a1b8a61ad8da20852f53b621f76925d8')
2023-09-07 11:31:52.281 | DEBUG    | lean_dojo.data_extraction.trace:_trace_lean4:97 - Tracing LeanGitRepo(url='https://github.com/AG161/lean4-example', commit='48825b33a1b8a61ad8da20852f53b621f76925d8')
2023-09-07 11:31:52.282 | DEBUG    | lean_dojo.container:run:305 - docker run --cidfile fyxbcd80.cid --rm -u 501 --mount type=bind,src="/private/var/folders/wt/sdk361b95ps7x042zc_95zv40000gn/T/tmpqax5uou3/lean4-example",target="/workspace/lean4-example" --mount type=bind,src="/Users/agittis/Lean/Dojo/dj/lib/python3.9/site-packages/lean_dojo/data_extraction/build_lean4_repo.py",target="/workspace/build_lean4_repo.py" --mount type=bind,src="/Users/agittis/Lean/Dojo/dj/lib/python3.9/site-packages/lean_dojo/data_extraction/ExtractData.lean",target="/workspace/lean4-example/ExtractData.lean" --env NUM_PROCS=8 --workdir /workspace yangky11/lean-dojo python3 build_lean4_repo.py lean4-example
2023-09-07 15:31:53.021 | INFO     | __main__:main:121 - Building lean4-example
info: syncing channel updates for 'nightly'
info: latest update on nightly, lean version nightly-2023-09-07
info: downloading component 'lean'
info: installing component 'lean'
warning: improperly formatted manifest: incompatible manifest version `4`
[0/4] Building Lean4Example
[1/4] Compiling Lean4Example
[1/4] Building Main
[2/4] Compiling Main
[4/4] Linking lean4-example
2023-09-07 15:33:16.527 | INFO     | __main__:main:154 - Tracing lean4-example
 86%|████████▌ | 620/719 [1:41:12<46:53, 28.42s/it]^CTraceback (most recent call last):
  File "/usr/lib64/python3.11/subprocess.py", line 1205, in communicate
    stdout, stderr = self._communicate(input, endtime, timeout)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.11/subprocess.py", line 2057, in _communicate
    ready = selector.select(timeout)
            ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.11/selectors.py", line 415, in select
    fd_event_list = self._selector.poll(timeout)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
KeyboardInterrupt

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/workspace/build_lean4_repo.py", line 165, in <module>
    main()
  File "/workspace/build_lean4_repo.py", line 156, in main
    run_cmd(
  File "/workspace/build_lean4_repo.py", line 27, in run_cmd
    res = subprocess.run(cmd, shell=True, capture_output=capture_output, check=True)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.11/subprocess.py", line 548, in run
    stdout, stderr = process.communicate(input, timeout=timeout)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.11/subprocess.py", line 1216, in communicate
    self._wait(timeout=sigint_timeout)
  File "/usr/lib64/python3.11/subprocess.py", line 1989, in _wait
    time.sleep(delay)
KeyboardInterrupt
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/agittis/Lean/Dojo/dj/lib/python3.9/site-packages/lean_dojo/data_extraction/trace.py", line 162, in trace
    cached_path = get_traced_repo_path(repo)
  File "/Users/agittis/Lean/Dojo/dj/lib/python3.9/site-packages/lean_dojo/data_extraction/trace.py", line 134, in get_traced_repo_path
    _trace(repo)
  File "/Users/agittis/Lean/Dojo/dj/lib/python3.9/site-packages/lean_dojo/data_extraction/trace.py", line 66, in _trace
    _trace_lean4(repo)
  File "/Users/agittis/Lean/Dojo/dj/lib/python3.9/site-packages/lean_dojo/data_extraction/trace.py", line 104, in _trace_lean4
    container.run(
  File "/Users/agittis/Lean/Dojo/dj/lib/python3.9/site-packages/lean_dojo/container.py", line 315, in run
    execute(cmd, capture_output=capture_output)
  File "/Users/agittis/Lean/Dojo/dj/lib/python3.9/site-packages/lean_dojo/utils.py", line 112, in execute
    res = subprocess.run(cmd, shell=True, capture_output=capture_output, check=True)
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/subprocess.py", line 507, in run
    stdout, stderr = process.communicate(input, timeout=timeout)
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/subprocess.py", line 1126, in communicate
    self.wait()
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/subprocess.py", line 1189, in wait
    return self._wait(timeout=timeout)
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/subprocess.py", line 1917, in _wait
    (pid, sts) = self._try_wait(0)
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/subprocess.py", line 1875, in _try_wait
    (pid, sts) = os.waitpid(self.pid, wait_flags)
  File "/Users/agittis/Lean/Dojo/dj/lib/python3.9/site-packages/lean_dojo/container.py", line 310, in _exit_gracefully
    raise RuntimeError(f"Failed to execute {cmd}")
RuntimeError: Failed to execute docker run --cidfile fyxbcd80.cid --rm -u 501 --mount type=bind,src="/private/var/folders/wt/sdk361b95ps7x042zc_95zv40000gn/T/tmpqax5uou3/lean4-example",target="/workspace/lean4-example" --mount type=bind,src="/Users/agittis/Lean/Dojo/dj/lib/python3.9/site-packages/lean_dojo/data_extraction/build_lean4_repo.py",target="/workspace/build_lean4_repo.py" --mount type=bind,src="/Users/agittis/Lean/Dojo/dj/lib/python3.9/site-packages/lean_dojo/data_extraction/ExtractData.lean",target="/workspace/lean4-example/ExtractData.lean" --env NUM_PROCS=8 --workdir /workspace yangky11/lean-dojo python3 build_lean4_repo.py lean4-example
>>> 

I'm able to trace both repos on linux, and this doesn't seem to be the same as the original issue, so we can close this one if you like. I'm also happy to keep trying things. I think I'll probably switch to linux since it's faster anyway.

yangky11 commented 1 year ago

Sure, let's close this for now. Feel free to re-open the issue if it causes problem later.