lean-dojo / LeanDojo

Tool for data extraction and interacting with Lean programmatically.
https://leandojo.org
MIT License
552 stars 83 forks source link

"Missing $PATH/Main_dep.paths" during tracing of repo #171

Closed realharryhero closed 3 months ago

realharryhero commented 3 months ago

Description When I try to trace my git repo here, it works well until a warning appears stating a dependency path is missing:

2024-06-21 23:10:08.917 | WARNING | __main__:check_files:132 - Missing /tmp/tmp_pa6ep1e/workspace/LeanSmallTheoremRepo/.lake/build/ir/Main.dep_paths

and then later on, an error occurs stating that same file is not found.

Detailed Steps to Reproduce the Behavior

  1. cd ~
  2. mkdir leandojotest
  3. cd leandojotest
  4. python -m venv .
  5. source ./bin/activate
  6. pip install lean-dojo
  7. Set $DST_DIR to some value where the traced repo will be.
  8. Make a new file in ~/leandojotest named test.py with the following contents:
    from lean_dojo import LeanGitRepo, trace
    repo = LeanGitRepo("https://github.com/realharryhero/LeanSmallTheoremRepo", "05360508905a431e8358b26971334828a3a0767e")
    trace(repo, dst_dir='~/$DST_DIR')
  9. python test.py

Logs in Debug Mode

Logs

2024-06-21 23:01:34.726 | INFO | lean_dojo.data_extraction.trace:get_traced_repo_path:79 - Tracing LeanGitRepo(url='https://github.com/realharryhero/LeanSmallTheoremRepo', commit='05360508905a431e8358b26971334828a3a0767e') 2024-06-21 23:01:35.444 | INFO | __main__:main:165 - Building LeanSmallTheoremRepo info: std: cloning https://github.com/leanprover/std4 to './.lake/packages/std' info: Qq: cloning https://github.com/leanprover-community/quote4 to './.lake/packages/Qq' info: aesop: cloning https://github.com/leanprover-community/aesop to './.lake/packages/aesop' info: proofwidgets: cloning https://github.com/leanprover-community/ProofWidgets4 to './.lake/packages/proofwidgets' info: Cli: cloning https://github.com/leanprover/lean4-cli to './.lake/packages/Cli' info: importGraph: cloning https://github.com/leanprover-community/import-graph.git to './.lake/packages/importGraph' [0/3] Downloading proofwidgets cloud release [0/26] Building Std.Lean.Position [0/26] Building Std.CodeAction.Attr [0/28] Building Std.Lean.Meta.Basic [0/37] Building Std.Data.Nat.Basic [0/43] Building Std.Lean.NameMapAttribute [0/43] Building Std.Tactic.OpenPrivate [0/43] Building Std.Tactic.Lint.Basic [0/99] Building Std.Lean.Expr [0/99] Building Std.Util.LibraryNote [0/102] Building Std.Tactic.Relation.Rfl [0/102] Building Std.Data.List.Basic [1/3117] Compiling Std.Util.LibraryNote [2/3117] Compiling Std.Lean.NameMapAttribute [3/3117] Compiling Std.Lean.Position [4/3117] Compiling Std.Lean.Meta.Basic [5/3117] Building Std.Lean.Name [6/3117] Building Std.Classes.BEq [6/3117] Unpacking proofwidgets cloud release [7/3117] Building Std.Classes.Order [8/3117] Building Std.Classes.SatisfiesM [9/3117] Building Std.Control.ForInStep.Basic [10/3117] Building Std.Control.Lemmas [11/3117] Building Std.Data.MLList.Basic [12/3117] Building Std.Data.List.Init.Attach [13/3117] Building Std.Data.Fin.Basic [14/3117] Building Std.Tactic.SeqFocus [15/3117] Building Std.Util.ProofWanted [16/3117] Building Std.Data.Char [17/3117] Building Std.Data.DList [18/3117] Building Std.Data.LazyList [19/3117] Building Std.Data.Sum.Basic [20/3117] Building Std.Data.UInt [21/3117] Building Std.Lean.TagAttribute [22/3117] Building Std.Lean.Delaborator [23/3117] Building Std.Lean.Except [24/3117] Building Std.Lean.Float [25/3117] Building Std.Lean.HashMap [26/3117] Building Std.Lean.HashSet [27/3117] Building Std.Lean.IO.Process [28/3117] Building Std.Lean.Meta.Expr [29/3117] Building Std.Lean.PersistentHashMap [30/3117] Building Std.Lean.Meta.Inaccessible [31/3117] Building Std.Lean.MonadBacktrack [32/3117] Building Std.Lean.NameMap [33/3117] Building Std.Lean.PersistentHashSet [34/3117] Building Std.Lean.SMap [35/3117] Building Std.Lean.Syntax [36/3117] Building Std.Lean.Util.Path [37/3117] Building Std.Tactic.Unreachable [38/3117] Building Std.Tactic.Case [39/3117] Building Std.Tactic.Classical [40/3117] Building Std.Tactic.Congr [41/3117] Building Std.Tactic.Exact [42/3117] Building Std.Tactic.Instances [43/3117] Building Std.Tactic.NoMatch [44/3117] Building Std.Tactic.SqueezeScope [45/3117] Building Std.Tactic.Where [46/3117] Building Std.Test.Internal.DummyLabelAttr _(not part of log: the number 46 above keeps increasing incrementally...)_ [1065/3117] Building Aesop [1066/3117] Compiling Aesop [3113/3117] Building Main stdout: ./././Main.lean:160:7: warning: unused variable `h1` [linter.unusedVariables] ./././Main.lean:194:7: warning: unused variable `h1` [linter.unusedVariables] Try this: exact abs_abs_sub_abs_le_abs_sub x y ./././Main.lean:514:68: warning: unused variable `h₁` [linter.unusedVariables] ./././Main.lean:517:37: warning: unused variable `h₀` [linter.unusedVariables] ./././Main.lean:517:68: warning: unused variable `h₁` [linter.unusedVariables] [3116/3117] Compiling Main [3117/3117] Linking leanprojectcoolio2 2024-06-21 23:02:51.251 | INFO | __main__:main:188 - Tracing LeanSmallTheoremRepo 2024-06-21 23:02:51.264 | DEBUG | __main__:main:193 - lake env lean --threads 12 --run ExtractData.lean 97%|█████████████████████████████████████████████████████████████▉ | 1086/1122 [07:15<00:12, 2.96it/s]2024-06-21 23:10:08.917 | WARNING | __main__:check_files:132 - Missing /tmp/tmp_pa6ep1e/workspace/LeanSmallTheoremRepo/.lake/build/ir/Main.dep_paths 2024-06-21 23:10:11,883 INFO worker.py:1744 -- Started a local Ray instance. View the dashboard at 127.0.0.1:8265 16%|██████████▌ | 182/1116 [00:06<00:35, 26.16it/s] Traceback (most recent call last): File "/home/mfan/lean_docs_scrape_leandojo2/usingLeanDojo2.py", line 3, in trace(repo, dst_dir='/home/mfan/imsocool') File "/home/mfan/lean_docs_scrape_leandojo2/lib/python3.10/site-packages/lean_dojo/data_extraction/trace.py", line 115, in trace cached_path = get_traced_repo_path(repo, build_deps) File "/home/mfan/lean_docs_scrape_leandojo2/lib/python3.10/site-packages/lean_dojo/data_extraction/trace.py", line 83, in get_traced_repo_path traced_repo = TracedRepo.from_traced_files(tmp_dir / repo.name, build_deps) File "/home/mfan/lean_docs_scrape_leandojo2/lib/python3.10/site-packages/lean_dojo/data_extraction/traced_data.py", line 1108, in from_traced_files traced_files = list( File "/home/mfan/lean_docs_scrape_leandojo2/lib/python3.10/site-packages/tqdm/std.py", line 1181, in __iter__ for obj in iterable: File "/home/mfan/lean_docs_scrape_leandojo2/lib/python3.10/site-packages/ray/util/actor_pool.py", line 170, in get_generator yield self.get_next_unordered() File "/home/mfan/lean_docs_scrape_leandojo2/lib/python3.10/site-packages/ray/util/actor_pool.py", line 370, in get_next_unordered return ray.get(future) File "/home/mfan/lean_docs_scrape_leandojo2/lib/python3.10/site-packages/ray/_private/auto_init_hook.py", line 21, in auto_init_wrapper return fn(*args, **kwargs) File "/home/mfan/lean_docs_scrape_leandojo2/lib/python3.10/site-packages/ray/_private/client_mode_hook.py", line 103, in wrapper return func(*args, **kwargs) File "/home/mfan/lean_docs_scrape_leandojo2/lib/python3.10/site-packages/ray/_private/worker.py", line 2613, in get values, debugger_breakpoint = worker.get_objects(object_refs, timeout=timeout) File "/home/mfan/lean_docs_scrape_leandojo2/lib/python3.10/site-packages/ray/_private/worker.py", line 861, in get_objects raise value.as_instanceof_cause() ray.exceptions.RayTaskError(FileNotFoundError): ray::_TracedRepoHelper.parse_traced_file() (pid=3585324, ip=192.168.1.40, actor_id=ff29417d681fa20ef2187a2701000000, repr=) File "/home/mfan/lean_docs_scrape_leandojo2/lib/python3.10/site-packages/lean_dojo/data_extraction/traced_data.py", line 974, in parse_traced_file return TracedFile.from_traced_file(self.root_dir, path, self.repo) File "/home/mfan/lean_docs_scrape_leandojo2/lib/python3.10/site-packages/lean_dojo/data_extraction/traced_data.py", line 523, in from_traced_file return cls._from_lean4_traced_file(root_dir, json_path, repo) File "/home/mfan/lean_docs_scrape_leandojo2/lib/python3.10/site-packages/lean_dojo/data_extraction/traced_data.py", line 536, in _from_lean4_traced_file json_path.with_suffix("").with_suffix("").with_suffix(".dep_paths").open() File "/usr/lib/python3.10/pathlib.py", line 1119, in open return self._accessor.open(self, mode, buffering, encoding, errors, FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmp_pa6ep1e/LeanSmallTheoremRepo/.lake/build/ir/Main.dep_paths'

Platform Information

Thanks for the LEAN tool!

realharryhero commented 3 months ago

Closing for now as actually I believe one of the dependencies are in my local directory; that is, I used the dependency mathlib4 from my local directory ~/mathlib4, which was the v4.7.0 version of mathlib. Most likely the dep_path tool only (as it maybe should) checks git repositories.

I think I can solve this problem by either creating a new v4.7.0 version of mathlib on github or manually adding ~/mathlib4 to dep_path for some .lean files. Thank you!