lean-dojo / LeanDojo

Tool for data extraction and interacting with Lean programmatically.
https://leandojo.org
MIT License
536 stars 80 forks source link

Error Loading `lean_dojo` due to Git Version Parsing Issue[Windows] #186

Closed RexWzh closed 1 month ago

RexWzh commented 1 month ago

Description

Failed to load lean_dojo in window, due to the regex rule for git string.

Detailed Steps to Reproduce the Behavior

import lean_dojo

Logs in Debug Mode Set the environment variable VERBOSE=1 and paste the logs here.

(dojo) C:\Users\cubenlp>ver

Microsoft Windows [版本 10.0.19045.4474]

(dojo) C:\Users\cubenlp>python -V
Python 3.10.14

(dojo) C:\Users\cubenlp>python
Python 3.10.14 | packaged by Anaconda, Inc. | (main, May  6 2024, 19:44:50) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import lean_dojo
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\cubenlp\.conda\envs\dojo\lib\site-packages\lean_dojo\__init__.py", line 3, in <module>
    from .data_extraction.trace import (
  File "C:\Users\cubenlp\.conda\envs\dojo\lib\site-packages\lean_dojo\data_extraction\trace.py", line 19, in <module>
    from .cache import cache
  File "C:\Users\cubenlp\.conda\envs\dojo\lib\site-packages\lean_dojo\data_extraction\cache.py", line 13, in <module>
    from ..utils import (
  File "C:\Users\cubenlp\.conda\envs\dojo\lib\site-packages\lean_dojo\utils.py", line 20, in <module>
    from .constants import NUM_WORKERS, TMP_DIR, LEAN4_PACKAGES_DIR, LEAN4_BUILD_DIR
  File "C:\Users\cubenlp\.conda\envs\dojo\lib\site-packages\lean_dojo\constants.py", line 90, in <module>
    check_git_version((2, 25, 0))
  File "C:\Users\cubenlp\.conda\envs\dojo\lib\site-packages\lean_dojo\constants.py", line 81, in check_git_version
    version = tuple(int(_) for _ in m["version"].split("."))
  File "C:\Users\cubenlp\.conda\envs\dojo\lib\site-packages\lean_dojo\constants.py", line 81, in <genexpr>
    version = tuple(int(_) for _ in m["version"].split("."))
ValueError: invalid literal for int() with base 10: ''

Platform Information


The regex in check_git_version

m = re.match(r"git version (?P<version>[0-9.]+)", output)
version = tuple(int(_) for _ in m["version"].split("."))

Fix version:

m = re.match(r"git version (?P<version>[0-9.]+)", output)
version = tuple(int(_) for _ in m["version"].split("."))

Not sure if the project is planning to support Windows systems. There seems to be other issues, such as the use of os.geteuid() which only functions on Unix-like systems.

https://github.com/lean-dojo/LeanDojo/blob/2ff92a470550cb3d294448c18ee9c72f87392aef/src/lean_dojo/__init__.py#L32

yangky11 commented 1 month ago

Thanks for reporting this. Would you mind submit a PR?

It should be fairly easy to support Windows, though I have never tested it on Windows. The os.geteuid() is not no longer needed, and I just removed it from the main branch. It was there only for historically reasons.

RexWzh commented 1 month ago

188