lean-dojo / LeanDojo

Tool for data extraction and interacting with Lean programmatically.
https://leandojo.org
MIT License
536 stars 80 forks source link

Compatibility of Python versions #38

Closed samminweng closed 1 year ago

samminweng commented 1 year ago

Description What happened?

The Docker image is using Python 3.11, but LeanDojo requires Python <=3.10

Detailed Steps to Reproduce the Behavior I ran the Dockerfile in the docker folder, which installed Python 3.11. However, when I entered the terminal of the Docker container and used pip install lean-dojo, I received the following error messages:

[root@975563dee4b6 workspace]# pip install lean-dojo ERROR: Ignored the following versions that require a different python version: 1.0.0 Requires-Python <3.11,>=3.9; 1.1.0 Requires-Python <3.11,>=3.9; 1.1.1 Requires-Python <3.11,>=3.9; 1.1.2 Requires-Python <3.11,>=3.9; 1.2.0 Requires-Python <3.11,>=3.9 ERROR: Could not find a version that satisfies the requirement lean-dojo (from versions: none) ERROR: No matching distribution found for lean-dojo

Is there any workaround to solve this issue? Thanks.

Screenshot 2023-08-15 155344

Platform Information

yangky11 commented 1 year ago

Hi,

docker/Dockerfile is not supposed to be used in that way. To use LeanDojo with, you just have to make sure docker is installed and running. You don't need to build any docker container yourself.

LeanDojo requires Python<3.11 because it depends on Ray, which is experimental for Python 3.11: https://docs.ray.io/en/latest/ray-overview/installation.html. We could also switch to Python<=3.11 if that becomes an issue.

samminweng commented 1 year ago

Hi @yangky11,

Thanks for the explanation. I think the PyCharm project configuration is causing the problem. Would it be helpful to have simple instructions on how to set up LeanDojo with docker? Any suggestion about IDE?

Thanks.

Here is how I configured the project under PyCharmIDE (1) Use PyCharm IDE to clone LeanDojo project (2) Configure the project's interpreter to start the container from Dockerfile inside 'docker' folder' image

(3) Set the project interpret to use the python inside docker container.
image

(4) Create an example.py file image

yangky11 commented 1 year ago

I have never used PyCharm and thus am unable to help with that. LeanDojo doesn't need any manual setup with Docker. You can just open a terminal and try docker version. It's OK as long as it succeeds without errors such as Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?

samminweng commented 1 year ago

Thank you so much for the suggestion. After several tries, I think the problem may be caused by the Windows system preventing Python from accessing Docker.

One workaround would be to use Windows Subsystem for Linux (WSL) to run example.py. This works well, but it seems to encounter an issue when starting a local Ray instance.

(1) My platform is Window 11 (WSL). Then install python (3.9) and docker.

(2) Create a normal user account ('leandojo') and add this user to docker group how to. This avoids the [GitRepo issue] (https://github.com/lean-dojo/LeanDojo/issues/29)

Check if my docker is running by typing docker version

Client: Docker Engine - Community
 Version:           24.0.5
 API version:       1.43
 ....
 OS/Arch:           linux/amd64
 Context:           default

(2) Then I run python and copy the code of example.py from Getting started tutorial.

from lean_dojo import LeanGitRepo, trace

repo = LeanGitRepo("https://github.com/yangky11/lean-example", "5a0360e49946815cb53132638ccdd46fb1859e2a")
trace(repo, dst_dir="traced_lean-example")

Still encountered another issue

leandojo@DESKTOP-E3CL2TR:~$ python3.9
Python 3.9.17 (main, Jun  6 2023, 20:11:04)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from lean_dojo import LeanGitRepo, trace
po = LeanGitRepo("https://github.com/yangky11/lean-example", "5a0360e49946815cb53132638ccdd46fb1859e2a")
trace(repo, dst_dir="traced_lean-example")
>>>
>>> repo = LeanGitRepo("https://github.com/yangky11/lean-example", "5a0360e49946815cb53132638ccdd46fb1859e2a")
>>> trace(repo, dst_dir="traced_lean-example")
2023-08-16 13:36:07.425 | INFO     | lean_dojo.data_extraction.trace:trace:164 - Loading the traced repo from /home/leandojo/.cache/lean_dojo/yangky11-lean-example-5a0360e49946815cb53132638ccdd46fb1859e2a/lean-example
2023-08-16 13:36:09,735 ERROR services.py:1207 -- Failed to start the dashboard , return code 1
2023-08-16 13:36:09,735 ERROR services.py:1232 -- Error should be written to 'dashboard.log' or 'dashboard.err'. We are printing the last 20 lines for you. See 'https://docs.ray.io/en/master/ray-observability/ray-logging.html#logging-directory-structure' to find where the log file is.
2023-08-16 13:36:09,735 ERROR services.py:1276 --
The last 20 lines of /tmp/ray/session_2023-08-16_13-36-07_766613_863/logs/dashboard.log (it contains the error message from the dashboard):
  File "/usr/lib/python3.9/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 850, in exec_module
  File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
  File "/usr/local/lib/python3.9/dist-packages/ray/dashboard/modules/log/log_manager.py", line 8, in <module>
    from ray.util.state.common import (
  File "/usr/local/lib/python3.9/dist-packages/ray/util/state/__init__.py", line 1, in <module>
    from ray.util.state.api import (
  File "/usr/local/lib/python3.9/dist-packages/ray/util/state/api.py", line 17, in <module>
    from ray.util.state.common import (
  File "/usr/local/lib/python3.9/dist-packages/ray/util/state/common.py", line 120, in <module>
    @dataclass(init=True)
  File "/usr/local/lib/python3.9/dist-packages/pydantic/dataclasses.py", line 139, in dataclass
    assert init is False, 'pydantic.dataclasses.dataclass only supports init=False'
AssertionError: pydantic.dataclasses.dataclass only supports init=False
2023-08-16 13:36:09,860 INFO worker.py:1636 -- Started a local Ray instance.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.9/dist-packages/lean_dojo/data_extraction/trace.py", line 165, in trace
    traced_repo = TracedRepo.load_from_disk(cached_path)
  File "/usr/local/lib/python3.9/dist-packages/lean_dojo/data_extraction/traced_data.py", line 1282, in load_from_disk
    with ray_actor_pool(_TracedRepoHelper, root_dir, repo) as pool:
  File "/usr/lib/python3.9/contextlib.py", line 119, in __enter__
    return next(self.gen)
  File "/usr/local/lib/python3.9/dist-packages/lean_dojo/utils.py", line 74, in ray_actor_pool
    pool = ActorPool([actor_cls.remote(*args, **kwargs) for _ in range(NUM_WORKERS)])
  File "/usr/local/lib/python3.9/dist-packages/lean_dojo/utils.py", line 74, in <listcomp>
    pool = ActorPool([actor_cls.remote(*args, **kwargs) for _ in range(NUM_WORKERS)])
  File "/usr/local/lib/python3.9/dist-packages/ray/actor.py", line 535, in remote
    return self._remote(args=args, kwargs=kwargs, **self._default_options)
  File "/usr/local/lib/python3.9/dist-packages/ray/util/tracing/tracing_helper.py", line 385, in _invocation_actor_class_remote_span
    return method(self, args, kwargs, *_args, **_kwargs)
  File "/usr/local/lib/python3.9/dist-packages/ray/actor.py", line 970, in _remote
    actor_id = worker.core_worker.create_actor(
  File "python/ray/_raylet.pyx", line 2966, in ray._raylet.CoreWorker.create_actor
  File "python/ray/_raylet.pyx", line 2971, in ray._raylet.CoreWorker.create_actor
  File "python/ray/_raylet.pyx", line 671, in ray._raylet.prepare_args_and_increment_put_refs
  File "python/ray/_raylet.pyx", line 662, in ray._raylet.prepare_args_and_increment_put_refs
  File "python/ray/_raylet.pyx", line 708, in ray._raylet.prepare_args_internal
  File "/usr/local/lib/python3.9/dist-packages/ray/_private/worker.py", line 618, in get_serialization_context
    context_map[job_id] = serialization.SerializationContext(self)
  File "/usr/local/lib/python3.9/dist-packages/ray/_private/serialization.py", line 151, in __init__
    serialization_addons.apply(self)
  File "/usr/local/lib/python3.9/dist-packages/ray/util/serialization_addons.py", line 58, in apply
    register_pydantic_serializer(serialization_context)
  File "/usr/local/lib/python3.9/dist-packages/ray/util/serialization_addons.py", line 21, in register_pydantic_serializer
    pydantic.fields.ModelField,
AttributeError: module 'pydantic.fields' has no attribute 'ModelField'
>>> (raylet) [2023-08-16 13:36:10,938 E 983 1015] (raylet) agent_manager.cc:135: The raylet exited immediately because the Ray agent failed. The raylet fate shares with the agent. This can happen because the Ray agent was unexpectedly killed or failed. Agent can fail when
(raylet) - The version of `grpcio` doesn't follow Ray's requirement. Agent can segfault with the incorrect `grpcio` version. Check the grpcio version `pip freeze | grep grpcio`.
(raylet) - The agent failed to start because of unexpected error or port conflict. Read the log `cat /tmp/ray/session_latest/logs/dashboard_agent.log`. You can find the log file structure here https://docs.ray.io/en/master/ray-observability/ray-logging.html#logging-directory-structure.
(raylet) - The agent is killed by the OS (e.g., out of memory).

Thanks.

yangky11 commented 1 year ago

This could be related to a previous issue with ray and pydantic: https://github.com/ray-project/ray/issues/37019. I'm not sure if they have fixed the problem. You can try again after making sure both LeanDojo (1.2) and ray (2.6) are their lastest versions. If it doesn't work, maybe try the temporary fix in https://github.com/ray-project/ray/issues/37019.

samminweng commented 1 year ago

Hi @yangky11

The workaround pip install "pydantic<2" fixes the issue and tracked lean examples. I now have the folder traced_lean-example Thanks.