microsoft / OmniParser

A simple screen parsing tool towards pure vision based GUI agent
Creative Commons Attribution 4.0 International
4.88k stars, 370 forks

Doesn't work on python 3.12, 3.11; numpy<2; No module named 'distutils' #63

Open n-sviridenko opened 3 weeks ago

n-sviridenko commented 3 weeks ago

Hi,

With Python 3.12, pip fails to install all the dependencies:

(omni) ➜  OmniParser git:(master) pyenv local 3.12

(omni) ➜  OmniParser git:(master) ✗ pip install -r requirements.txt

Collecting torch (from -r requirements.txt (line 1))
  Downloading torch-2.2.2-cp312-none-macosx_10_9_x86_64.whl.metadata (25 kB)
Collecting easyocr (from -r requirements.txt (line 2))
  Using cached easyocr-1.7.2-py3-none-any.whl.metadata (10 kB)
Collecting torchvision (from -r requirements.txt (line 3))
  Downloading torchvision-0.17.2-cp312-cp312-macosx_10_13_x86_64.whl.metadata (6.6 kB)
Collecting supervision==0.18.0 (from -r requirements.txt (line 4))
  Downloading supervision-0.18.0-py3-none-any.whl.metadata (12 kB)
Collecting openai==1.3.5 (from -r requirements.txt (line 5))
  Downloading openai-1.3.5-py3-none-any.whl.metadata (16 kB)
Collecting transformers (from -r requirements.txt (line 6))
  Downloading transformers-4.46.1-py3-none-any.whl.metadata (44 kB)
Collecting ultralytics==8.1.24 (from -r requirements.txt (line 7))
  Downloading ultralytics-8.1.24-py3-none-any.whl.metadata (40 kB)
Collecting azure-identity (from -r requirements.txt (line 8))
  Downloading azure_identity-1.19.0-py3-none-any.whl.metadata (80 kB)
Collecting numpy (from -r requirements.txt (line 9))
  Downloading numpy-2.1.2-cp312-cp312-macosx_10_13_x86_64.whl.metadata (60 kB)
Collecting opencv-python (from -r requirements.txt (line 10))
  Using cached opencv-python-4.10.0.84.tar.gz (95.1 MB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done

[notice] A new release of pip is available: 24.2 -> 24.3.1
[notice] To update, run: pip install --upgrade pip
ERROR: Exception:
Traceback (most recent call last):
  File "/Users/nsviridenko/.pyenv/versions/3.12.5/lib/python3.12/site-packages/pip/_internal/cli/base_command.py", line 105, in _run_wrapper
    status = _inner_run()
             ^^^^^^^^^^^^
  File "/Users/nsviridenko/.pyenv/versions/3.12.5/lib/python3.12/site-packages/pip/_internal/cli/base_command.py", line 96, in _inner_run
    return self.run(options, args)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/nsviridenko/.pyenv/versions/3.12.5/lib/python3.12/site-packages/pip/_internal/cli/req_command.py", line 67, in wrapper
    return func(self, options, args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/nsviridenko/.pyenv/versions/3.12.5/lib/python3.12/site-packages/pip/_internal/commands/install.py", line 379, in run
    requirement_set = resolver.resolve(
                      ^^^^^^^^^^^^^^^^^
  File "/Users/nsviridenko/.pyenv/versions/3.12.5/lib/python3.12/site-packages/pip/_internal/resolution/resolvelib/resolver.py", line 95, in resolve
    result = self._result = resolver.resolve(
                            ^^^^^^^^^^^^^^^^^
  File "/Users/nsviridenko/.pyenv/versions/3.12.5/lib/python3.12/site-packages/pip/_vendor/resolvelib/resolvers.py", line 546, in resolve
    state = resolution.resolve(requirements, max_rounds=max_rounds)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/nsviridenko/.pyenv/versions/3.12.5/lib/python3.12/site-packages/pip/_vendor/resolvelib/resolvers.py", line 397, in resolve
    self._add_to_criteria(self.state.criteria, r, parent=None)
  File "/Users/nsviridenko/.pyenv/versions/3.12.5/lib/python3.12/site-packages/pip/_vendor/resolvelib/resolvers.py", line 173, in _add_to_criteria
    if not criterion.candidates:
           ^^^^^^^^^^^^^^^^^^^^
  File "/Users/nsviridenko/.pyenv/versions/3.12.5/lib/python3.12/site-packages/pip/_vendor/resolvelib/structs.py", line 156, in __bool__
    return bool(self._sequence)
           ^^^^^^^^^^^^^^^^^^^^
  File "/Users/nsviridenko/.pyenv/versions/3.12.5/lib/python3.12/site-packages/pip/_internal/resolution/resolvelib/found_candidates.py", line 174, in __bool__
    return any(self)
           ^^^^^^^^^
  File "/Users/nsviridenko/.pyenv/versions/3.12.5/lib/python3.12/site-packages/pip/_internal/resolution/resolvelib/found_candidates.py", line 162, in <genexpr>
    return (c for c in iterator if id(c) not in self._incompatible_ids)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/nsviridenko/.pyenv/versions/3.12.5/lib/python3.12/site-packages/pip/_internal/resolution/resolvelib/found_candidates.py", line 53, in _iter_built
    candidate = func()
                ^^^^^^
  File "/Users/nsviridenko/.pyenv/versions/3.12.5/lib/python3.12/site-packages/pip/_internal/resolution/resolvelib/factory.py", line 186, in _make_candidate_from_link
    base: Optional[BaseCandidate] = self._make_base_candidate_from_link(
                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/nsviridenko/.pyenv/versions/3.12.5/lib/python3.12/site-packages/pip/_internal/resolution/resolvelib/factory.py", line 232, in _make_base_candidate_from_link
    self._link_candidate_cache[link] = LinkCandidate(
                                       ^^^^^^^^^^^^^^
  File "/Users/nsviridenko/.pyenv/versions/3.12.5/lib/python3.12/site-packages/pip/_internal/resolution/resolvelib/candidates.py", line 303, in __init__
    super().__init__(
  File "/Users/nsviridenko/.pyenv/versions/3.12.5/lib/python3.12/site-packages/pip/_internal/resolution/resolvelib/candidates.py", line 158, in __init__
    self.dist = self._prepare()
                ^^^^^^^^^^^^^^^
  File "/Users/nsviridenko/.pyenv/versions/3.12.5/lib/python3.12/site-packages/pip/_internal/resolution/resolvelib/candidates.py", line 235, in _prepare
    dist = self._prepare_distribution()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/nsviridenko/.pyenv/versions/3.12.5/lib/python3.12/site-packages/pip/_internal/resolution/resolvelib/candidates.py", line 314, in _prepare_distribution
    return preparer.prepare_linked_requirement(self._ireq, parallel_builds=True)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/nsviridenko/.pyenv/versions/3.12.5/lib/python3.12/site-packages/pip/_internal/operations/prepare.py", line 527, in prepare_linked_requirement
    return self._prepare_linked_requirement(req, parallel_builds)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/nsviridenko/.pyenv/versions/3.12.5/lib/python3.12/site-packages/pip/_internal/operations/prepare.py", line 642, in _prepare_linked_requirement
    dist = _get_prepared_distribution(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/nsviridenko/.pyenv/versions/3.12.5/lib/python3.12/site-packages/pip/_internal/operations/prepare.py", line 72, in _get_prepared_distribution
    abstract_dist.prepare_distribution_metadata(
  File "/Users/nsviridenko/.pyenv/versions/3.12.5/lib/python3.12/site-packages/pip/_internal/distributions/sdist.py", line 56, in prepare_distribution_metadata
    self._install_build_reqs(finder)
  File "/Users/nsviridenko/.pyenv/versions/3.12.5/lib/python3.12/site-packages/pip/_internal/distributions/sdist.py", line 126, in _install_build_reqs
    build_reqs = self._get_build_requires_wheel()
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/nsviridenko/.pyenv/versions/3.12.5/lib/python3.12/site-packages/pip/_internal/distributions/sdist.py", line 103, in _get_build_requires_wheel
    return backend.get_requires_for_build_wheel()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/nsviridenko/.pyenv/versions/3.12.5/lib/python3.12/site-packages/pip/_internal/utils/misc.py", line 706, in get_requires_for_build_wheel
    return super().get_requires_for_build_wheel(config_settings=cs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/nsviridenko/.pyenv/versions/3.12.5/lib/python3.12/site-packages/pip/_vendor/pyproject_hooks/_impl.py", line 166, in get_requires_for_build_wheel
    return self._call_hook('get_requires_for_build_wheel', {
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/nsviridenko/.pyenv/versions/3.12.5/lib/python3.12/site-packages/pip/_vendor/pyproject_hooks/_impl.py", line 321, in _call_hook
    raise BackendUnavailable(data.get('traceback', ''))
pip._vendor.pyproject_hooks._impl.BackendUnavailable: Traceback (most recent call last):
  File "/Users/nsviridenko/.pyenv/versions/3.12.5/lib/python3.12/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 77, in _build_backend
    obj = import_module(mod_path)
          ^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/nsviridenko/.pyenv/versions/3.12.5/lib/python3.12/importlib/__init__.py", line 90, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen importlib._bootstrap>", line 1387, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1360, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1310, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 488, in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 1387, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1360, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1331, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 935, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 995, in exec_module
  File "<frozen importlib._bootstrap>", line 488, in _call_with_frames_removed
  File "/private/var/folders/m1/knz4fpzx1sj7sv1xvnwxzb_w0000gn/T/pip-build-env-s3jau2wr/overlay/lib/python3.12/site-packages/setuptools/__init__.py", line 10, in <module>
    import distutils.core
ModuleNotFoundError: No module named 'distutils'

With 3.11 it fails during tensors conversion (probably related to https://github.com/microsoft/OmniParser/issues/25):

(omni) ➜  OmniParser git:(master) ✗ pyenv local 3.11

(omni) ➜  OmniParser git:(master) ✗ python weights/convert_safetensor_to_pt.py

A module that was compiled using NumPy 1.x cannot be run in
NumPy 2.1.2 as it may crash. To support both 1.x and 2.x
versions of NumPy, modules must be compiled with NumPy 2.0.
Some module may need to rebuild instead e.g. with 'pybind11>=2.12'.

If you are a user of the module, the easiest solution will be to
downgrade to 'numpy<2' or try to upgrade the affected module.
We expect that some modules will need time to support NumPy 2.

Traceback (most recent call last):
  File "/Users/nsviridenko/ws/OmniParser/weights/convert_safetensor_to_pt.py", line 1, in <module>
    import torch
  File "/Users/nsviridenko/.pyenv/versions/3.11.5/lib/python3.11/site-packages/torch/__init__.py", line 1477, in <module>
    from .functional import *  # noqa: F403
  File "/Users/nsviridenko/.pyenv/versions/3.11.5/lib/python3.11/site-packages/torch/functional.py", line 9, in <module>
    import torch.nn.functional as F
  File "/Users/nsviridenko/.pyenv/versions/3.11.5/lib/python3.11/site-packages/torch/nn/__init__.py", line 1, in <module>
    from .modules import *  # noqa: F403
  File "/Users/nsviridenko/.pyenv/versions/3.11.5/lib/python3.11/site-packages/torch/nn/modules/__init__.py", line 35, in <module>
    from .transformer import TransformerEncoder, TransformerDecoder, \
  File "/Users/nsviridenko/.pyenv/versions/3.11.5/lib/python3.11/site-packages/torch/nn/modules/transformer.py", line 20, in <module>
    device: torch.device = torch.device(torch._C._get_default_device()),  # torch.device('cpu'),
/Users/nsviridenko/.pyenv/versions/3.11.5/lib/python3.11/site-packages/torch/nn/modules/transformer.py:20: UserWarning: Failed to initialize NumPy: _ARRAY_API not found (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/utils/tensor_numpy.cpp:84.)
  device: torch.device = torch.device(torch._C._get_default_device()),  # torch.device('cpu'),
Traceback (most recent call last):
  File "/Users/nsviridenko/ws/OmniParser/weights/convert_safetensor_to_pt.py", line 5, in <module>
    tensor_dict = load_file("weights/icon_detect/model.safetensors")
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/nsviridenko/.pyenv/versions/3.11.5/lib/python3.11/site-packages/safetensors/torch.py", line 313, in load_file
    with safe_open(filename, framework="pt", device=device) as f:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: No such file or directory: "weights/icon_detect/model.safetensors"
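
The NumPy message above suggests downgrading to 'numpy<2'. A minimal sketch of checking which NumPy major version is installed before running the conversion script; the helper names `major_version` and `numpy_needs_downgrade` are hypothetical and not part of OmniParser:

```python
from importlib.metadata import PackageNotFoundError, version


def major_version(version_string: str) -> int:
    """Return the leading integer of a version string such as '2.1.2'."""
    return int(version_string.split(".")[0])


def numpy_needs_downgrade() -> bool:
    """True if the installed NumPy is 2.x or newer; False if 1.x or not installed."""
    try:
        installed = version("numpy")
    except PackageNotFoundError:
        return False
    return major_version(installed) >= 2


if __name__ == "__main__":
    if numpy_needs_downgrade():
        # Suggested by the NumPy warning itself.
        print("NumPy 2.x detected; try: pip install 'numpy<2'")
    else:
        print("NumPy is 1.x or not installed")
```

Note that the final FileNotFoundError is a separate problem: the script expects weights/icon_detect/model.safetensors to exist relative to the current directory.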

How to make this beautiful thing work?

n-sviridenko commented 3 weeks ago

May be relevant https://github.com/microsoft/OmniParser/issues/33

n-sviridenko commented 3 weeks ago

Yes, #33 didn't solve the NumPy issue. It runs on Python 3.11, but at runtime it still throws RuntimeError: Numpy is not available.

So #25 is still unsolved.

abrichr commented 3 weeks ago

See https://github.com/microsoft/OmniParser/pull/52 for a working Dockerfile:

sudo nvidia-docker build -t omniparser .
sudo docker run -d -p 7861:7861 --gpus all --name omniparser-container omniparser

n-sviridenko commented 3 weeks ago

@abrichr can I do this w/o CUDA?

abrichr commented 3 weeks ago

I believe in the current state, CUDA is a requirement. (@yadong-lu can you please clarify?)

However https://github.com/microsoft/OmniParser/pull/52 includes a deploy.py file that will automate deployment to EC2 on AWS for you. All you need is:

AWS_ACCESS_KEY_ID=<your aws access key id>
AWS_SECRET_ACCESS_KEY=<your aws secret access key (required)>
AWS_REGION=<your aws region (required)>
GITHUB_OWNER=<your github owner (required)>  # e.g. n-sviridenko
GITHUB_REPO=<your github repo (required)>    # e.g. OmniParser
GITHUB_TOKEN=<your github token (required)>
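
A minimal sketch of checking that these values are set before starting a deployment. The variable names are copied from the list above; the helper `missing_vars` is hypothetical, and the sketch assumes the values are read from the process environment, which this thread does not confirm for deploy.py:

```python
import os

# Names copied from the list above; the helper itself is hypothetical.
REQUIRED_VARS = [
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_REGION",
    "GITHUB_OWNER",
    "GITHUB_REPO",
    "GITHUB_TOKEN",
]


def missing_vars(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_vars()
    if missing:
        print("Missing settings:", ", ".join(missing))
    else:
        print("All deployment settings are present")
```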

It costs about $10/day on g4dn.xlarge with 100GB (the default).

Then all you need is:

# on your local machine, no CUDA required:        
python3 -m venv venv && source venv/bin/activate && pip install -r deploy_requirements.txt
python deploy.py start
python client.py "http://<server_ip>:7861"
python deploy.py [pause | stop]

yadong-lu commented 3 weeks ago

I tried running on CPU before and it works (just slowly). Also @n-sviridenko feel free to try out our demo: https://huggingface.co/spaces/microsoft/OmniParser
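
A minimal sketch of choosing between GPU and CPU at runtime, using PyTorch's `torch.cuda.is_available()`. The helper `pick_device` is hypothetical; how OmniParser itself selects a device is not shown in this thread:

```python
def pick_device() -> str:
    """Return 'cuda' when PyTorch can see a GPU, otherwise 'cpu'."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed, so fall back to the CPU label.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print("Selected device:", pick_device())
```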

nmstoker commented 3 weeks ago

@n-sviridenko My reading of this is that it's less about OmniParser and more that you haven't been able to install a working version of numpy under Python 3.12.

Given your error messages, I suspect this is because distutils was deprecated in Python and finally removed in 3.12, which may explain why those using 3.11 don't experience this.

It's worth googling numpy / distutils and considering options for your particular setup. I suggest going with more recent advice rather than the first thing you find, since older distutils issues may have causes unrelated to the 3.12 removal.

There's a little background here: https://numpy.org/devdocs/reference/distutils_status_migration.html
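
A minimal sketch of checking whether an interpreter is affected. distutils was deprecated in Python 3.10 and removed from the standard library in 3.12; newer setuptools releases bundle their own copy, so the module may still be importable there. Both helper names are hypothetical:

```python
import importlib.util
import sys


def distutils_removed_from_stdlib(version_info=sys.version_info) -> bool:
    """True for Python 3.12 and later, where the stdlib no longer ships distutils."""
    return tuple(version_info[:2]) >= (3, 12)


def distutils_importable() -> bool:
    """True if some distutils module can be found (stdlib or a setuptools copy)."""
    return importlib.util.find_spec("distutils") is not None


if __name__ == "__main__":
    print("Removed from stdlib:", distutils_removed_from_stdlib())
    print("Importable right now:", distutils_importable())
```

If the first check is true and the second is false, upgrading setuptools in the environment that runs the build is one thing to try.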