scribe-org / Scribe-Data

Wikidata, Wiktionary and Wikipedia language data extraction
GNU General Public License v3.0

Make PyICU dependency installation optional #196

Open andrewtavis opened 2 weeks ago

andrewtavis commented 2 weeks ago

Description

PyICU has proven to be very problematic for the community, and it doesn't really have a good installation path on Windows. Because of this, we'd like to make it an optional dependency. Without it installed, the CLI's emoji keyword functionality wouldn't work.

We'll need to:

Contribution

Happy to support with this or get to it myself at some point! Let's first discuss what would be an effective way to have the install be optional :)

axif0 commented 2 weeks ago

I'm using Windows, and when I run the codebase I get this error:

image

In requirements.txt, I changed the PyICU line to PyICU>=2.10.2; extra == 'emoji'. After that, I don't get any error.

image

To check the emoji functionality in get.py, could you please provide the CLI command so that I can verify it?

Or, in the installation process, should we add a message like "Additional PyICU for 'emoji' will be installed. Do you wish to install it? (y/n)"?

axif0 commented 2 weeks ago

If I run pytest, I get an error in test_check_query.py:

image

andrewtavis commented 2 weeks ago

Thanks for checking this, @axif0! I think that the test_check_query.py errors are a separate issue. Let me make it, and maybe you'd want to work on that? I'll ping you in there :)

As for this issue, can we figure out a way to disable the emoji_keywords option? Basically, with PyICU>=2.10.2; extra == 'emoji' you're saying that pip install -e . --emoji or something similar will then also install PyICU? From there we'd need to edit the package to disable the option to use emoji-keywords as a --data-type option 🤔 Maybe that could be the trick? Maybe we can figure out a boolean option that requires PyICU, and from there we can put an assertion at the top of commands such that when the condition to use emoji-keywords isn't met, there's an error and the user is prompted to install PyICU?
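To make the "boolean plus assertion" idea concrete, here is a minimal sketch, not the project's actual code. It assumes PyICU is detected through its import name `icu`; the helper names `pyicu_available` and `check_data_type` and the error wording are hypothetical.

```python
import importlib.util
from typing import Optional

# Assumed spelling of the --data-type value discussed in this thread.
EMOJI_DATA_TYPE = "emoji-keywords"


def pyicu_available() -> bool:
    """Return True if the `icu` module (the import name of PyICU) can be found."""
    return importlib.util.find_spec("icu") is not None


def check_data_type(data_type: str, icu_installed: Optional[bool] = None) -> None:
    """Raise an error if emoji keywords are requested but PyICU is missing.

    `icu_installed` can be passed explicitly (useful in tests); when left as
    None the real environment is inspected.
    """
    if icu_installed is None:
        icu_installed = pyicu_available()
    if data_type == EMOJI_DATA_TYPE and not icu_installed:
        raise RuntimeError(
            "The emoji-keywords data type requires PyICU. "
            "Please install PyICU and try again."
        )
```

A command could call `check_data_type(data_type)` as its first step, so every other data type keeps working on machines without PyICU.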

CC @wkyoshida and @mhmohona :)

andrewtavis commented 2 weeks ago

Assigning you here for now, @axif0 :)

mhmohona commented 2 weeks ago

@axif0 could you install Scribe-Data successfully? I couldn't even run the pip install -r requirements.txt command. :( Here is the error I am facing:

PS C:\Users\mhmoh\OneDrive\Desktop\Scribe-Data> pip install --upgrade pip
Requirement already satisfied: pip in c:\users\mhmoh\appdata\local\packages\pythonsoftwarefoundation.python.3.11_qbz5n2kfra8p0\localcache\local-packages\python311\site-packages (24.0)
Collecting pip
  Downloading pip-24.2-py3-none-any.whl.metadata (3.6 kB)
Downloading pip-24.2-py3-none-any.whl (1.8 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.8/1.8 MB 165.0 kB/s eta 0:00:00
Installing collected packages: pip
  Attempting uninstall: pip
    Found existing installation: pip 24.0
    Uninstalling pip-24.0:
      Successfully uninstalled pip-24.0
  WARNING: The scripts pip.exe, pip3.11.exe and pip3.exe are installed in 'C:\Users\mhmoh\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\Scripts' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
Successfully installed pip-24.2
PS C:\Users\mhmoh\OneDrive\Desktop\Scribe-Data> pip install -r requirements.txt                        
Requirement already satisfied: beautifulsoup4==4.9.3 in c:\users\mhmoh\appdata\local\packages\pythonsoftwarefoundation.python.3.11_qbz5n2kfra8p0\localcache\local-packages\python311\site-packages (from -r requirements.txt (line 1)) (4.9.3)
Requirement already satisfied: certifi>=2020.12.5 in c:\users\mhmoh\appdata\local\packages\pythonsoftwarefoundation.python.3.11_qbz5n2kfra8p0\localcache\local-packages\python311\site-packages (from -r requirements.txt (line 2)) (2024.2.2)
Requirement already satisfied: defusedxml==0.7.1 in c:\users\mhmoh\appdata\local\packages\pythonsoftwarefoundation.python.3.11_qbz5n2kfra8p0\localcache\local-packages\python311\site-packages (from -r requirements.txt (line 3)) (0.7.1)
Requirement already satisfied: emoji>=2.2.0 in c:\users\mhmoh\appdata\local\packages\pythonsoftwarefoundation.python.3.11_qbz5n2kfra8p0\localcache\local-packages\python311\site-packages (from -r requirements.txt (line 4)) (2.11.0)
Collecting flax>=0.8.2 (from -r requirements.txt (line 5))
  Using cached flax-0.9.0-py3-none-any.whl.metadata (11 kB)
Collecting iso639-lang>=2.2.3 (from -r requirements.txt (line 6))
  Using cached iso639_lang-2.3.0-py3-none-any.whl.metadata (7.3 kB)
Requirement already satisfied: m2r2>=0.3.3 in c:\users\mhmoh\appdata\local\packages\pythonsoftwarefoundation.python.3.11_qbz5n2kfra8p0\localcache\local-packages\python311\site-packages (from -r requirements.txt (line 7)) (0.3.3.post2)
Requirement already satisfied: mwparserfromhell>=0.6 in c:\users\mhmoh\appdata\local\packages\pythonsoftwarefoundation.python.3.11_qbz5n2kfra8p0\localcache\local-packages\python311\site-packages (from -r requirements.txt (line 8)) (0.6.6)
Requirement already satisfied: numpydoc>=1.6.0 in c:\users\mhmoh\appdata\local\packages\pythonsoftwarefoundation.python.3.11_qbz5n2kfra8p0\localcache\local-packages\python311\site-packages (from -r requirements.txt (line 9)) (1.6.0)
Requirement already satisfied: packaging>=20.9 in c:\users\mhmoh\appdata\local\packages\pythonsoftwarefoundation.python.3.11_qbz5n2kfra8p0\localcache\local-packages\python311\site-packages (from -r requirements.txt (line 10)) (24.0)
Requirement already satisfied: pandas>=1.5.3 in c:\users\mhmoh\appdata\local\packages\pythonsoftwarefoundation.python.3.11_qbz5n2kfra8p0\localcache\local-packages\python311\site-packages (from -r requirements.txt (line 11)) (2.2.1)
Collecting pre-commit>=3.7.1 (from -r requirements.txt (line 12))
  Using cached pre_commit-3.8.0-py2.py3-none-any.whl.metadata (1.3 kB)
Requirement already satisfied: pyarrow>=15.0.0 in c:\users\mhmoh\appdata\local\packages\pythonsoftwarefoundation.python.3.11_qbz5n2kfra8p0\localcache\local-packages\python311\site-packages (from -r requirements.txt (line 13)) (15.0.2)
Collecting PyICU>=2.10.2 (from -r requirements.txt (line 15))
  Using cached PyICU-2.13.1.tar.gz (262 kB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... error
  error: subprocess-exited-with-error

  × Getting requirements to build wheel did not run successfully.
  │ exit code: 1
  ╰─> [66 lines of output]
      (running 'icu-config --version')
      (running 'pkg-config --modversion icu-i18n')
      Traceback (most recent call last):
        File "<string>", line 89, in <module>
        File "<frozen os>", line 679, in __getitem__
      KeyError: 'ICU_VERSION'

      During handling of the above exception, another exception occurred:

      Traceback (most recent call last):
        File "<string>", line 92, in <module>
        File "<string>", line 19, in check_output
        File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.11_3.11.2544.0_x64__qbz5n2kfra8p0\Lib\subprocess.py", line 466, in check_output
          return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.11_3.11.2544.0_x64__qbz5n2kfra8p0\Lib\subprocess.py", line 548, in run
          with Popen(*popenargs, **kwargs) as process:
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.11_3.11.2544.0_x64__qbz5n2kfra8p0\Lib\subprocess.py", line 1026, in __init__
          self._execute_child(args, executable, preexec_fn, close_fds,
        File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.11_3.11.2544.0_x64__qbz5n2kfra8p0\Lib\subprocess.py", line 1538, in _execute_child
          hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      FileNotFoundError: [WinError 2] The system cannot find the file specified

      During handling of the above exception, another exception occurred:

      Traceback (most recent call last):
        File "<string>", line 96, in <module>
        File "<string>", line 19, in check_output
        File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.11_3.11.2544.0_x64__qbz5n2kfra8p0\Lib\subprocess.py", line 466, in check_output
          return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.11_3.11.2544.0_x64__qbz5n2kfra8p0\Lib\subprocess.py", line 548, in run
          with Popen(*popenargs, **kwargs) as process:
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.11_3.11.2544.0_x64__qbz5n2kfra8p0\Lib\subprocess.py", line 1026, in __init__
          self._execute_child(args, executable, preexec_fn, close_fds,
        File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.11_3.11.2544.0_x64__qbz5n2kfra8p0\Lib\subprocess.py", line 1538, in _execute_child
          hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      FileNotFoundError: [WinError 2] The system cannot find the file specified

      During handling of the above exception, another exception occurred:

      Traceback (most recent call last):
        File "C:\Users\mhmoh\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\pip\_vendor\pyproject_hooks\_in_process\_in_process.py", line 353, in <module>
es\Python311\site-packages\pip\_vendor\pyproject_hooks\_in_process\_in_process.py", line 335, in main
          json_out['return_val'] = hook(**hook_input['kwargs'])
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "C:\Users\mhmoh\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\pip\_vendor\pyproject_hooks\_in_process\_in_process.py", line 118, in get_requires_for_build_wheel
          return hook(config_settings)
                 ^^^^^^^^^^^^^^^^^^^^^
        File "C:\Users\mhmoh\AppData\Local\Temp\pip-build-env-xsbx7b35\overlay\Lib\site-packages\setuptools\build_meta.py", line 332, in get_requires_for_build_wheel
          return self._get_build_requires(config_settings, requirements=[])
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "C:\Users\mhmoh\AppData\Local\Temp\pip-build-env-xsbx7b35\overlay\Lib\site-packages\setuptools\build_meta.py", line 302, in _get_build_requires        
          self.run_setup()
        File "C:\Users\mhmoh\AppData\Local\Temp\pip-build-env-xsbx7b35\overlay\Lib\site-packages\setuptools\build_meta.py", line 318, in run_setup
          exec(code, locals())
        File "<string>", line 99, in <module>
      RuntimeError:
      Please install pkg-config on your system or set the ICU_VERSION environment
      variable to the version of ICU you have installed.

      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.

Pasted full output so @andrewtavis can have a look at it.
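For anyone debugging the same failure: the traceback above shows the PyICU build looking up the `ICU_VERSION` environment variable, then running `icu-config`, then `pkg-config`. The sketch below mirrors those three lookups as a quick pre-check. The function name `icu_lookup_possible` is hypothetical, and finding `pkg-config` on PATH is only a precondition, not a guarantee that ICU itself is installed.

```python
import os
import shutil
from typing import Callable, Mapping, Optional


def icu_lookup_possible(
    env: Optional[Mapping[str, str]] = None,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> bool:
    """Return True if at least one of the lookups seen in the log could work.

    The order follows the traceback: ICU_VERSION first, then the
    icu-config and pkg-config executables.
    """
    env = os.environ if env is None else env
    if env.get("ICU_VERSION"):
        return True
    return which("icu-config") is not None or which("pkg-config") is not None


if __name__ == "__main__":
    print("ICU lookup possible:", icu_lookup_possible())
```

On the Windows setup in the log this would presumably report `False`, which matches the final RuntimeError.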

andrewtavis commented 2 weeks ago

Was that with PyICU>=2.10.2; extra == 'emoji' in the requirements file, @mhmohona, or just as is?

Let's definitely look into how to disable something in the package based on emoji not being passed to the installation :)
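One possible way to disable the option, sketched under assumptions: build the `--data-type` choices from whether the `icu` module (PyICU's import name) is importable, so `emoji-keywords` is simply not offered without it. The parser below is a toy, not Scribe-Data's real CLI; `build_parser` and the base data types listed are placeholders for illustration.

```python
import argparse
import importlib.util
from typing import Optional

# Placeholder data types for illustration only.
BASE_DATA_TYPES = ["nouns", "verbs"]


def build_parser(icu_installed: Optional[bool] = None) -> argparse.ArgumentParser:
    """Build a toy parser whose --data-type choices depend on PyICU."""
    if icu_installed is None:
        icu_installed = importlib.util.find_spec("icu") is not None
    choices = list(BASE_DATA_TYPES)
    if icu_installed:
        # Only offer emoji keywords when PyICU can actually be imported.
        choices.append("emoji-keywords")
    parser = argparse.ArgumentParser(prog="example-cli")
    parser.add_argument("--data-type", choices=choices)
    return parser
```

With this shape, argparse itself rejects `--data-type emoji-keywords` when PyICU is absent, though its default message would not mention PyICU, so a custom error may still be worth adding.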

mhmohona commented 1 week ago

It was before making the edit, @andrewtavis. After making the edit, things work without any error.

andrewtavis commented 1 week ago

Super great :) We can maybe discuss in the sync how we can have a boolean value based on the install flag 😊