cisco / mindmeld

An Open Source Conversational AI Platform for Deep-Domain Voice Interfaces and Chatbots.
http://mindmeld.com
Apache License 2.0

Cannot Install MindMeld On `arm64` Platforms Using `pip` #422

Open Zozman opened 2 years ago

Zozman commented 2 years ago

Issue

I wanted to try running MindMeld on a Raspberry Pi and quickly found that `pip install mindmeld` does not work in an arm64 environment. The cause appears to be that not all of the dependencies in setup.py have prebuilt arm64 wheels, and when pip falls back to building them from source, the builds fail (scikit-learn in particular).

As a result, MindMeld cannot be installed on arm64 devices such as the M1 Mac and Raspberry Pi.

Reproduction

To reproduce this on any system, regardless of host architecture, perform a linux/arm64 build of a Docker container that attempts to install MindMeld. To do this:

1) Create a Dockerfile with the following:

FROM python:3.7.13

RUN pip install mindmeld

2) Run `docker buildx build --progress=plain --platform linux/arm64 .` to perform an arm64 build.

The following occurs when pip attempts to install scikit-learn as a dependency:

#5 4637.0   Building wheel for scikit-learn (setup.py): started
#5 4639.0   Building wheel for scikit-learn (setup.py): finished with status 'error'
#5 4639.1   error: subprocess-exited-with-error
#5 4639.1
#5 4639.1   × python setup.py bdist_wheel did not run successfully.
#5 4639.1   │ exit code: 1
#5 4639.1   ╰─> [20 lines of output]
#5 4639.1       Partial import of sklearn during the build process.
#5 4639.1       Traceback (most recent call last):
#5 4639.1         File "/tmp/pip-install-e25elt4h/scikit-learn_72ba595d87a843478f5f5741a7dab1ca/setup.py", line 168, in get_numpy_status
#5 4639.1           import numpy
#5 4639.1       ModuleNotFoundError: No module named 'numpy'
#5 4639.1       Traceback (most recent call last):
#5 4639.1         File "/tmp/pip-install-e25elt4h/scikit-learn_72ba595d87a843478f5f5741a7dab1ca/setup.py", line 148, in get_scipy_status
#5 4639.1           import scipy
#5 4639.1       ModuleNotFoundError: No module named 'scipy'
#5 4639.1       Traceback (most recent call last):
#5 4639.1         File "<string>", line 36, in <module>
#5 4639.1         File "<pip-setuptools-caller>", line 34, in <module>
#5 4639.1         File "/tmp/pip-install-e25elt4h/scikit-learn_72ba595d87a843478f5f5741a7dab1ca/setup.py", line 269, in <module>
#5 4639.1           setup_package()
#5 4639.1         File "/tmp/pip-install-e25elt4h/scikit-learn_72ba595d87a843478f5f5741a7dab1ca/setup.py", line 249, in setup_package
#5 4639.1           .format(numpy_req_str, instructions))
#5 4639.1       ImportError: Numerical Python (NumPy) is not installed.
#5 4639.1       scikit-learn requires NumPy >= 1.8.2.
#5 4639.1       Installation instructions are available on the scikit-learn website: http://scikit-learn.org/stable/install.html
#5 4639.1
#5 4639.1       [end of output]
#5 4639.1
#5 4639.1   note: This error originates from a subprocess, and is likely not a problem with pip.
#5 4639.1   ERROR: Failed building wheel for scikit-learn
#5 4639.1   Running setup.py clean for scikit-learn
#5 4641.8   Building wheel for spacy (pyproject.toml): started
#5 4702.0   Building wheel for spacy (pyproject.toml): still running...
#5 4789.3   Building wheel for spacy (pyproject.toml): still running...
#5 4900.7   Building wheel for spacy (pyproject.toml): still running...
#5 5005.4   Building wheel for spacy (pyproject.toml): still running...
#5 5078.8   Building wheel for spacy (pyproject.toml): still running...
#5 5452.7   Building wheel for spacy (pyproject.toml): still running...
#5 5536.8   Building wheel for spacy (pyproject.toml): still running...
#5 5602.9   Building wheel for spacy (pyproject.toml): still running...
#5 5715.9   Building wheel for spacy (pyproject.toml): still running...
#5 5882.9   Building wheel for spacy (pyproject.toml): still running...
#5 5981.8   Building wheel for spacy (pyproject.toml): still running...
#5 6084.2   Building wheel for spacy (pyproject.toml): still running...
#5 6169.2   Building wheel for spacy (pyproject.toml): still running...
#5 6242.9   Building wheel for spacy (pyproject.toml): still running...
#5 6351.0   Building wheel for spacy (pyproject.toml): still running...
#5 6540.8   Building wheel for spacy (pyproject.toml): still running...
#5 6706.1   Building wheel for spacy (pyproject.toml): still running...
#5 6804.6   Building wheel for spacy (pyproject.toml): still running...
#5 6892.5   Building wheel for spacy (pyproject.toml): still running...
#5 7017.4   Building wheel for spacy (pyproject.toml): still running...
#5 7130.1   Building wheel for spacy (pyproject.toml): still running...
#5 7221.7   Building wheel for spacy (pyproject.toml): still running...
#5 7309.7   Building wheel for spacy (pyproject.toml): still running...
#5 7454.2   Building wheel for spacy (pyproject.toml): still running...
#5 7525.7   Building wheel for spacy (pyproject.toml): still running...
#5 7553.1   Building wheel for spacy (pyproject.toml): finished with status 'done'
#5 7553.6   Created wheel for spacy: filename=spacy-2.3.7-cp37-cp37m-linux_aarch64.whl size=24359514 sha256=7f250c5ef6f98c4aaddfbd7c93e3e67e572e1fb23a0ade80e0b9ad7e977ecb33
#5 7553.6   Stored in directory: /root/.cache/pip/wheels/aa/99/63/f57e42849e2e628229458201f2d3e61896ed3cfe2fe0c339e3
#5 7553.7   Building wheel for pycountry (pyproject.toml): started
#5 7602.5   Building wheel for pycountry (pyproject.toml): finished with status 'done'
#5 7602.7   Created wheel for pycountry: filename=pycountry-22.3.5-py2.py3-none-any.whl size=10681832 sha256=c5bf60553a346b91e24894377a920933ec4ed39ef0b05cd5938dc3a3bfc206cf
#5 7602.7   Stored in directory: /root/.cache/pip/wheels/0e/06/e8/7ee176e95ea9a8a8c3b3afcb1869f20adbd42413d4611c6eb4
#5 7602.9 Successfully built future python-crfsuite spacy pycountry
#5 7602.9 Failed to build scikit-learn
#5 7610.0 Installing collected packages: wasabi, tabulate, srsly, scikit-learn, pytz, python-crfsuite, plac, mypy-extensions, murmurhash, cymem, zipp, Werkzeug, urllib3, typing-extensions, typed-ast, tqdm, tomli, Six, regex, pyyaml, pycountry, py, preshed, numpy, multidict, marshmallow, markupsafe, joblib, itsdangerous, idna, future, frozenlist, distro, Click, charset-normalizer, certifi, attrs, asynctest, yarl, sklearn-crfsuite, scipy, requests, python-dateutil, nltk, mypy, Jinja2, importlib-metadata, immutables, elasticsearch, click-log, blis, async-timeout, aiosignal, Flask, catalogue, aiohttp, thinc, Flask-Cors, spacy, mindmeld
#5 7611.0   Running setup.py install for scikit-learn: started
#5 7613.1   Running setup.py install for scikit-learn: finished with status 'error'
#5 7613.1   error: subprocess-exited-with-error
#5 7613.1
#5 7613.1   × Running setup.py install for scikit-learn did not run successfully.
#5 7613.1   │ exit code: 1
#5 7613.1   ╰─> [20 lines of output]
#5 7613.1       Partial import of sklearn during the build process.
#5 7613.1       Traceback (most recent call last):
#5 7613.1         File "/tmp/pip-install-e25elt4h/scikit-learn_72ba595d87a843478f5f5741a7dab1ca/setup.py", line 168, in get_numpy_status
#5 7613.1           import numpy
#5 7613.1       ModuleNotFoundError: No module named 'numpy'
#5 7613.1       Traceback (most recent call last):
#5 7613.1         File "/tmp/pip-install-e25elt4h/scikit-learn_72ba595d87a843478f5f5741a7dab1ca/setup.py", line 148, in get_scipy_status
#5 7613.1           import scipy
#5 7613.1       ModuleNotFoundError: No module named 'scipy'
#5 7613.1       Traceback (most recent call last):
#5 7613.1         File "<string>", line 36, in <module>
#5 7613.1         File "<pip-setuptools-caller>", line 34, in <module>
#5 7613.1         File "/tmp/pip-install-e25elt4h/scikit-learn_72ba595d87a843478f5f5741a7dab1ca/setup.py", line 269, in <module>
#5 7613.1           setup_package()
#5 7613.1         File "/tmp/pip-install-e25elt4h/scikit-learn_72ba595d87a843478f5f5741a7dab1ca/setup.py", line 249, in setup_package
#5 7613.1           .format(numpy_req_str, instructions))
#5 7613.1       ImportError: Numerical Python (NumPy) is not installed.
#5 7613.1       scikit-learn requires NumPy >= 1.8.2.
#5 7613.1       Installation instructions are available on the scikit-learn website: http://scikit-learn.org/stable/install.html
#5 7613.1
#5 7613.1       [end of output]
#5 7613.1
#5 7613.1   note: This error originates from a subprocess, and is likely not a problem with pip.
#5 7613.1 error: legacy-install-failure
#5 7613.1
#5 7613.1 × Encountered error while trying to install package.
#5 7613.1 ╰─> scikit-learn
#5 7613.1
#5 7613.1 note: This is an issue with the package mentioned above, not pip.
#5 7613.1 hint: See above for output from the failure.
#5 7613.2 WARNING: You are using pip version 22.0.4; however, version 22.1.2 is available.
#5 7613.2 You should consider upgrading via the '/usr/local/bin/python -m pip install --upgrade pip' command.
#5 ERROR: process "/bin/sh -c pip install mindmeld" did not complete successfully: exit code: 1
------
 > [2/2] RUN pip install mindmeld:
#5 7613.1   note: This error originates from a subprocess, and is likely not a problem with pip.
#5 7613.1 error: legacy-install-failure
#5 7613.1
#5 7613.1 × Encountered error while trying to install package.
#5 7613.1 ╰─> scikit-learn
#5 7613.1
#5 7613.1 note: This is an issue with the package mentioned above, not pip.
#5 7613.1 hint: See above for output from the failure.
#5 7613.2 WARNING: You are using pip version 22.0.4; however, version 22.1.2 is available.
#5 7613.2 You should consider upgrading via the '/usr/local/bin/python -m pip install --upgrade pip' command.
------
Dockerfile:3
--------------------
   1 |     FROM python:3.7.13
   2 |
   3 | >>> RUN pip install mindmeld
--------------------
error: failed to solve: process "/bin/sh -c pip install mindmeld" did not complete successfully: exit code: 1
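The traceback shows scikit-learn's setup.py trying to import numpy and scipy while building from source and stopping because neither is installed yet. As a minimal sketch (the helper name `missing_modules` is made up for illustration and is not part of pip, scikit-learn, or MindMeld), a pre-flight check for those two imports could look like this:

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # numpy and scipy are the two imports that fail in the traceback above.
    missing = missing_modules(["numpy", "scipy"])
    if missing:
        print("Not importable before the scikit-learn build:", ", ".join(missing))
    else:
        print("numpy and scipy are importable")
```

Running this in the same image before `pip install mindmeld` would only confirm whether the modules are present; it does not guarantee that the source build itself succeeds.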

Possible Solutions

I'm going to experiment and see if I can put together a configuration that works, but I'm not sure yet what the best option is.

Zozman commented 2 years ago

Got the following to build for linux/arm64. It took 9000 seconds to install MindMeld, but it worked:

FROM python:3.7.13

RUN pip install numpy~=1.15
RUN pip install mindmeld spacy

# Add English spacy model or else mindmeld will try to download it itself and fail
RUN python -m spacy download en_core_web_sm --default-timeout=1000

Going to see if I can put together a full example that builds.

Zozman commented 2 years ago

Building a modified home_assistant container directly on an ARM system (a Raspberry Pi 3) fails with this:

 > [21/21] RUN export LC_ALL=C.UTF-8 &&     export LANG=C.UTF-8 &&     su mindmeld -c "ES_JAVA_OPTS='-Xms1g -Xmx1g' /usr/share/elasticsearch/bin/elasticsearch -d" &&     mindmeld num-parse --start &&     python3 -m home_assistant build:
#25 23.26     return callback(*args, **kwargs)
#25 23.26   File "/usr/local/lib/python3.6/dist-packages/click/decorators.py", line 21, in new_func
#25 23.26     return f(get_current_context(), *args, **kwargs)
#25 23.26   File "/usr/local/lib/python3.6/dist-packages/mindmeld/cli.py", line 816, in num_parser
#25 23.26     [exec_path, "--port", port], stderr=subprocess.STDOUT
#25 23.26   File "/usr/lib/python3.6/subprocess.py", line 729, in __init__
#25 23.26     restore_signals, start_new_session)
#25 23.26   File "/usr/lib/python3.6/subprocess.py", line 1364, in _execute_child
#25 23.26     raise child_exception_type(errno_num, err_msg, err_filename)
#25 23.26 OSError: [Errno 8] Exec format error: '/usr/local/lib/python3.6/dist-packages/mindmeld/resources/duckling-x86_64-linux-ubuntu-20'

Looking here, it appears that MindMeld automatically installs a version of duckling from https://binaries.mindmeld.com. Does that host only provide an x86 binary?
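An "Exec format error" like the one above usually means the kernel was handed a binary built for a different CPU. One way to confirm that is to read the `e_machine` field of the file's ELF header. The sketch below is a standalone helper, not something MindMeld ships; the function names are hypothetical, while the machine codes come from the ELF specification.

```python
import struct
import sys

# e_machine values defined by the ELF specification.
ELF_MACHINES = {0x03: "x86", 0x28: "arm", 0x3E: "x86_64", 0xB7: "aarch64"}


def elf_machine(header: bytes) -> str:
    """Decode the target architecture from the first 20 bytes of an ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[5] (EI_DATA) is 1 for little-endian, 2 for big-endian.
    byte_order = "<" if header[5] == 1 else ">"
    # e_machine is a 16-bit field at offset 18, right after e_ident and e_type.
    (machine,) = struct.unpack_from(byte_order + "H", header, 18)
    return ELF_MACHINES.get(machine, f"unknown (0x{machine:02x})")


def binary_arch(path: str) -> str:
    """Read just enough of the file at `path` to identify its architecture."""
    with open(path, "rb") as f:
        return elf_machine(f.read(20))


if __name__ == "__main__":
    # Pass the path of the binary to inspect, e.g. the duckling file from the traceback.
    if len(sys.argv) > 1:
        try:
            print(binary_arch(sys.argv[1]))
        except ValueError as exc:
            print(exc)
```

If the duckling file named in the traceback reports `x86_64` on a machine whose own binaries report `aarch64`, that mismatch would explain the error.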

Zozman commented 2 years ago

Looking at these mappings, it appears that MindMeld only checks for x86 builds of duckling. So even if I installed duckling manually, I think MindMeld would just re-install it, use an x86 build, and then crash.
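To illustrate what an architecture-aware lookup might look like, here is a rough sketch. It is not MindMeld's actual code: the table, the alias map, and both function names are hypothetical, and the only binary name taken from this thread is the x86_64 one in the traceback. The point is that an unsupported platform fails with a clear message instead of an Exec format error.

```python
import platform

# Hypothetical table. Only this x86_64 Linux name appears in the traceback above;
# no arm64 entry is listed because none is known to exist.
KNOWN_BINARIES = {("linux", "x86_64"): "duckling-x86_64-linux-ubuntu-20"}

# platform.machine() spells the same architecture differently across systems.
ARCH_ALIASES = {"amd64": "x86_64", "arm64": "aarch64"}


def normalize_arch(machine: str) -> str:
    """Map platform-specific architecture spellings to one canonical name."""
    machine = machine.lower()
    return ARCH_ALIASES.get(machine, machine)


def pick_binary(system: str, machine: str, table=KNOWN_BINARIES) -> str:
    """Return the binary name for this platform, or raise a readable error."""
    key = (system.lower(), normalize_arch(machine))
    if key not in table:
        raise RuntimeError(f"no duckling binary listed for {key[0]}/{key[1]}")
    return table[key]


if __name__ == "__main__":
    try:
        print(pick_binary(platform.system(), platform.machine()))
    except RuntimeError as exc:
        print(exc)
```

With a check like this, a manually installed arm64 build could be registered by adding an entry to the table, assuming such a build is available.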