Closed fantauzzi closed 1 year ago
@fantauzzi Does /home/ubuntu/.mlflow/envs/mlflow-dc1faba5c996e345dd1cc40210944f3b69389fb6/lib/python3.8/site-packages/pip/_vendor/six.py exist?
https://stackoverflow.com/a/71542588 might be worth a try
https://github.com/pypa/pipenv/issues/4804 reports the same issue, might be worth a look
I had checked that, and it didn't. I no longer have that setup, so I have reproduced the issue in a new one. This is the content of the relevant directory in the new setup after running mlflow run . and getting the error:
$ ll /home/ubuntu/.mlflow/envs/mlflow-dc1faba5c996e345dd1cc40210944f3b69389fb6/lib/python3.8/site-packages/pip/_vendor/
total 20
drwxrwxr-x 3 ubuntu ubuntu 4096 Apr 24 06:00 ./
drwxrwxr-x 5 ubuntu ubuntu 4096 Apr 24 06:00 ../
-rw-rw-r-- 1 ubuntu ubuntu 4975 Apr 24 06:00 __init__.py
drwxrwxr-x 2 ubuntu ubuntu 4096 Apr 24 06:00 __pycache__/
It looks like the creation of the virtualenv got interrupted by the error, and therefore didn't complete. If I try to activate it from shell, I get this:
$ pyenv activate mlflow-dc1faba5c996e345dd1cc40210944f3b69389fb6
pyenv-virtualenv: version `mlflow-dc1faba5c996e345dd1cc40210944f3b69389fb6' is not a virtualenv
Also, it is not listed as a virtualenv:
$ pyenv virtualenvs
$
Why is the vendor directory almost empty?
Can you run:
/home/ubuntu/.mlflow/envs/mlflow-dc1faba5c996e345dd1cc40210944f3b69389fb6/lib/python3.8/site-packages/pip/_vendor/__init__.py
and check the contents of __init__.py?
why is the vendor directory almost empty?
I don't know.
The pip installation I have in ~/.local/lib/python3.8/site-packages/pip contains a number of files in its _vendor, including six.py.
But I noticed the system-wide /usr/lib/python3/dist-packages/pip has an almost empty _vendor, with no six.py. I have therefore reinstalled the system-wide pip, based on the instructions at one of the links you posted, with
curl -sS https://bootstrap.pypa.io/pip/3.6/get-pip.py | sudo pypy3
Now the system-wide pip also has a crowded _vendor directory, containing six.py.
Unfortunately the error with mlflow run . persists.
Where does mlflow run . take its pip from? It looks to me like it uses neither the system-wide one nor the one I have in ~/.local.
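One way to compare the pip installations mentioned above (the one under ~/.local, the system-wide one, and the one inside the environment created by mlflow run .) is a small script along these lines. It is only a sketch: inspect_pip is a hypothetical helper name, and it assumes the pip layout seen in this thread, where a bundled pip has _vendor/six.py and a distro-patched one sets DEBUNDLED = True in _vendor/__init__.py.

```python
from pathlib import Path


def inspect_pip(site_packages):
    """Report whether the pip under site_packages ships its own six.py."""
    vendor = Path(site_packages) / "pip" / "_vendor"
    init_py = vendor / "__init__.py"
    init_text = init_py.read_text() if init_py.is_file() else ""
    return {
        "vendor_dir_exists": vendor.is_dir(),
        "has_six": (vendor / "six.py").is_file(),
        # Distro-patched pips set this flag and rely on external wheels.
        "debundled": "DEBUNDLED = True" in init_text,
    }
```

Calling inspect_pip on each site-packages (or dist-packages) directory and comparing the three results shows at a glance which pip is bundled and which is debundled.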
I have run __init__.py in _vendor and got no output:
$ python __init__.py
$
This is the content of the file now:
"""
pip._vendor is for vendoring dependencies of pip to prevent needing pip to
depend on something external.
Files inside of pip._vendor should be considered immutable and should only be
updated to versions from upstream.
"""
from __future__ import absolute_import
import glob
import os.path
import sys
# Downstream redistributors which have debundled our dependencies should also
# patch this value to be true. This will trigger the additional patching
# to cause things like "six" to be available as pip.
DEBUNDLED = True
# By default, look in this directory for a bunch of .whl files which we will
# add to the beginning of sys.path before attempting to import anything. This
# is done to support downstream re-distributors like Debian and Fedora who
# wish to create their own Wheels for our dependencies to aid in debundling.
prefix = getattr(sys, "base_prefix", sys.prefix)
if prefix.startswith('/usr/lib/pypy'):
    prefix = '/usr'
WHEEL_DIR = os.path.abspath(os.path.join(prefix, 'share', 'python-wheels'))
# Define a small helper function to alias our vendored modules to the real ones
# if the vendored ones do not exist. This idea of this was taken from
# https://github.com/kennethreitz/requests/pull/2567.
def vendored(modulename):
    vendored_name = "{0}.{1}".format(__name__, modulename)
    try:
        __import__(modulename, globals(), locals(), level=0)
    except ImportError:
        # We can just silently allow import failures to pass here. If we
        # got to this point it means that ``import pip._vendor.whatever``
        # failed and so did ``import whatever``. Since we're importing this
        # upfront in an attempt to alias imports, not erroring here will
        # just mean we get a regular import error whenever pip *actually*
        # tries to import one of these modules to use it, which actually
        # gives us a better error message than we would have otherwise
        # gotten.
        pass
    else:
        sys.modules[vendored_name] = sys.modules[modulename]
        base, head = vendored_name.rsplit(".", 1)
        setattr(sys.modules[base], head, sys.modules[modulename])
# If we're operating in a debundled setup, then we want to go ahead and trigger
# the aliasing of our vendored libraries as well as looking for wheels to add
# to our sys.path. This will cause all of this code to be a no-op typically
# however downstream redistributors can enable it in a consistent way across
# all platforms.
if DEBUNDLED:
    # Actually look inside of WHEEL_DIR to find .whl files and add them to the
    # front of our sys.path.
    sys.path[:] = glob.glob(os.path.join(WHEEL_DIR, "*.whl")) + sys.path
    # Actually alias all of our vendored dependencies.
    vendored("appdirs")
    vendored("cachecontrol")
    vendored("colorama")
    vendored("contextlib2")
    vendored("distlib")
    vendored("distro")
    vendored("html5lib")
    vendored("six")
    vendored("six.moves")
    vendored("six.moves.urllib")
    vendored("six.moves.urllib.parse")
    vendored("packaging")
    vendored("packaging.version")
    vendored("packaging.specifiers")
    vendored("pep517")
    vendored("pkg_resources")
    vendored("progress")
    vendored("retrying")
    vendored("requests")
    vendored("requests.exceptions")
    vendored("requests.packages")
    vendored("requests.packages.urllib3")
    vendored("requests.packages.urllib3._collections")
    vendored("requests.packages.urllib3.connection")
    vendored("requests.packages.urllib3.connectionpool")
    vendored("requests.packages.urllib3.contrib")
    vendored("requests.packages.urllib3.contrib.ntlmpool")
    vendored("requests.packages.urllib3.contrib.pyopenssl")
    vendored("requests.packages.urllib3.exceptions")
    vendored("requests.packages.urllib3.fields")
    vendored("requests.packages.urllib3.filepost")
    vendored("requests.packages.urllib3.packages")
    try:
        vendored("requests.packages.urllib3.packages.ordered_dict")
        vendored("requests.packages.urllib3.packages.six")
    except ImportError:
        # Debian already unbundles these from requests.
        pass
    vendored("requests.packages.urllib3.packages.ssl_match_hostname")
    vendored("requests.packages.urllib3.packages.ssl_match_hostname."
             "_implementation")
    vendored("requests.packages.urllib3.poolmanager")
    vendored("requests.packages.urllib3.request")
    vendored("requests.packages.urllib3.response")
    vendored("requests.packages.urllib3.util")
    vendored("requests.packages.urllib3.util.connection")
    vendored("requests.packages.urllib3.util.request")
    vendored("requests.packages.urllib3.util.response")
    vendored("requests.packages.urllib3.util.retry")
    vendored("requests.packages.urllib3.util.ssl_")
    vendored("requests.packages.urllib3.util.timeout")
    vendored("requests.packages.urllib3.util.url")
    vendored("toml")
    vendored("toml.encoder")
    vendored("toml.decoder")
    vendored("urllib3")
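The aliasing that vendored() performs in the listing above can be tried in isolation. The following is a simplified sketch, not pip's code: fake_vendor is a made-up empty package standing in for pip._vendor, alias_into is a hypothetical helper that only handles top-level module names, and json stands in for a dependency that happens to be importable.

```python
import sys
import types


def alias_into(package_name, modulename):
    """Expose an importable top-level module as package_name.modulename."""
    try:
        __import__(modulename, globals(), locals(), level=0)
    except ImportError:
        return False  # nothing to alias, like the silent 'pass' in the listing
    alias = "{0}.{1}".format(package_name, modulename)
    sys.modules[alias] = sys.modules[modulename]
    setattr(sys.modules[package_name], modulename, sys.modules[modulename])
    return True


# 'fake_vendor' is a made-up, empty package standing in for pip._vendor.
fake_vendor = types.ModuleType("fake_vendor")
fake_vendor.__path__ = []  # mark it as a package that contains no files
sys.modules["fake_vendor"] = fake_vendor

alias_into("fake_vendor", "json")                     # importable, so aliased
alias_into("fake_vendor", "no_such_module_for_demo")  # not importable, skipped

from fake_vendor.json import dumps  # resolves through the alias

try:
    import fake_vendor.no_such_module_for_demo
except ModuleNotFoundError as exc:
    message = str(exc)  # names the aliased path, not the top-level module
```

The second import fails with an error that names fake_vendor.no_such_module_for_demo, which has the same shape as the No module named 'pip._vendor.six' error in this thread: the alias was never created because the plain import failed first.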
I have been working on the hypothesis that the pip module used by mlflow run . doesn't come bundled with a six module. That is also the root cause of the No module named 'pip._vendor.six' error in the StackOverflow posts you linked.
So far I am not able to tell which pip module mlflow run . uses to make its own virtualenv.
The pip I have under ~/.local seems to be bundled with a six module, as its _vendor contains a six.py.
The system-wide pip in /usr/local/bin/pip didn't seem to be bundled with a six module, as its _vendor was almost empty and had no six.py, so I have re-installed it, and now it does have a populated _vendor, including a six.py.
The error with mlflow run . persists.
@fantauzzi Can you remove /home/ubuntu/.mlflow/envs/mlflow-dc1faba5c996e345dd1cc40210944f3b69389fb6 and run mlflow run . again?
The system-wide pip in /usr/local/bin/pip didn't seem to be bundled with a six module
Maybe virtualenv picked up this unbundled pip and hit the error?
@fantauzzi Feel free to reopen this issue if re-creating the virtual environment with the bundled pip doesn't work.
After removing that and running mlflow run . again, I get the same error. In fact, to reproduce the error, I must remove the half-baked virtualenv from /home/ubuntu/.mlflow/envs before running mlflow run . again.
If instead I run mlflow run . again with the half-baked virtualenv still there, mlflow tries to use it and fails with a different error message (ModuleNotFoundError: No module named 'pmdarima'). Here is an example; scroll down to see the second attempt to run mlflow run .:
(python3.10.7) ubuntu@192-9-242-164:~/mlflow/examples/diviner$ mlflow run .
2023/04/26 05:07:31 INFO mlflow.utils.virtualenv: Installing python 3.8.16 if it does not exist
Downloading Python-3.8.16.tar.xz...
-> https://www.python.org/ftp/python/3.8.16/Python-3.8.16.tar.xz
Installing Python-3.8.16...
Installed Python-3.8.16 to /home/ubuntu/.pyenv/versions/3.8.16
2023/04/26 05:08:12 INFO mlflow.utils.virtualenv: Creating a new environment in /home/ubuntu/.mlflow/envs/mlflow-dc1faba5c996e345dd1cc40210944f3b69389fb6 with /home/ubuntu/.pyenv/versions/3.8.16/bin/python
created virtual environment CPython3.8.16.final.0-64 in 213ms
creator CPython3Posix(dest=/home/ubuntu/.mlflow/envs/mlflow-dc1faba5c996e345dd1cc40210944f3b69389fb6, clear=False, global=False)
seeder FromAppData(download=False, pip=latest, setuptools=latest, wheel=latest, pkg_resources=latest, via=copy, app_data_dir=/home/ubuntu/.local/share/virtualenv/seed-app-data/v1.0.1.debian.1)
activators BashActivator,CShellActivator,FishActivator,PowerShellActivator,PythonActivator,XonshActivator
2023/04/26 05:08:13 INFO mlflow.utils.virtualenv: Installing dependencies
Traceback (most recent call last):
File "/home/ubuntu/.pyenv/versions/3.8.16/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/ubuntu/.pyenv/versions/3.8.16/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/ubuntu/.mlflow/envs/mlflow-dc1faba5c996e345dd1cc40210944f3b69389fb6/lib/python3.8/site-packages/pip/__main__.py", line 16, in <module>
from pip._internal.cli.main import main as _main # isort:skip # noqa
File "/home/ubuntu/.mlflow/envs/mlflow-dc1faba5c996e345dd1cc40210944f3b69389fb6/lib/python3.8/site-packages/pip/_internal/cli/main.py", line 10, in <module>
from pip._internal.cli.autocompletion import autocomplete
File "/home/ubuntu/.mlflow/envs/mlflow-dc1faba5c996e345dd1cc40210944f3b69389fb6/lib/python3.8/site-packages/pip/_internal/cli/autocompletion.py", line 9, in <module>
from pip._internal.cli.main_parser import create_main_parser
File "/home/ubuntu/.mlflow/envs/mlflow-dc1faba5c996e345dd1cc40210944f3b69389fb6/lib/python3.8/site-packages/pip/_internal/cli/main_parser.py", line 7, in <module>
from pip._internal.cli import cmdoptions
File "/home/ubuntu/.mlflow/envs/mlflow-dc1faba5c996e345dd1cc40210944f3b69389fb6/lib/python3.8/site-packages/pip/_internal/cli/cmdoptions.py", line 24, in <module>
from pip._internal.exceptions import CommandError
File "/home/ubuntu/.mlflow/envs/mlflow-dc1faba5c996e345dd1cc40210944f3b69389fb6/lib/python3.8/site-packages/pip/_internal/exceptions.py", line 10, in <module>
from pip._vendor.six import iteritems
ModuleNotFoundError: No module named 'pip._vendor.six'
Traceback (most recent call last):
File "/home/ubuntu/.pyenv/versions/python3.10.7/bin/mlflow", line 8, in <module>
sys.exit(cli())
File "/home/ubuntu/.pyenv/versions/3.10.7/envs/python3.10.7/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
return self.main(*args, **kwargs)
File "/home/ubuntu/.pyenv/versions/3.10.7/envs/python3.10.7/lib/python3.10/site-packages/click/core.py", line 1055, in main
rv = self.invoke(ctx)
File "/home/ubuntu/.pyenv/versions/3.10.7/envs/python3.10.7/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/home/ubuntu/.pyenv/versions/3.10.7/envs/python3.10.7/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/ubuntu/.pyenv/versions/3.10.7/envs/python3.10.7/lib/python3.10/site-packages/click/core.py", line 760, in invoke
return __callback(*args, **kwargs)
File "/home/ubuntu/.pyenv/versions/3.10.7/envs/python3.10.7/lib/python3.10/site-packages/mlflow/cli.py", line 202, in run
projects.run(
File "/home/ubuntu/.pyenv/versions/3.10.7/envs/python3.10.7/lib/python3.10/site-packages/mlflow/projects/__init__.py", line 338, in run
submitted_run_obj = _run(
File "/home/ubuntu/.pyenv/versions/3.10.7/envs/python3.10.7/lib/python3.10/site-packages/mlflow/projects/__init__.py", line 105, in _run
submitted_run = backend.run(
File "/home/ubuntu/.pyenv/versions/3.10.7/envs/python3.10.7/lib/python3.10/site-packages/mlflow/projects/backend/local.py", line 167, in run
activate_cmd = _create_virtualenv(
File "/home/ubuntu/.pyenv/versions/3.10.7/envs/python3.10.7/lib/python3.10/site-packages/mlflow/utils/virtualenv.py", line 269, in _create_virtualenv
_exec_cmd(cmd, capture_output=capture_output, cwd=tmp_model_dir, extra_env=extra_env)
File "/home/ubuntu/.pyenv/versions/3.10.7/envs/python3.10.7/lib/python3.10/site-packages/mlflow/utils/process.py", line 117, in _exec_cmd
raise ShellCommandException.from_completed_process(comp_process)
mlflow.utils.process.ShellCommandException: Non-zero exit code: 1
Command: ['bash', '-c', 'source /home/ubuntu/.mlflow/envs/mlflow-dc1faba5c996e345dd1cc40210944f3b69389fb6/bin/activate && python -m pip install --quiet -r requirements.1ff80735675c49ec986a3849caf7600f.txt']
(python3.10.7) ubuntu@192-9-242-164:~/mlflow/examples/diviner$ mlflow run .
2023/04/26 05:09:52 INFO mlflow.utils.virtualenv: Installing python 3.8.16 if it does not exist
2023/04/26 05:09:52 INFO mlflow.utils.virtualenv: Environment /home/ubuntu/.mlflow/envs/mlflow-dc1faba5c996e345dd1cc40210944f3b69389fb6 already exists
2023/04/26 05:09:52 INFO mlflow.projects.utils: === Created directory /tmp/tmpg3aq2q9y for downloading remote URIs passed to arguments of type 'path' ===
2023/04/26 05:09:52 INFO mlflow.projects.backend.local: === Running command 'source /home/ubuntu/.mlflow/envs/mlflow-dc1faba5c996e345dd1cc40210944f3b69389fb6/bin/activate && python train.py' in run with ID '17ad76248b7b4f699d2b67762e9c1cb1' ===
Traceback (most recent call last):
File "train.py", line 2, in <module>
from pmdarima import datasets
ModuleNotFoundError: No module named 'pmdarima'
2023/04/26 05:09:52 ERROR mlflow.cli: === Run (ID '17ad76248b7b4f699d2b67762e9c1cb1') failed ===
(python3.10.7) ubuntu@192-9-242-164:~/mlflow/examples/diviner$
@fantauzzi Feel free to reopen this issue if re-creating the virtual environment with the bundled pip doesn't work.
I don't think I am allowed, as a collaborator closed it.
@fantauzzi I ran mlflow run . and got the following:
> mlflow run .
2023/04/26 14:39:36 INFO mlflow.utils.virtualenv: Installing python 3.8.16 if it does not exist
2023/04/26 14:39:36 INFO mlflow.utils.virtualenv: Creating a new environment in /home/haru/.mlflow/envs/mlflow-0c30c75c1190ace1070fa325506dbb84f690a784 with /home/haru/.pyenv/versions/3.8.16/bin/python
created virtual environment CPython3.8.16.final.0-64 in 205ms
creator CPython3Posix(dest=/home/haru/.mlflow/envs/mlflow-0c30c75c1190ace1070fa325506dbb84f690a784, clear=False, no_vcs_ignore=False, global=False)
seeder FromAppData(download=False, pip=bundle, setuptools=bundle, wheel=bundle, via=copy, app_data_dir=/home/haru/.local/share/virtualenv)
^ 👆 is different
added seed packages: pip==23.0.1, setuptools==67.6.1, wheel==0.40.0
Can you try removing /home/ubuntu/.local/share/virtualenv?
I can reproduce the error in a cloud setup with Ubuntu 20.04 but not on my PC with Ubuntu 22.04, so I have compared the output of mlflow run . between the two and spotted an interesting difference. I paste the relevant output here and then outline the difference.
Here on the setup that reproduces the error:
$ mlflow run .
2023/04/26 05:07:31 INFO mlflow.utils.virtualenv: Installing python 3.8.16 if it does not exist
Downloading Python-3.8.16.tar.xz...
-> https://www.python.org/ftp/python/3.8.16/Python-3.8.16.tar.xz
Installing Python-3.8.16...
Installed Python-3.8.16 to /home/ubuntu/.pyenv/versions/3.8.16
2023/04/26 05:08:12 INFO mlflow.utils.virtualenv: Creating a new environment in /home/ubuntu/.mlflow/envs/mlflow-dc1faba5c996e345dd1cc40210944f3b69389fb6 with /home/ubuntu/.pyenv/versions/3.8.16/bin/python
created virtual environment CPython3.8.16.final.0-64 in 213ms
creator CPython3Posix(dest=/home/ubuntu/.mlflow/envs/mlflow-dc1faba5c996e345dd1cc40210944f3b69389fb6, clear=False, global=False)
seeder FromAppData(download=False, pip=latest, setuptools=latest, wheel=latest, pkg_resources=latest, via=copy, app_data_dir=/home/ubuntu/.local/share/virtualenv/seed-app-data/v1.0.1.debian.1)
activators BashActivator,CShellActivator,FishActivator,PowerShellActivator,PythonActivator,XonshActivator
2023/04/26 05:08:13 INFO mlflow.utils.virtualenv: Installing dependencies
Traceback (most recent call last):
[...]
Here on the setup that doesn't reproduce the error, where the example code completes correctly:
-> https://www.python.org/ftp/python/3.8.16/Python-3.8.16.tar.xz
Installing Python-3.8.16...
Installed Python-3.8.16 to /home/fanta/.pyenv/versions/3.8.16
2023/04/26 07:29:26 INFO mlflow.utils.virtualenv: Creating a new environment in /home/fanta/.mlflow/envs/mlflow-dc1faba5c996e345dd1cc40210944f3b69389fb6 with /home/fanta/.pyenv/versions/3.8.16/bin/python
created virtual environment CPython3.8.16.final.0-64 in 189ms
creator CPython3Posix(dest=/home/fanta/.mlflow/envs/mlflow-dc1faba5c996e345dd1cc40210944f3b69389fb6, clear=False, no_vcs_ignore=False, global=False)
seeder FromAppData(download=False, pip=bundle, setuptools=bundle, wheel=bundle, via=copy, app_data_dir=/home/fanta/.local/share/virtualenv)
added seed packages: pip==22.3.1, setuptools==65.6.3, wheel==0.38.4
activators BashActivator,CShellActivator,FishActivator,NushellActivator,PowerShellActivator,PythonActivator
2023/04/26 07:29:26 INFO mlflow.utils.virtualenv: Installing dependencies
[...]
I believe the relevant difference is in the seeder FromAppData( part, which in the working setup contains pip=bundle, setuptools=bundle, wheel=bundle while in the faulty setup it contains pip=latest, setuptools=latest, wheel=latest.
MLflow seems to be using a bundled pip on one system, and a different (non-bundled) pip on the other.
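The comparison of the two seeder lines can also be done mechanically. This is a rough sketch: parse_seeder and seeder_diff are hypothetical helper names, and the parsing assumes the key=value, comma-separated shape of the two log lines quoted in this thread, nothing more general.

```python
import re


def parse_seeder(line):
    """Return the key=value settings inside a 'seeder Name(...)' log line."""
    match = re.search(r"seeder\s+\w+\((.*)\)\s*$", line.strip())
    if match is None:
        return {}
    settings = {}
    for item in match.group(1).split(","):
        key, sep, value = item.strip().partition("=")
        if sep:
            settings[key] = value
    return settings


def seeder_diff(line_a, line_b):
    """Map each setting that differs to its (value_a, value_b) pair."""
    a, b = parse_seeder(line_a), parse_seeder(line_b)
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


# The two seeder lines as they appear in the logs above.
faulty = (
    "seeder FromAppData(download=False, pip=latest, setuptools=latest, "
    "wheel=latest, pkg_resources=latest, via=copy, "
    "app_data_dir=/home/ubuntu/.local/share/virtualenv/seed-app-data/v1.0.1.debian.1)"
)
working = (
    "seeder FromAppData(download=False, pip=bundle, setuptools=bundle, "
    "wheel=bundle, via=copy, app_data_dir=/home/fanta/.local/share/virtualenv)"
)

for key, (bad, good) in seeder_diff(faulty, working).items():
    print(key, bad, good)
```

Besides pip, setuptools and wheel, the diff also reports pkg_resources (present only in the faulty line) and app_data_dir, whose faulty value ends in v1.0.1.debian.1.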
Can you try removing /home/ubuntu/.local/share/virtualenv?
Have you tried this?
I was typing in this issue at the same time as you and noticed only later that you had updated it. So now I have removed that directory, removed /home/ubuntu/.mlflow/envs/mlflow*, run mlflow run . again, and got the usual error. Also, mlflow run . has recreated the directory I had just removed, with this content:
$ ll /home/ubuntu/.local/share/virtualenv
total 16
drwxrwxr-x 4 ubuntu ubuntu 4096 Apr 26 06:05 ./
drwxrwxr-x 7 ubuntu ubuntu 4096 Apr 26 06:05 ../
drwxrwxr-x 3 ubuntu ubuntu 4096 Apr 26 06:05 py_info/
drwxrwxr-x 3 ubuntu ubuntu 4096 Apr 26 06:05 seed-app-data/
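The cleanup performed before each retry (deleting the virtualenv app-data directory and the mlflow* environments) could be scripted roughly as follows. This is a sketch under assumptions: reset_mlflow_envs is a hypothetical helper, the two locations are the ones named in this thread relative to a home directory, and the function deletes directories permanently, so it should only be pointed at a home directory where that is acceptable.

```python
import shutil
from pathlib import Path


def reset_mlflow_envs(home):
    """Delete cached virtualenv seed data and MLflow environments under home."""
    home = Path(home)
    envs_dir = home / ".mlflow" / "envs"
    targets = [home / ".local" / "share" / "virtualenv"]
    if envs_dir.is_dir():
        # Same pattern as the manual cleanup: everything named mlflow*.
        targets += sorted(envs_dir.glob("mlflow*"))
    removed = []
    for target in targets:
        if target.is_dir():
            shutil.rmtree(target)
            removed.append(target)
    return removed
```

The function returns the list of directories it removed, which makes it easy to confirm that a retry really starts from a clean state.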
@fantauzzi Can you reproduce the error by directly running virtualenv to create an environment and running pip install in it?
Yes.
$ virtualenv --version
virtualenv 20.0.17 from /usr/lib/python3/dist-packages/virtualenv/__init__.py
$ virtualenv --python /home/ubuntu/.pyenv/versions/3.10.7/bin/python /home/ubuntu/my_virtualenv
created virtual environment CPython3.10.7.final.0-64 in 89ms
creator CPython3Posix(dest=/home/ubuntu/my_virtualenv, clear=False, global=False)
seeder FromAppData(download=False, pip=latest, setuptools=latest, wheel=latest, pkg_resources=latest, via=copy, app_data_dir=/home/ubuntu/.local/share/virtualenv/seed-app-data/v1.0.1.debian.1)
activators BashActivator,CShellActivator,FishActivator,PowerShellActivator,PythonActivator,XonshActivator
$ source /home/ubuntu/my_virtualenv/bin/activate
$ pip --version
Traceback (most recent call last):
File "/home/ubuntu/my_virtualenv/bin/pip", line 5, in <module>
from pip._internal.cli.main import main
File "/home/ubuntu/my_virtualenv/lib/python3.10/site-packages/pip/_internal/cli/main.py", line 10, in <module>
from pip._internal.cli.autocompletion import autocomplete
File "/home/ubuntu/my_virtualenv/lib/python3.10/site-packages/pip/_internal/cli/autocompletion.py", line 9, in <module>
from pip._internal.cli.main_parser import create_main_parser
File "/home/ubuntu/my_virtualenv/lib/python3.10/site-packages/pip/_internal/cli/main_parser.py", line 7, in <module>
from pip._internal.cli import cmdoptions
File "/home/ubuntu/my_virtualenv/lib/python3.10/site-packages/pip/_internal/cli/cmdoptions.py", line 24, in <module>
from pip._internal.exceptions import CommandError
File "/home/ubuntu/my_virtualenv/lib/python3.10/site-packages/pip/_internal/exceptions.py", line 10, in <module>
from pip._vendor.six import iteritems
ModuleNotFoundError: No module named 'pip._vendor.six'
$
Can you run python -c 'import six'?
From within the virtualenv I have just created, I cannot import six:
$ python -c 'import six'
Traceback (most recent call last):
File "<string>", line 1, in <module>
ModuleNotFoundError: No module named 'six'
$
@fantauzzi Does /usr/share/python-wheels exist?
Yes
$ ll /usr/share/python-wheels
total 2260
drwxr-xr-x 2 root root 12288 Apr 26 04:47 ./
drwxr-xr-x 177 root root 4096 Apr 26 04:50 ../
-rw-r--r-- 1 root root 28023 Feb 28 09:41 CacheControl-0.12.6-py2.py3-none-any.whl
-rw-r--r-- 1 root root 18776 Feb 28 09:41 appdirs-1.4.3-py2.py3-none-any.whl
-rw-r--r-- 1 root root 164552 Feb 28 09:41 certifi-2019.11.28-py2.py3-none-any.whl
-rw-r--r-- 1 root root 141487 Feb 28 09:41 chardet-3.0.4-py2.py3-none-any.whl
-rw-r--r-- 1 root root 25094 Feb 28 09:41 colorama-0.4.3-py2.py3-none-any.whl
-rw-r--r-- 1 root root 17188 Feb 28 09:41 contextlib2-0.6.0-py2.py3-none-any.whl
-rw-r--r-- 1 root root 152027 Feb 28 09:41 distlib-0.3.0-py2.py3-none-any.whl
-rw-r--r-- 1 root root 23898 Feb 28 09:41 distro-1.4.0-py2.py3-none-any.whl
-rw-r--r-- 1 root root 120020 Feb 28 09:41 html5lib-1.0.1-py2.py3-none-any.whl
-rw-r--r-- 1 root root 66836 Feb 28 09:41 idna-2.8-py2.py3-none-any.whl
-rw-r--r-- 1 root root 24287 Feb 28 09:41 ipaddr-2.2.0-py2.py3-none-any.whl
-rw-r--r-- 1 root root 21972 Feb 28 09:41 lockfile-0.12.2-py2.py3-none-any.whl
-rw-r--r-- 1 root root 92927 Feb 28 09:41 msgpack-0.6.2-py2.py3-none-any.whl
-rw-r--r-- 1 root root 42242 Feb 28 09:41 packaging-20.3-py2.py3-none-any.whl
-rw-r--r-- 1 root root 26686 Feb 28 09:41 pep517-0.8.2-py2.py3-none-any.whl
-rw-r--r-- 1 root root 262440 Feb 28 09:41 pip-20.0.2-py2.py3-none-any.whl
-rw-r--r-- 1 root root 127312 Feb 28 09:41 pkg_resources-0.0.0-py2.py3-none-any.whl
-rw-r--r-- 1 root root 17547 Feb 28 09:41 progress-1.5-py2.py3-none-any.whl
-rw-r--r-- 1 root root 77093 Feb 28 09:41 pyparsing-2.4.6-py2.py3-none-any.whl
-rw-r--r-- 1 root root 67470 Feb 28 09:41 requests-2.22.0-py2.py3-none-any.whl
-rw-r--r-- 1 root root 16358 Feb 28 09:41 retrying-1.3.3-py2.py3-none-any.whl
-rw-r--r-- 1 root root 477455 Feb 28 09:41 setuptools-44.0.0-py2.py3-none-any.whl
-rw-r--r-- 1 root root 20256 Feb 28 09:41 six-1.14.0-py2.py3-none-any.whl
-rw-r--r-- 1 root root 24106 Feb 28 09:41 toml-0.10.0-py2.py3-none-any.whl
-rw-r--r-- 1 root root 127068 Feb 28 09:41 urllib3-1.25.8-py2.py3-none-any.whl
-rw-r--r-- 1 root root 20484 Feb 28 09:41 webencodings-0.5.1-py2.py3-none-any.whl
-rw-r--r-- 1 root root 35613 Feb 28 09:41 wheel-0.34.2-py2.py3-none-any.whl
Can you modify _vendor/__init__.py to see why it failed to import six? You can insert print in the vendored function.
def vendored(modulename):
    vendored_name = "{0}.{1}".format(__name__, modulename)
    try:
        __import__(modulename, globals(), locals(), level=0)
    except ImportError as e:
        # We can just silently allow import failures to pass here. If we
        # got to this point it means that ``import pip._vendor.whatever``
        # failed and so did ``import whatever``. Since we're importing this
        # upfront in an attempt to alias imports, not erroring here will
        # just mean we get a regular import error whenever pip *actually*
        # tries to import one of these modules to use it, which actually
        # gives us a better error message than we would have otherwise
        # gotten.
        print(e)
    else:
        sys.modules[vendored_name] = sys.modules[modulename]
        base, head = vendored_name.rsplit(".", 1)
        setattr(sys.modules[base], head, sys.modules[modulename])
I have modified the file /home/ubuntu/my_virtualenv/lib/python3.10/site-packages/pip/_vendor/__init__.py; this is the output:
$ pip --version
No module named 'appdirs'
No module named 'cachecontrol'
No module named 'colorama'
No module named 'contextlib2'
No module named 'distlib'
No module named 'distro'
No module named 'html5lib'
No module named 'six'
No module named 'six'
No module named 'six'
No module named 'six'
No module named 'packaging'
No module named 'packaging'
No module named 'packaging'
No module named 'pep517'
No module named 'progress'
No module named 'retrying'
No module named 'requests'
No module named 'requests'
No module named 'requests'
No module named 'requests'
No module named 'requests'
No module named 'requests'
No module named 'requests'
No module named 'requests'
No module named 'requests'
No module named 'requests'
No module named 'requests'
No module named 'requests'
No module named 'requests'
No module named 'requests'
No module named 'requests'
No module named 'requests'
No module named 'requests'
No module named 'requests'
No module named 'requests'
No module named 'requests'
No module named 'requests'
No module named 'requests'
No module named 'requests'
No module named 'requests'
No module named 'requests'
No module named 'requests'
No module named 'requests'
No module named 'requests'
No module named 'requests'
No module named 'toml'
No module named 'toml'
No module named 'toml'
No module named 'urllib3'
Traceback (most recent call last):
File "/home/ubuntu/my_virtualenv/bin/pip", line 5, in <module>
from pip._internal.cli.main import main
File "/home/ubuntu/my_virtualenv/lib/python3.10/site-packages/pip/_internal/cli/main.py", line 10, in <module>
from pip._internal.cli.autocompletion import autocomplete
File "/home/ubuntu/my_virtualenv/lib/python3.10/site-packages/pip/_internal/cli/autocompletion.py", line 9, in <module>
from pip._internal.cli.main_parser import create_main_parser
File "/home/ubuntu/my_virtualenv/lib/python3.10/site-packages/pip/_internal/cli/main_parser.py", line 7, in <module>
from pip._internal.cli import cmdoptions
File "/home/ubuntu/my_virtualenv/lib/python3.10/site-packages/pip/_internal/cli/cmdoptions.py", line 24, in <module>
from pip._internal.exceptions import CommandError
File "/home/ubuntu/my_virtualenv/lib/python3.10/site-packages/pip/_internal/exceptions.py", line 10, in <module>
from pip._vendor.six import iteritems
ModuleNotFoundError: No module named 'pip._vendor.six'
$
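The repeated names in the output above are what one would expect: importing a dotted name such as six.moves or requests.packages.urllib3 fails on its first component, so the message only mentions six or requests. A minimal way to collect the same information without editing pip's files is sketched below; find_missing is a hypothetical helper, not part of pip, and it should be run with the interpreter of the environment under investigation.

```python
import importlib


def find_missing(modulenames):
    """Map each name that cannot be imported to its ImportError message."""
    missing = {}
    for name in modulenames:
        try:
            importlib.import_module(name)
        except ImportError as exc:
            missing[name] = str(exc)
    return missing


if __name__ == "__main__":
    # A few of the names that vendored() tries to alias in the listing.
    for name, error in find_missing(["six", "six.moves", "requests", "toml"]).items():
        print(name, "->", error)
```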
Can you print the value of WHEEL_DIR?
The environment variable WHEEL_DIR is not set, neither in the shell nor in __init__.py (I had it printed right after print(e)).
What does your _vendor/__init__.py look like?
This is pip/_vendor/__init__.py:
"""
pip._vendor is for vendoring dependencies of pip to prevent needing pip to
depend on something external.
Files inside of pip._vendor should be considered immutable and should only be
updated to versions from upstream.
"""
from __future__ import absolute_import
import glob
import os.path
import sys
# Downstream redistributors which have debundled our dependencies should also
# patch this value to be true. This will trigger the additional patching
# to cause things like "six" to be available as pip.
DEBUNDLED = True
# By default, look in this directory for a bunch of .whl files which we will
# add to the beginning of sys.path before attempting to import anything. This
# is done to support downstream re-distributors like Debian and Fedora who
# wish to create their own Wheels for our dependencies to aid in debundling.
prefix = getattr(sys, "base_prefix", sys.prefix)
if prefix.startswith('/usr/lib/pypy'):
    prefix = '/usr'
WHEEL_DIR = os.path.abspath(os.path.join(prefix, 'share', 'python-wheels'))
# Define a small helper function to alias our vendored modules to the real ones
# if the vendored ones do not exist. This idea of this was taken from
# https://github.com/kennethreitz/requests/pull/2567.
def vendored(modulename):
    vendored_name = "{0}.{1}".format(__name__, modulename)
    try:
        __import__(modulename, globals(), locals(), level=0)
    except ImportError as e:
        # We can just silently allow import failures to pass here. If we
        # got to this point it means that ``import pip._vendor.whatever``
        # failed and so did ``import whatever``. Since we're importing this
        # upfront in an attempt to alias imports, not erroring here will
        # just mean we get a regular import error whenever pip *actually*
        # tries to import one of these modules to use it, which actually
        # gives us a better error message than we would have otherwise
        # gotten.
        print(f"WHEEL_DIR is {print(os.environ.get('WHEEL_DIR', None))}")
        print(e)
    else:
        sys.modules[vendored_name] = sys.modules[modulename]
        base, head = vendored_name.rsplit(".", 1)
        setattr(sys.modules[base], head, sys.modules[modulename])
# If we're operating in a debundled setup, then we want to go ahead and trigger
# the aliasing of our vendored libraries as well as looking for wheels to add
# to our sys.path. This will cause all of this code to be a no-op typically
# however downstream redistributors can enable it in a consistent way across
# all platforms.
if DEBUNDLED:
    # Actually look inside of WHEEL_DIR to find .whl files and add them to the
    # front of our sys.path.
    sys.path[:] = glob.glob(os.path.join(WHEEL_DIR, "*.whl")) + sys.path
    # Actually alias all of our vendored dependencies.
    vendored("appdirs")
    vendored("cachecontrol")
    vendored("colorama")
    vendored("contextlib2")
    vendored("distlib")
    vendored("distro")
    vendored("html5lib")
    vendored("six")
    vendored("six.moves")
    vendored("six.moves.urllib")
    vendored("six.moves.urllib.parse")
    vendored("packaging")
    vendored("packaging.version")
    vendored("packaging.specifiers")
    vendored("pep517")
    vendored("pkg_resources")
    vendored("progress")
    vendored("retrying")
    vendored("requests")
    vendored("requests.exceptions")
    vendored("requests.packages")
    vendored("requests.packages.urllib3")
    vendored("requests.packages.urllib3._collections")
    vendored("requests.packages.urllib3.connection")
    vendored("requests.packages.urllib3.connectionpool")
    vendored("requests.packages.urllib3.contrib")
    vendored("requests.packages.urllib3.contrib.ntlmpool")
    vendored("requests.packages.urllib3.contrib.pyopenssl")
    vendored("requests.packages.urllib3.exceptions")
    vendored("requests.packages.urllib3.fields")
    vendored("requests.packages.urllib3.filepost")
    vendored("requests.packages.urllib3.packages")
    try:
        vendored("requests.packages.urllib3.packages.ordered_dict")
        vendored("requests.packages.urllib3.packages.six")
    except ImportError:
        # Debian already unbundles these from requests.
        pass
    vendored("requests.packages.urllib3.packages.ssl_match_hostname")
    vendored("requests.packages.urllib3.packages.ssl_match_hostname."
             "_implementation")
    vendored("requests.packages.urllib3.poolmanager")
    vendored("requests.packages.urllib3.request")
    vendored("requests.packages.urllib3.response")
    vendored("requests.packages.urllib3.util")
    vendored("requests.packages.urllib3.util.connection")
    vendored("requests.packages.urllib3.util.request")
    vendored("requests.packages.urllib3.util.response")
    vendored("requests.packages.urllib3.util.retry")
    vendored("requests.packages.urllib3.util.ssl_")
    vendored("requests.packages.urllib3.util.timeout")
    vendored("requests.packages.urllib3.util.url")
    vendored("toml")
    vendored("toml.encoder")
    vendored("toml.decoder")
    vendored("urllib3")
Can you run print(WHEEL_DIR)?
In _vendor/__init__.py, you see WHEEL_DIR, right?
Got it.
WHEEL_DIR is /home/ubuntu/.pyenv/versions/3.10.7/share/python-wheels
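For anyone following along, both values can be read without editing _vendor/__init__.py. This is only a minimal sketch: it assumes pip is importable by the interpreter that runs it, and pip_vendor_settings is a made-up helper name, not part of pip. DEBUNDLED and WHEEL_DIR are the module-level names used in the code pasted above.

```python
import importlib


def pip_vendor_settings():
    """Return DEBUNDLED and WHEEL_DIR as seen by this interpreter's pip.

    pip_vendor_settings is a hypothetical helper name, not part of pip.
    """
    try:
        vendor = importlib.import_module("pip._vendor")
    except ImportError:
        # pip is missing, or too broken to import, in this environment.
        return {"importable": False, "debundled": None, "wheel_dir": None}
    return {
        "importable": True,
        # getattr with a default, in case a pip build does not define these names.
        "debundled": getattr(vendor, "DEBUNDLED", None),
        "wheel_dir": getattr(vendor, "WHEEL_DIR", None),
    }


if __name__ == "__main__":
    print(pip_vendor_settings())
```

Running it with the python binary of the affected environment, rather than the system one, shows what that environment's pip believes.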
What is the output of ls /home/ubuntu/.pyenv/versions/3.10.7/share/python-wheels?
That directory doesn't exist. This is the content of the parent directory:
$ ls -laF /home/ubuntu/.pyenv/versions/3.10.7/share
total 12
drwxr-xr-x 3 ubuntu ubuntu 4096 Apr 26 05:01 ./
drwxr-xr-x 7 ubuntu ubuntu 4096 Apr 26 05:02 ../
drwxr-xr-x 3 ubuntu ubuntu 4096 Apr 26 05:01 man/
There is no six. I'm not sure why six is missing.
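The same check can be scripted. The sketch below mirrors the glob pattern from the pasted _vendor/__init__.py (os.path.join(WHEEL_DIR, "*.whl")) and reports which wheels a debundled pip would find. The function names are hypothetical, and matching a project by file-name prefix is a simplifying assumption.

```python
import glob
import os


def find_debundled_wheels(wheel_dir):
    """List the .whl files that a debundled pip would prepend to sys.path."""
    # Same pattern as the pasted code; glob returns an empty list when the
    # directory does not exist, so a missing WHEEL_DIR is not an error here.
    return sorted(glob.glob(os.path.join(wheel_dir, "*.whl")))


def has_wheel_for(wheel_dir, project):
    """True if some wheel file name in wheel_dir starts with '<project>-'."""
    prefix = project + "-"
    return any(
        os.path.basename(path).startswith(prefix)
        for path in find_debundled_wheels(wheel_dir)
    )


if __name__ == "__main__":
    wheel_dir = "/home/ubuntu/.pyenv/versions/3.10.7/share/python-wheels"
    print(find_debundled_wheels(wheel_dir))
    print(has_wheel_for(wheel_dir, "six"))
```

With a missing directory both calls come back empty or False, which would match the situation described here: DEBUNDLED is on, but there is nothing to load six from.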
Maybe reinstalling pyenv might help.
I have done that multiple times, including today. Each time I started a new compute instance in the cloud (Lambda Cloud) and then reinstalled pyenv and mlflow.
This looks like a pyenv issue. Can you file an issue in https://github.com/pyenv/pyenv?
Seems to me it might be related just to virtualenv. On the setup that reproduces the issue, with no pyenv installed, I already get this: in spite of the fact that the pip installation I am running contains a _vendor/six.py, the virtualenv I made with virtualenv doesn't contain it.
ubuntu@192-9-133-158:~$ which python
/usr/bin/python
ubuntu@192-9-133-158:~$ which pip
/home/ubuntu/.local/bin/pip
ubuntu@192-9-133-158:~$ which virtualenv
/usr/bin/virtualenv
ubuntu@192-9-133-158:~$ python --version
Python 3.8.10
ubuntu@192-9-133-158:~$ pip --version
pip 22.3 from /home/ubuntu/.local/lib/python3.8/site-packages/pip (python 3.8)
ubuntu@192-9-133-158:~$ ll /home/ubuntu/.local/lib/python3.8/site-packages/pip/_vendor/six.py
-rw-rw-r-- 1 ubuntu ubuntu 34549 Oct 26 22:37 /home/ubuntu/.local/lib/python3.8/site-packages/pip/_vendor/six.py
ubuntu@192-9-133-158:~$
ubuntu@192-9-133-158:~$ virtualenv --python /usr/bin/python ~/enva
created virtual environment CPython3.8.10.final.0-64 in 166ms
creator CPython3Posix(dest=/home/ubuntu/enva, clear=False, global=False)
seeder FromAppData(download=False, pip=latest, setuptools=latest, wheel=latest, pkg_resources=latest, via=copy, app_data_dir=/home/ubuntu/.local/share/virtualenv/seed-app-data/v1.0.1.debian.1)
activators BashActivator,CShellActivator,FishActivator,PowerShellActivator,PythonActivator,XonshActivator
ubuntu@192-9-133-158:~$ . ~/enva/bin/activate
(enva) ubuntu@192-9-133-158:~$ which pip
/home/ubuntu/enva/bin/pip
(enva) ubuntu@192-9-133-158:~$ which python
/home/ubuntu/enva/bin/python
(enva) ubuntu@192-9-133-158:~$ pip --version
pip 20.0.2 from /home/ubuntu/enva/lib/python3.8/site-packages/pip (python 3.8)
(enva) ubuntu@192-9-133-158:~$ ll /home/ubuntu/enva/lib/python3.8/site-packages/pip/_vendor
total 20
drwxrwxr-x 3 ubuntu ubuntu 4096 Apr 26 11:04 ./
drwxrwxr-x 5 ubuntu ubuntu 4096 Apr 26 11:04 ../
-rw-rw-r-- 1 ubuntu ubuntu 4975 Apr 26 11:02 __init__.py
drwxrwxr-x 2 ubuntu ubuntu 4096 Apr 26 11:04 __pycache__/
The line
seeder FromAppData(download=False, pip=latest, setuptools=latest, wheel=latest, pkg_resources=latest, via=copy, app_data_dir=/home/ubuntu/.local/share/virtualenv/seed-app-data/v1.0.1.debian.1)
references an app_data_dir that didn't exist beforehand; virtualenv created it itself.
On the PC where mlflow works correctly I get this instead:
⮕ virtualenv --python /usr/bin/python3 ~/enva
created virtual environment CPython3.10.6.final.0-64 in 69ms
creator CPython3Posix(dest=/home/fanta/enva, clear=False, no_vcs_ignore=False, global=False)
seeder FromAppData(download=False, pip=bundle, setuptools=bundle, wheel=bundle, via=copy, app_data_dir=/home/fanta/.local/share/virtualenv)
added seed packages: pip==22.0.2, setuptools==59.6.0, wheel==0.37.1
activators BashActivator,CShellActivator,FishActivator,NushellActivator,PowerShellActivator,PythonActivator
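To make the difference between the two seeder lines easier to see, here is a small sketch that parses each line into key/value pairs and lists the keys that differ. It only understands the "seeder Name(key=value, ...)" shape printed above and assumes no value contains a comma; parse_seeder_line and diff_options are illustrative names, not virtualenv APIs.

```python
import re


def parse_seeder_line(line):
    """Parse 'seeder Name(key=value, ...)' into (name, {key: value})."""
    match = re.match(r"\s*seeder\s+(\w+)\((.*)\)\s*$", line)
    if match is None:
        raise ValueError("not a seeder line: %r" % line)
    name, body = match.groups()
    options = {}
    for item in body.split(","):
        key, sep, value = item.strip().partition("=")
        if sep:  # skip fragments without '='
            options[key] = value
    return name, options


def diff_options(first, second):
    """Return {key: (first_value, second_value)} for keys whose values differ."""
    keys = sorted(set(first) | set(second))
    return {
        key: (first.get(key), second.get(key))
        for key in keys
        if first.get(key) != second.get(key)
    }


if __name__ == "__main__":
    failing = ("seeder FromAppData(download=False, pip=latest, setuptools=latest, "
               "wheel=latest, pkg_resources=latest, via=copy, "
               "app_data_dir=/home/ubuntu/.local/share/virtualenv/seed-app-data/v1.0.1.debian.1)")
    working = ("seeder FromAppData(download=False, pip=bundle, setuptools=bundle, "
               "wheel=bundle, via=copy, app_data_dir=/home/fanta/.local/share/virtualenv)")
    for key, pair in diff_options(parse_seeder_line(failing)[1],
                                  parse_seeder_line(working)[1]).items():
        print(key, pair)
```

On the two lines from this thread, the keys that differ are app_data_dir, pip, pkg_resources, setuptools and wheel; download and via are the same.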
Got it, then can you report that in https://github.com/pypa/virtualenv?
Maybe something is wrong with Ubuntu instances that Lambda Cloud provides. Can you reproduce this issue using Docker?
Issues Policy acknowledgement
Willingness to contribute
No. I cannot contribute a bug fix at this time.
MLflow version
System information
Describe the problem
When I try to run an MLflow project based on pyenv+virtualenv, I get the following error and the run stops:
Steps to reproduce with one of the MLflow examples:
mlflow/examples/diviner
mlflow run .
I think the error may be related to pip coming without a six module bundled (see https://bnikolic.co.uk/blog/python/pip/2022/02/21/vendored-six.html ). The issue is that the pip I have installed does come bundled with six, and I get the error only when running an MLflow project.
My system-wide pip does come with a six.py, located in:
Note that if I run the project with
mlflow run . --env-manager local
then the project runs OK. Also, I can make a virtualenv myself with pyenv virtualenv from the shell, and that works as expected, e.g.
I have a different system running Ubuntu 22.04, and on that I cannot reproduce the problem.
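A quick way to test the "pip without a bundled six" theory against a given environment is to look for six.py under its pip/_vendor directory. The sketch below assumes the POSIX virtualenv layout seen in the paths in this thread (lib/python*/site-packages/pip/_vendor); both function names are hypothetical.

```python
import glob
import os


def pip_vendor_dirs(env_dir):
    """Find pip/_vendor directories under a virtualenv-style layout."""
    pattern = os.path.join(
        env_dir, "lib", "python*", "site-packages", "pip", "_vendor"
    )
    return sorted(path for path in glob.glob(pattern) if os.path.isdir(path))


def has_vendored_six(env_dir):
    """True if any pip/_vendor under env_dir contains a six.py file."""
    return any(
        os.path.isfile(os.path.join(vendor_dir, "six.py"))
        for vendor_dir in pip_vendor_dirs(env_dir)
    )


if __name__ == "__main__":
    import sys

    # Pass the environment directory, for example one under ~/.mlflow/envs.
    env = sys.argv[1] if len(sys.argv) > 1 else sys.prefix
    print(pip_vendor_dirs(env))
    print(has_vendored_six(env))
```

A False result only says that six.py is not vendored in that environment; a debundled pip may still load six from a wheel directory instead.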
Tracking information
Code to reproduce issue
Stack trace
Other info / logs
What component(s) does this bug affect?
area/artifacts: Artifact stores and artifact logging
area/build: Build and test infrastructure for MLflow
area/docs: MLflow documentation pages
area/examples: Example code
area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
area/models: MLmodel format, model serialization/deserialization, flavors
area/recipes: Recipes, Recipe APIs, Recipe configs, Recipe Templates
area/projects: MLproject format, project running backends
area/scoring: MLflow Model server, model deployment tools, Spark UDFs
area/server-infra: MLflow Tracking server backend
area/tracking: Tracking Service, tracking client APIs, autologging
What interface(s) does this bug affect?
area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
area/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
area/windows: Windows support
What language(s) does this bug affect?
language/r: R APIs and clients
language/java: Java APIs and clients
language/new: Proposals for new client languages
What integration(s) does this bug affect?
integrations/azure: Azure and Azure ML integrations
integrations/sagemaker: SageMaker integrations
integrations/databricks: Databricks integrations