mlflow / mlflow

Open source platform for the machine learning lifecycle
https://mlflow.org
Apache License 2.0

[BUG] Getting "No module named 'pip._vendor.six'" error when running MLFlow example, or any MLFlow project with virtualenv #8306

Closed fantauzzi closed 1 year ago

fantauzzi commented 1 year ago

Issues Policy acknowledgement

Willingness to contribute

No. I cannot contribute a bug fix at this time.

MLflow version

System information

Describe the problem

When I try to run an MLflow project with pyenv + virtualenv, I get the following error and the run stops:

ModuleNotFoundError: No module named 'pip._vendor.six'

Steps to reproduce with one of MLFlow examples:

I think the error may be related to pip coming without a bundled six module (see https://bnikolic.co.uk/blog/python/pip/2022/02/21/vendored-six.html ). The problem is that the pip I have installed does come bundled with six, and I get the error only when running an MLflow project.

My system-wide pip does come with a six.py, located in:

/home/ubuntu/.local/lib/python3.8/site-packages/pip/_vendor/six.py
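
The same check can be scripted for any pip installation. A minimal sketch, assuming only that a bundled pip keeps its copy of six at pip/_vendor/six.py (as in the path above); the helper name has_vendored_six is hypothetical, and the paths in the usage block are the ones from this report:

```python
from pathlib import Path


def has_vendored_six(pip_package_dir):
    """Return True if the pip package in this directory ships its own six.py."""
    return (Path(pip_package_dir) / "_vendor" / "six.py").is_file()


if __name__ == "__main__":
    # Paths taken from this report; adjust them for your own system.
    for pip_dir in (
        "/home/ubuntu/.local/lib/python3.8/site-packages/pip",
        "/home/ubuntu/.mlflow/envs/"
        "mlflow-dc1faba5c996e345dd1cc40210944f3b69389fb6"
        "/lib/python3.8/site-packages/pip",
    ):
        print(pip_dir, has_vendored_six(pip_dir))
```

Running it against both the user-level pip and the pip inside the MLflow-created environment shows at a glance which of the two lacks the vendored module.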

Note that if I run the project with mlflow run . --env-manager local then the project runs OK. Also, I can make a virtualenv myself with pyenv virtualenv from the shell, that works as expected, e.g.

$ pyenv install 3.10.7
Downloading Python-3.10.7.tar.xz...
-> https://www.python.org/ftp/python/3.10.7/Python-3.10.7.tar.xz
Installing Python-3.10.7...
Installed Python-3.10.7 to /home/ubuntu/.pyenv/versions/3.10.7
$ pyenv virtualenv 3.10.7 python3.10.7
$ 

I have a different system running Ubuntu 22.04, and on that I cannot reproduce the problem.

Code to reproduce issue

https://github.com/mlflow/mlflow/tree/master/examples/diviner

Stack trace

$ mlflow run .
/home/ubuntu/.local/lib/python3.8/site-packages/pandas/core/computation/expressions.py:20: UserWarning: Pandas requires version '2.7.3' or newer of 'numexpr' (version '2.7.1' currently installed).
  from pandas.core.computation.check import NUMEXPR_INSTALLED
2023/04/23 19:29:54 INFO mlflow.utils.virtualenv: Installing python 3.8.16 if it does not exist
Downloading Python-3.8.16.tar.xz...
-> https://www.python.org/ftp/python/3.8.16/Python-3.8.16.tar.xz
Installing Python-3.8.16...
Installed Python-3.8.16 to /home/ubuntu/.pyenv/versions/3.8.16
2023/04/23 19:30:35 INFO mlflow.utils.virtualenv: Creating a new environment in /home/ubuntu/.mlflow/envs/mlflow-dc1faba5c996e345dd1cc40210944f3b69389fb6 with /home/ubuntu/.pyenv/versions/3.8.16/bin/python
created virtual environment CPython3.8.16.final.0-64 in 172ms
  creator CPython3Posix(dest=/home/ubuntu/.mlflow/envs/mlflow-dc1faba5c996e345dd1cc40210944f3b69389fb6, clear=False, global=False)
  seeder FromAppData(download=False, pip=latest, setuptools=latest, wheel=latest, pkg_resources=latest, via=copy, app_data_dir=/home/ubuntu/.local/share/virtualenv/seed-app-data/v1.0.1.debian.1)
  activators BashActivator,CShellActivator,FishActivator,PowerShellActivator,PythonActivator,XonshActivator
2023/04/23 19:30:35 INFO mlflow.utils.virtualenv: Installing dependencies
Traceback (most recent call last):
  File "/home/ubuntu/.pyenv/versions/3.8.16/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/ubuntu/.pyenv/versions/3.8.16/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/ubuntu/.mlflow/envs/mlflow-dc1faba5c996e345dd1cc40210944f3b69389fb6/lib/python3.8/site-packages/pip/__main__.py", line 16, in <module>
    from pip._internal.cli.main import main as _main  # isort:skip # noqa
  File "/home/ubuntu/.mlflow/envs/mlflow-dc1faba5c996e345dd1cc40210944f3b69389fb6/lib/python3.8/site-packages/pip/_internal/cli/main.py", line 10, in <module>
    from pip._internal.cli.autocompletion import autocomplete
  File "/home/ubuntu/.mlflow/envs/mlflow-dc1faba5c996e345dd1cc40210944f3b69389fb6/lib/python3.8/site-packages/pip/_internal/cli/autocompletion.py", line 9, in <module>
    from pip._internal.cli.main_parser import create_main_parser
  File "/home/ubuntu/.mlflow/envs/mlflow-dc1faba5c996e345dd1cc40210944f3b69389fb6/lib/python3.8/site-packages/pip/_internal/cli/main_parser.py", line 7, in <module>
    from pip._internal.cli import cmdoptions
  File "/home/ubuntu/.mlflow/envs/mlflow-dc1faba5c996e345dd1cc40210944f3b69389fb6/lib/python3.8/site-packages/pip/_internal/cli/cmdoptions.py", line 24, in <module>
    from pip._internal.exceptions import CommandError
  File "/home/ubuntu/.mlflow/envs/mlflow-dc1faba5c996e345dd1cc40210944f3b69389fb6/lib/python3.8/site-packages/pip/_internal/exceptions.py", line 10, in <module>
    from pip._vendor.six import iteritems
ModuleNotFoundError: No module named 'pip._vendor.six'
Traceback (most recent call last):
  File "/home/ubuntu/.local/bin/mlflow", line 8, in <module>
    sys.exit(cli())
  File "/home/ubuntu/.local/lib/python3.8/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/ubuntu/.local/lib/python3.8/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/mlflow/cli.py", line 202, in run
    projects.run(
  File "/home/ubuntu/.local/lib/python3.8/site-packages/mlflow/projects/__init__.py", line 338, in run
    submitted_run_obj = _run(
  File "/home/ubuntu/.local/lib/python3.8/site-packages/mlflow/projects/__init__.py", line 105, in _run
    submitted_run = backend.run(
  File "/home/ubuntu/.local/lib/python3.8/site-packages/mlflow/projects/backend/local.py", line 167, in run
    activate_cmd = _create_virtualenv(
  File "/home/ubuntu/.local/lib/python3.8/site-packages/mlflow/utils/virtualenv.py", line 269, in _create_virtualenv
    _exec_cmd(cmd, capture_output=capture_output, cwd=tmp_model_dir, extra_env=extra_env)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/mlflow/utils/process.py", line 117, in _exec_cmd
    raise ShellCommandException.from_completed_process(comp_process)
mlflow.utils.process.ShellCommandException: Non-zero exit code: 1
Command: ['bash', '-c', 'source /home/ubuntu/.mlflow/envs/mlflow-dc1faba5c996e345dd1cc40210944f3b69389fb6/bin/activate && python -m pip install --quiet -r requirements.2ae8efaef9184961a2be2bf10f95fb98.txt']
$

Other info / logs

N/A

harupy commented 1 year ago

@fantauzzi Does /home/ubuntu/.mlflow/envs/mlflow-dc1faba5c996e345dd1cc40210944f3b69389fb6/lib/python3.8/site-packages/pip/_vendor/six.py exist?

harupy commented 1 year ago

https://stackoverflow.com/a/71542588 might be worth a try

harupy commented 1 year ago

https://github.com/pypa/pipenv/issues/4804 reports the same issue, might be worth a look

fantauzzi commented 1 year ago

@fantauzzi Does /home/ubuntu/.mlflow/envs/mlflow-dc1faba5c996e345dd1cc40210944f3b69389fb6/lib/python3.8/site-packages/pip/_vendor/six.py exist?

I had checked that, and it didn't. I don't have that set-up anymore, so I have reproduced the issue in a new set-up, and this is the content of the relevant directory in the new set-up after running mlflow run . and getting the error:

$ ll /home/ubuntu/.mlflow/envs/mlflow-dc1faba5c996e345dd1cc40210944f3b69389fb6/lib/python3.8/site-packages/pip/_vendor/
total 20
drwxrwxr-x 3 ubuntu ubuntu 4096 Apr 24 06:00 ./
drwxrwxr-x 5 ubuntu ubuntu 4096 Apr 24 06:00 ../
-rw-rw-r-- 1 ubuntu ubuntu 4975 Apr 24 06:00 __init__.py
drwxrwxr-x 2 ubuntu ubuntu 4096 Apr 24 06:00 __pycache__/

It looks like the creation of the virtualenv got interrupted by the error, and therefore didn't complete. If I try to activate it from shell, I get this:

$ pyenv activate mlflow-dc1faba5c996e345dd1cc40210944f3b69389fb6
pyenv-virtualenv: version `mlflow-dc1faba5c996e345dd1cc40210944f3b69389fb6' is not a virtualenv

Also, it is not listed as a virtualenv:

$ pyenv virtualenvs 
$ 
harupy commented 1 year ago

why is the vendor directory almost empty?

harupy commented 1 year ago

Can you run:

/home/ubuntu/.mlflow/envs/mlflow-dc1faba5c996e345dd1cc40210944f3b69389fb6/lib/python3.8/site-packages/pip/_vendor/__init__.py

and check the contents of __init__.py?

fantauzzi commented 1 year ago

why is the vendor directory almost empty?

I don't know.

The pip installation I have in ~/.local/lib/python3.8/site-packages/pip contains a number of files in its _vendor, including six.py.

But I noticed the system-wide /usr/lib/python3/dist-packages/pip has an almost empty _vendor, with no six.py. I have therefore reinstalled the system-wide pip, based on the instructions at one of the links you posted, with

curl -sS  https://bootstrap.pypa.io/pip/3.6/get-pip.py | sudo pypy3

Now also the system-wide pip has a crowded _vendor directory, containing six.py.

Unfortunately the error with mlflow run . persists.

Where does mlflow run . take its pip from? It looks to me like it uses neither the system-wide pip nor the one I have in ~/.local.
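
One way to answer that question for a given interpreter is to ask it where it would load pip from. The traceback above already shows pip being imported from the new environment's own site-packages; the sketch below generalizes that check. It is only an illustration: the helper name pip_location is hypothetical, and it relies on importlib.util.find_spec, which locates a top-level package without importing it and so still works when pip itself is broken.

```python
import subprocess
import sys

# Snippet executed inside the target interpreter. find_spec only locates the
# top-level package; it does not run pip's code.
_PROBE = (
    "import importlib.util as u; "
    "s = u.find_spec('pip'); "
    "print(s.origin if s and s.origin else '')"
)


def pip_location(python_exe):
    """Return the path of pip/__init__.py as seen by python_exe, or None."""
    result = subprocess.run(
        [python_exe, "-c", _PROBE], capture_output=True, text=True, check=True
    )
    return result.stdout.strip() or None


if __name__ == "__main__":
    # Replace sys.executable with <env>/bin/python to inspect a virtualenv.
    print(pip_location(sys.executable))
```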

fantauzzi commented 1 year ago

Can you run:

/home/ubuntu/.mlflow/envs/mlflow-dc1faba5c996e345dd1cc40210944f3b69389fb6/lib/python3.8/site-packages/pip/_vendor/__init__.py

and check the contents of __init__.py?

I have run __init__.py in _vendor and got no output

$ python __init__.py 
$ 

This is the content of the file now:

"""
pip._vendor is for vendoring dependencies of pip to prevent needing pip to
depend on something external.

Files inside of pip._vendor should be considered immutable and should only be
updated to versions from upstream.
"""
from __future__ import absolute_import

import glob
import os.path
import sys

# Downstream redistributors which have debundled our dependencies should also
# patch this value to be true. This will trigger the additional patching
# to cause things like "six" to be available as pip.
DEBUNDLED = True

# By default, look in this directory for a bunch of .whl files which we will
# add to the beginning of sys.path before attempting to import anything. This
# is done to support downstream re-distributors like Debian and Fedora who
# wish to create their own Wheels for our dependencies to aid in debundling.
prefix = getattr(sys, "base_prefix", sys.prefix)
if prefix.startswith('/usr/lib/pypy'):
    prefix = '/usr'
WHEEL_DIR = os.path.abspath(os.path.join(prefix, 'share', 'python-wheels'))

# Define a small helper function to alias our vendored modules to the real ones
# if the vendored ones do not exist. This idea of this was taken from
# https://github.com/kennethreitz/requests/pull/2567.
def vendored(modulename):
    vendored_name = "{0}.{1}".format(__name__, modulename)

    try:
        __import__(modulename, globals(), locals(), level=0)
    except ImportError:
        # We can just silently allow import failures to pass here. If we
        # got to this point it means that ``import pip._vendor.whatever``
        # failed and so did ``import whatever``. Since we're importing this
        # upfront in an attempt to alias imports, not erroring here will
        # just mean we get a regular import error whenever pip *actually*
        # tries to import one of these modules to use it, which actually
        # gives us a better error message than we would have otherwise
        # gotten.
        pass
    else:
        sys.modules[vendored_name] = sys.modules[modulename]
        base, head = vendored_name.rsplit(".", 1)
        setattr(sys.modules[base], head, sys.modules[modulename])

# If we're operating in a debundled setup, then we want to go ahead and trigger
# the aliasing of our vendored libraries as well as looking for wheels to add
# to our sys.path. This will cause all of this code to be a no-op typically
# however downstream redistributors can enable it in a consistent way across
# all platforms.
if DEBUNDLED:
    # Actually look inside of WHEEL_DIR to find .whl files and add them to the
    # front of our sys.path.
    sys.path[:] = glob.glob(os.path.join(WHEEL_DIR, "*.whl")) + sys.path

    # Actually alias all of our vendored dependencies.
    vendored("appdirs")
    vendored("cachecontrol")
    vendored("colorama")
    vendored("contextlib2")
    vendored("distlib")
    vendored("distro")
    vendored("html5lib")
    vendored("six")
    vendored("six.moves")
    vendored("six.moves.urllib")
    vendored("six.moves.urllib.parse")
    vendored("packaging")
    vendored("packaging.version")
    vendored("packaging.specifiers")
    vendored("pep517")
    vendored("pkg_resources")
    vendored("progress")
    vendored("retrying")
    vendored("requests")
    vendored("requests.exceptions")
    vendored("requests.packages")
    vendored("requests.packages.urllib3")
    vendored("requests.packages.urllib3._collections")
    vendored("requests.packages.urllib3.connection")
    vendored("requests.packages.urllib3.connectionpool")
    vendored("requests.packages.urllib3.contrib")
    vendored("requests.packages.urllib3.contrib.ntlmpool")
    vendored("requests.packages.urllib3.contrib.pyopenssl")
    vendored("requests.packages.urllib3.exceptions")
    vendored("requests.packages.urllib3.fields")
    vendored("requests.packages.urllib3.filepost")
    vendored("requests.packages.urllib3.packages")
    try:
        vendored("requests.packages.urllib3.packages.ordered_dict")
        vendored("requests.packages.urllib3.packages.six")
    except ImportError:
        # Debian already unbundles these from requests.
        pass
    vendored("requests.packages.urllib3.packages.ssl_match_hostname")
    vendored("requests.packages.urllib3.packages.ssl_match_hostname."
             "_implementation")
    vendored("requests.packages.urllib3.poolmanager")
    vendored("requests.packages.urllib3.request")
    vendored("requests.packages.urllib3.response")
    vendored("requests.packages.urllib3.util")
    vendored("requests.packages.urllib3.util.connection")
    vendored("requests.packages.urllib3.util.request")
    vendored("requests.packages.urllib3.util.response")
    vendored("requests.packages.urllib3.util.retry")
    vendored("requests.packages.urllib3.util.ssl_")
    vendored("requests.packages.urllib3.util.timeout")
    vendored("requests.packages.urllib3.util.url")
    vendored("toml")
    vendored("toml.encoder")
    vendored("toml.decoder")
    vendored("urllib3")
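
The file above looks for wheels in a directory derived from sys.base_prefix. The sketch below mirrors that computation so it can be evaluated for different interpreters. The function name debundled_wheel_dir is hypothetical, and the assumption that the pyenv interpreter reports its install directory as base_prefix has not been confirmed in this thread.

```python
import os.path


def debundled_wheel_dir(base_prefix):
    """Mirror the WHEEL_DIR computation from the __init__.py above."""
    prefix = base_prefix
    if prefix.startswith("/usr/lib/pypy"):
        prefix = "/usr"
    return os.path.abspath(os.path.join(prefix, "share", "python-wheels"))


if __name__ == "__main__":
    # A distribution-provided interpreter typically has base_prefix == "/usr".
    print(debundled_wheel_dir("/usr"))
    # Assumed base_prefix of the pyenv interpreter used by mlflow run here.
    print(debundled_wheel_dir("/home/ubuntu/.pyenv/versions/3.8.16"))
```

If the second directory does not exist, the glob in the DEBUNDLED branch finds no wheels, vendored("six") fails silently, and pip._vendor.six is never defined. That would be consistent with the error reported here, but it is a hypothesis rather than a confirmed diagnosis.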
fantauzzi commented 1 year ago

I have been working on the hypothesis that the pip module used by mlflow run . doesn't come bundled with a six module. That is also the root cause of the No module named 'pip._vendor.six' error in the StackOverflow posts you linked.

So far I am not able to tell which pip module mlflow run . uses to make its own virtualenv.

The pip I have under ~/.local seems to be bundled with a six module, as its _vendor contains a six.py. The system-wide pip in /usr/local/bin/pip didn't seem to be bundled with a six module, as its _vendor was almost empty and had no six.py, so I have re-installed it, and now it does have a populated _vendor that includes a six.py.

The error with mlflow run . persists.

harupy commented 1 year ago

@fantauzzi Can you remove /home/ubuntu/.mlflow/envs/mlflow-dc1faba5c996e345dd1cc40210944f3b69389fb6 and run mlflow run . again?

The system-wide pip in /usr/local/bin/pip didn't seem to be bundled with a six module

Maybe virtualenv picked up this unbundled pip and hit the error?

harupy commented 1 year ago

@fantauzzi Feel free to reopen this issue if re-creating the virtual environment with the bundled pip doesn't work.

fantauzzi commented 1 year ago

@fantauzzi Can you remove /home/ubuntu/.mlflow/envs/mlflow-dc1faba5c996e345dd1cc40210944f3b69389fb6 and run mlflow run . again?

The system-wide pip in /usr/local/bin/pip didn't seem to be bundled with a six module

Maybe virtualenv picked up this unbundled pip and hit the error?

After removing that and running mlflow run . again, I get the same error. In fact, to reproduce the error, I must remove the half-baked virtualenv from /home/ubuntu/.mlflow/envs before running mlflow run . again.

If instead I run mlflow run . again with the half-baked virtualenv still there, mlflow tries to use it and fails with a different error message (ModuleNotFoundError: No module named 'pmdarima'). Here is an example; scroll down to see the second attempt to run mlflow run .:

(python3.10.7) ubuntu@192-9-242-164:~/mlflow/examples/diviner$ mlflow run .
2023/04/26 05:07:31 INFO mlflow.utils.virtualenv: Installing python 3.8.16 if it does not exist
Downloading Python-3.8.16.tar.xz...
-> https://www.python.org/ftp/python/3.8.16/Python-3.8.16.tar.xz
Installing Python-3.8.16...
Installed Python-3.8.16 to /home/ubuntu/.pyenv/versions/3.8.16
2023/04/26 05:08:12 INFO mlflow.utils.virtualenv: Creating a new environment in /home/ubuntu/.mlflow/envs/mlflow-dc1faba5c996e345dd1cc40210944f3b69389fb6 with /home/ubuntu/.pyenv/versions/3.8.16/bin/python
created virtual environment CPython3.8.16.final.0-64 in 213ms
  creator CPython3Posix(dest=/home/ubuntu/.mlflow/envs/mlflow-dc1faba5c996e345dd1cc40210944f3b69389fb6, clear=False, global=False)
  seeder FromAppData(download=False, pip=latest, setuptools=latest, wheel=latest, pkg_resources=latest, via=copy, app_data_dir=/home/ubuntu/.local/share/virtualenv/seed-app-data/v1.0.1.debian.1)
  activators BashActivator,CShellActivator,FishActivator,PowerShellActivator,PythonActivator,XonshActivator
2023/04/26 05:08:13 INFO mlflow.utils.virtualenv: Installing dependencies
Traceback (most recent call last):
  File "/home/ubuntu/.pyenv/versions/3.8.16/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/ubuntu/.pyenv/versions/3.8.16/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/ubuntu/.mlflow/envs/mlflow-dc1faba5c996e345dd1cc40210944f3b69389fb6/lib/python3.8/site-packages/pip/__main__.py", line 16, in <module>
    from pip._internal.cli.main import main as _main  # isort:skip # noqa
  File "/home/ubuntu/.mlflow/envs/mlflow-dc1faba5c996e345dd1cc40210944f3b69389fb6/lib/python3.8/site-packages/pip/_internal/cli/main.py", line 10, in <module>
    from pip._internal.cli.autocompletion import autocomplete
  File "/home/ubuntu/.mlflow/envs/mlflow-dc1faba5c996e345dd1cc40210944f3b69389fb6/lib/python3.8/site-packages/pip/_internal/cli/autocompletion.py", line 9, in <module>
    from pip._internal.cli.main_parser import create_main_parser
  File "/home/ubuntu/.mlflow/envs/mlflow-dc1faba5c996e345dd1cc40210944f3b69389fb6/lib/python3.8/site-packages/pip/_internal/cli/main_parser.py", line 7, in <module>
    from pip._internal.cli import cmdoptions
  File "/home/ubuntu/.mlflow/envs/mlflow-dc1faba5c996e345dd1cc40210944f3b69389fb6/lib/python3.8/site-packages/pip/_internal/cli/cmdoptions.py", line 24, in <module>
    from pip._internal.exceptions import CommandError
  File "/home/ubuntu/.mlflow/envs/mlflow-dc1faba5c996e345dd1cc40210944f3b69389fb6/lib/python3.8/site-packages/pip/_internal/exceptions.py", line 10, in <module>
    from pip._vendor.six import iteritems
ModuleNotFoundError: No module named 'pip._vendor.six'
Traceback (most recent call last):
  File "/home/ubuntu/.pyenv/versions/python3.10.7/bin/mlflow", line 8, in <module>
    sys.exit(cli())
  File "/home/ubuntu/.pyenv/versions/3.10.7/envs/python3.10.7/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/home/ubuntu/.pyenv/versions/3.10.7/envs/python3.10.7/lib/python3.10/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/home/ubuntu/.pyenv/versions/3.10.7/envs/python3.10.7/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/ubuntu/.pyenv/versions/3.10.7/envs/python3.10.7/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/ubuntu/.pyenv/versions/3.10.7/envs/python3.10.7/lib/python3.10/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/home/ubuntu/.pyenv/versions/3.10.7/envs/python3.10.7/lib/python3.10/site-packages/mlflow/cli.py", line 202, in run
    projects.run(
  File "/home/ubuntu/.pyenv/versions/3.10.7/envs/python3.10.7/lib/python3.10/site-packages/mlflow/projects/__init__.py", line 338, in run
    submitted_run_obj = _run(
  File "/home/ubuntu/.pyenv/versions/3.10.7/envs/python3.10.7/lib/python3.10/site-packages/mlflow/projects/__init__.py", line 105, in _run
    submitted_run = backend.run(
  File "/home/ubuntu/.pyenv/versions/3.10.7/envs/python3.10.7/lib/python3.10/site-packages/mlflow/projects/backend/local.py", line 167, in run
    activate_cmd = _create_virtualenv(
  File "/home/ubuntu/.pyenv/versions/3.10.7/envs/python3.10.7/lib/python3.10/site-packages/mlflow/utils/virtualenv.py", line 269, in _create_virtualenv
    _exec_cmd(cmd, capture_output=capture_output, cwd=tmp_model_dir, extra_env=extra_env)
  File "/home/ubuntu/.pyenv/versions/3.10.7/envs/python3.10.7/lib/python3.10/site-packages/mlflow/utils/process.py", line 117, in _exec_cmd
    raise ShellCommandException.from_completed_process(comp_process)
mlflow.utils.process.ShellCommandException: Non-zero exit code: 1
Command: ['bash', '-c', 'source /home/ubuntu/.mlflow/envs/mlflow-dc1faba5c996e345dd1cc40210944f3b69389fb6/bin/activate && python -m pip install --quiet -r requirements.1ff80735675c49ec986a3849caf7600f.txt']
(python3.10.7) ubuntu@192-9-242-164:~/mlflow/examples/diviner$ mlflow run .
2023/04/26 05:09:52 INFO mlflow.utils.virtualenv: Installing python 3.8.16 if it does not exist
2023/04/26 05:09:52 INFO mlflow.utils.virtualenv: Environment /home/ubuntu/.mlflow/envs/mlflow-dc1faba5c996e345dd1cc40210944f3b69389fb6 already exists
2023/04/26 05:09:52 INFO mlflow.projects.utils: === Created directory /tmp/tmpg3aq2q9y for downloading remote URIs passed to arguments of type 'path' ===
2023/04/26 05:09:52 INFO mlflow.projects.backend.local: === Running command 'source /home/ubuntu/.mlflow/envs/mlflow-dc1faba5c996e345dd1cc40210944f3b69389fb6/bin/activate && python train.py' in run with ID '17ad76248b7b4f699d2b67762e9c1cb1' === 
Traceback (most recent call last):
  File "train.py", line 2, in <module>
    from pmdarima import datasets
ModuleNotFoundError: No module named 'pmdarima'
2023/04/26 05:09:52 ERROR mlflow.cli: === Run (ID '17ad76248b7b4f699d2b67762e9c1cb1') failed ===
(python3.10.7) ubuntu@192-9-242-164:~/mlflow/examples/diviner$ 
fantauzzi commented 1 year ago

@fantauzzi Feel free to reopen this issue if re-creating the virtual environment with the bundled pip doesn't work.

I don't think I am allowed, as a collaborator closed it.

harupy commented 1 year ago

@fantauzzi I ran mlflow run . and got the following:

> mlflow run .
2023/04/26 14:39:36 INFO mlflow.utils.virtualenv: Installing python 3.8.16 if it does not exist
2023/04/26 14:39:36 INFO mlflow.utils.virtualenv: Creating a new environment in /home/haru/.mlflow/envs/mlflow-0c30c75c1190ace1070fa325506dbb84f690a784 with /home/haru/.pyenv/versions/3.8.16/bin/python
created virtual environment CPython3.8.16.final.0-64 in 205ms
  creator CPython3Posix(dest=/home/haru/.mlflow/envs/mlflow-0c30c75c1190ace1070fa325506dbb84f690a784, clear=False, no_vcs_ignore=False, global=False)
  seeder FromAppData(download=False, pip=bundle, setuptools=bundle, wheel=bundle, via=copy, app_data_dir=/home/haru/.local/share/virtualenv)
  ^ 👆 is different
    added seed packages: pip==23.0.1, setuptools==67.6.1, wheel==0.40.0
harupy commented 1 year ago

Can you try removing /home/ubuntu/.local/share/virtualenv?

fantauzzi commented 1 year ago

Because I can reproduce the error in a cloud setup with Ubuntu 20.04, but not on my PC with Ubuntu 22.04, I have compared the output of mlflow run . between the two and spotted an interesting difference. I paste the relevant output here and then outline the difference.

Here is the output on the setup that reproduces the error:

$ mlflow run .
2023/04/26 05:07:31 INFO mlflow.utils.virtualenv: Installing python 3.8.16 if it does not exist
Downloading Python-3.8.16.tar.xz...
-> https://www.python.org/ftp/python/3.8.16/Python-3.8.16.tar.xz
Installing Python-3.8.16...
Installed Python-3.8.16 to /home/ubuntu/.pyenv/versions/3.8.16
2023/04/26 05:08:12 INFO mlflow.utils.virtualenv: Creating a new environment in /home/ubuntu/.mlflow/envs/mlflow-dc1faba5c996e345dd1cc40210944f3b69389fb6 with /home/ubuntu/.pyenv/versions/3.8.16/bin/python
created virtual environment CPython3.8.16.final.0-64 in 213ms
  creator CPython3Posix(dest=/home/ubuntu/.mlflow/envs/mlflow-dc1faba5c996e345dd1cc40210944f3b69389fb6, clear=False, global=False)
  seeder FromAppData(download=False, pip=latest, setuptools=latest, wheel=latest, pkg_resources=latest, via=copy, app_data_dir=/home/ubuntu/.local/share/virtualenv/seed-app-data/v1.0.1.debian.1)
  activators BashActivator,CShellActivator,FishActivator,PowerShellActivator,PythonActivator,XonshActivator
2023/04/26 05:08:13 INFO mlflow.utils.virtualenv: Installing dependencies
Traceback (most recent call last):
[...]

Here is the output on the setup that doesn't reproduce the error, where the example code completes correctly:

-> https://www.python.org/ftp/python/3.8.16/Python-3.8.16.tar.xz
Installing Python-3.8.16...
Installed Python-3.8.16 to /home/fanta/.pyenv/versions/3.8.16
2023/04/26 07:29:26 INFO mlflow.utils.virtualenv: Creating a new environment in /home/fanta/.mlflow/envs/mlflow-dc1faba5c996e345dd1cc40210944f3b69389fb6 with /home/fanta/.pyenv/versions/3.8.16/bin/python
created virtual environment CPython3.8.16.final.0-64 in 189ms
  creator CPython3Posix(dest=/home/fanta/.mlflow/envs/mlflow-dc1faba5c996e345dd1cc40210944f3b69389fb6, clear=False, no_vcs_ignore=False, global=False)
  seeder FromAppData(download=False, pip=bundle, setuptools=bundle, wheel=bundle, via=copy, app_data_dir=/home/fanta/.local/share/virtualenv)
    added seed packages: pip==22.3.1, setuptools==65.6.3, wheel==0.38.4
  activators BashActivator,CShellActivator,FishActivator,NushellActivator,PowerShellActivator,PythonActivator
2023/04/26 07:29:26 INFO mlflow.utils.virtualenv: Installing dependencies
[...]

I believe the relevant difference is in the seeder FromAppData( part, which in the working setup contains pip=bundle, setuptools=bundle, wheel=bundle, while in the faulty setup it contains pip=latest, setuptools=latest, wheel=latest.

MLflow seems to be using a bundled pip on one system, and a different (non-bundled) pip on the other.
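
The comparison of the two seeder lines can be automated. The sketch below parses a "seeder FromAppData(...)" log line into a dictionary and reports the keys whose values differ. The function names parse_seeder and seeder_diff are hypothetical, and the parser assumes the key=value pairs are separated by ", " exactly as in the logs above.

```python
import re


def parse_seeder(line):
    """Parse a 'seeder FromAppData(k=v, ...)' log line into a dict of strings."""
    match = re.search(r"seeder\s+\w+\((.*)\)", line)
    if match is None:
        raise ValueError("not a seeder line: %r" % line)
    return dict(item.split("=", 1) for item in match.group(1).split(", "))


def seeder_diff(a, b):
    """Return {key: (value_in_a, value_in_b)} for keys whose values differ."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


# Seeder lines copied from the two logs above.
FAULTY = (
    "  seeder FromAppData(download=False, pip=latest, setuptools=latest, "
    "wheel=latest, pkg_resources=latest, via=copy, "
    "app_data_dir=/home/ubuntu/.local/share/virtualenv/seed-app-data/v1.0.1.debian.1)"
)
WORKING = (
    "  seeder FromAppData(download=False, pip=bundle, setuptools=bundle, "
    "wheel=bundle, via=copy, app_data_dir=/home/fanta/.local/share/virtualenv)"
)

if __name__ == "__main__":
    for key, (faulty, working) in seeder_diff(
        parse_seeder(FAULTY), parse_seeder(WORKING)
    ).items():
        print(key, "faulty:", faulty, "working:", working)
```

Besides the pip, setuptools and wheel values, the diff also surfaces the extra pkg_resources entry and the debian-specific app_data_dir on the faulty setup.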

harupy commented 1 year ago

Can you try removing /home/ubuntu/.local/share/virtualenv?

Have you tried this?

fantauzzi commented 1 year ago

/home/ubuntu/.local/share/virtualenv

I was typing in this issue at the same time as you and noticed later you had updated it. So now I have removed that directory, removed /home/ubuntu/.mlflow/envs/mlflow*, run mlflow run . again, and got the usual error. Also, mlflow run . has recreated the directory I had just removed, with this content:

$ ll /home/ubuntu/.local/share/virtualenv
total 16
drwxrwxr-x 4 ubuntu ubuntu 4096 Apr 26 06:05 ./
drwxrwxr-x 7 ubuntu ubuntu 4096 Apr 26 06:05 ../
drwxrwxr-x 3 ubuntu ubuntu 4096 Apr 26 06:05 py_info/
drwxrwxr-x 3 ubuntu ubuntu 4096 Apr 26 06:05 seed-app-data/
harupy commented 1 year ago

@fantauzzi Can you reproduce the error by directly running virtualenv to create an environment and running pip install in it?

fantauzzi commented 1 year ago

Yes.

$ virtualenv --version
virtualenv 20.0.17 from /usr/lib/python3/dist-packages/virtualenv/__init__.py
$ virtualenv --python /home/ubuntu/.pyenv/versions/3.10.7/bin/python /home/ubuntu/my_virtualenv
created virtual environment CPython3.10.7.final.0-64 in 89ms
  creator CPython3Posix(dest=/home/ubuntu/my_virtualenv, clear=False, global=False)
  seeder FromAppData(download=False, pip=latest, setuptools=latest, wheel=latest, pkg_resources=latest, via=copy, app_data_dir=/home/ubuntu/.local/share/virtualenv/seed-app-data/v1.0.1.debian.1)
  activators BashActivator,CShellActivator,FishActivator,PowerShellActivator,PythonActivator,XonshActivator
$ source /home/ubuntu/my_virtualenv/bin/activate
$ pip --version
Traceback (most recent call last):
  File "/home/ubuntu/my_virtualenv/bin/pip", line 5, in <module>
    from pip._internal.cli.main import main
  File "/home/ubuntu/my_virtualenv/lib/python3.10/site-packages/pip/_internal/cli/main.py", line 10, in <module>
    from pip._internal.cli.autocompletion import autocomplete
  File "/home/ubuntu/my_virtualenv/lib/python3.10/site-packages/pip/_internal/cli/autocompletion.py", line 9, in <module>
    from pip._internal.cli.main_parser import create_main_parser
  File "/home/ubuntu/my_virtualenv/lib/python3.10/site-packages/pip/_internal/cli/main_parser.py", line 7, in <module>
    from pip._internal.cli import cmdoptions
  File "/home/ubuntu/my_virtualenv/lib/python3.10/site-packages/pip/_internal/cli/cmdoptions.py", line 24, in <module>
    from pip._internal.exceptions import CommandError
  File "/home/ubuntu/my_virtualenv/lib/python3.10/site-packages/pip/_internal/exceptions.py", line 10, in <module>
    from pip._vendor.six import iteritems
ModuleNotFoundError: No module named 'pip._vendor.six'
$ 
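
The manual check above (create an environment, then see whether its pip starts) can be wrapped in a small helper for repeated testing. A minimal sketch; the name pip_works is hypothetical, and the helper only reports whether python -m pip --version exits cleanly for the interpreter it is given.

```python
import subprocess
import sys


def pip_works(python_exe):
    """Return True if `python_exe -m pip --version` exits with status 0."""
    result = subprocess.run(
        [python_exe, "-m", "pip", "--version"],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0


if __name__ == "__main__":
    # Point this at <env>/bin/python to test a freshly created virtualenv,
    # e.g. /home/ubuntu/my_virtualenv/bin/python from the session above.
    print(pip_works(sys.executable))
```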
harupy commented 1 year ago

Can you run python -c 'import six'?

fantauzzi commented 1 year ago

Can you run python -c 'import six'?

From within the virtualenv I have just created, I cannot import six:

$ python -c 'import six'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ModuleNotFoundError: No module named 'six'
$ 
harupy commented 1 year ago

@fantauzzi Does /usr/share/python-wheels exist?

fantauzzi commented 1 year ago

/usr/share/python-wheels

Yes

$ ll /usr/share/python-wheels
total 2260
drwxr-xr-x   2 root root  12288 Apr 26 04:47 ./
drwxr-xr-x 177 root root   4096 Apr 26 04:50 ../
-rw-r--r--   1 root root  28023 Feb 28 09:41 CacheControl-0.12.6-py2.py3-none-any.whl
-rw-r--r--   1 root root  18776 Feb 28 09:41 appdirs-1.4.3-py2.py3-none-any.whl
-rw-r--r--   1 root root 164552 Feb 28 09:41 certifi-2019.11.28-py2.py3-none-any.whl
-rw-r--r--   1 root root 141487 Feb 28 09:41 chardet-3.0.4-py2.py3-none-any.whl
-rw-r--r--   1 root root  25094 Feb 28 09:41 colorama-0.4.3-py2.py3-none-any.whl
-rw-r--r--   1 root root  17188 Feb 28 09:41 contextlib2-0.6.0-py2.py3-none-any.whl
-rw-r--r--   1 root root 152027 Feb 28 09:41 distlib-0.3.0-py2.py3-none-any.whl
-rw-r--r--   1 root root  23898 Feb 28 09:41 distro-1.4.0-py2.py3-none-any.whl
-rw-r--r--   1 root root 120020 Feb 28 09:41 html5lib-1.0.1-py2.py3-none-any.whl
-rw-r--r--   1 root root  66836 Feb 28 09:41 idna-2.8-py2.py3-none-any.whl
-rw-r--r--   1 root root  24287 Feb 28 09:41 ipaddr-2.2.0-py2.py3-none-any.whl
-rw-r--r--   1 root root  21972 Feb 28 09:41 lockfile-0.12.2-py2.py3-none-any.whl
-rw-r--r--   1 root root  92927 Feb 28 09:41 msgpack-0.6.2-py2.py3-none-any.whl
-rw-r--r--   1 root root  42242 Feb 28 09:41 packaging-20.3-py2.py3-none-any.whl
-rw-r--r--   1 root root  26686 Feb 28 09:41 pep517-0.8.2-py2.py3-none-any.whl
-rw-r--r--   1 root root 262440 Feb 28 09:41 pip-20.0.2-py2.py3-none-any.whl
-rw-r--r--   1 root root 127312 Feb 28 09:41 pkg_resources-0.0.0-py2.py3-none-any.whl
-rw-r--r--   1 root root  17547 Feb 28 09:41 progress-1.5-py2.py3-none-any.whl
-rw-r--r--   1 root root  77093 Feb 28 09:41 pyparsing-2.4.6-py2.py3-none-any.whl
-rw-r--r--   1 root root  67470 Feb 28 09:41 requests-2.22.0-py2.py3-none-any.whl
-rw-r--r--   1 root root  16358 Feb 28 09:41 retrying-1.3.3-py2.py3-none-any.whl
-rw-r--r--   1 root root 477455 Feb 28 09:41 setuptools-44.0.0-py2.py3-none-any.whl
-rw-r--r--   1 root root  20256 Feb 28 09:41 six-1.14.0-py2.py3-none-any.whl
-rw-r--r--   1 root root  24106 Feb 28 09:41 toml-0.10.0-py2.py3-none-any.whl
-rw-r--r--   1 root root 127068 Feb 28 09:41 urllib3-1.25.8-py2.py3-none-any.whl
-rw-r--r--   1 root root  20484 Feb 28 09:41 webencodings-0.5.1-py2.py3-none-any.whl
-rw-r--r--   1 root root  35613 Feb 28 09:41 wheel-0.34.2-py2.py3-none-any.whl
harupy commented 1 year ago

Can you modify _vendor/__init__.py to see why it failed to import six? You can insert print in the vendored function.

def vendored(modulename):
    vendored_name = "{0}.{1}".format(__name__, modulename)

    try:
        __import__(modulename, globals(), locals(), level=0)
    except ImportError as e:
        # We can just silently allow import failures to pass here. If we
        # got to this point it means that ``import pip._vendor.whatever``
        # failed and so did ``import whatever``. Since we're importing this
        # upfront in an attempt to alias imports, not erroring here will
        # just mean we get a regular import error whenever pip *actually*
        # tries to import one of these modules to use it, which actually
        # gives us a better error message than we would have otherwise
        # gotten.
        print(e)
    else:
        sys.modules[vendored_name] = sys.modules[modulename]
        base, head = vendored_name.rsplit(".", 1)
        setattr(sys.modules[base], head, sys.modules[modulename])
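
The aliasing that vendored performs can be tried outside pip. The following is only a minimal sketch: fake_vendor and alias_module are hypothetical names standing in for pip._vendor and vendored, and json is used just because it is always importable.

```python
import importlib
import sys
import types


def alias_module(package_name, modulename):
    """Expose an importable top-level module as ``package_name.modulename``."""
    aliased_name = "{0}.{1}".format(package_name, modulename)
    try:
        module = importlib.import_module(modulename)
    except ImportError as exc:
        # Report the failure instead of hiding it, as suggested above.
        print("could not alias {0}: {1}".format(aliased_name, exc))
        return False
    # Register the real module under the aliased dotted name and attach it
    # to the parent package so attribute access works as well.
    sys.modules[aliased_name] = module
    base, head = aliased_name.rsplit(".", 1)
    setattr(sys.modules[base], head, module)
    return True


# Hypothetical stand-in for pip._vendor: an empty module registered by hand.
sys.modules.setdefault("fake_vendor", types.ModuleType("fake_vendor"))

alias_module("fake_vendor", "json")            # json is importable, so the alias is created
alias_module("fake_vendor", "no_such_module")  # the import fails and the reason is printed
```

If the top-level import fails, no alias is registered, so a later import of the aliased name raises ModuleNotFoundError, which matches the shape of the pip._vendor.six error in this issue.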
fantauzzi commented 1 year ago

Can you modify _vendor/__init__.py to see why it failed to import six? You can insert print in the vendored function.

I modified the file /home/ubuntu/my_virtualenv/lib/python3.10/site-packages/pip/_vendor/__init__.py; this is the output:

$ pip --version
No module named 'appdirs'
No module named 'cachecontrol'
No module named 'colorama'
No module named 'contextlib2'
No module named 'distlib'
No module named 'distro'
No module named 'html5lib'
No module named 'six'
No module named 'six'
No module named 'six'
No module named 'six'
No module named 'packaging'
No module named 'packaging'
No module named 'packaging'
No module named 'pep517'
No module named 'progress'
No module named 'retrying'
No module named 'requests'
No module named 'requests'
No module named 'requests'
No module named 'requests'
No module named 'requests'
No module named 'requests'
No module named 'requests'
No module named 'requests'
No module named 'requests'
No module named 'requests'
No module named 'requests'
No module named 'requests'
No module named 'requests'
No module named 'requests'
No module named 'requests'
No module named 'requests'
No module named 'requests'
No module named 'requests'
No module named 'requests'
No module named 'requests'
No module named 'requests'
No module named 'requests'
No module named 'requests'
No module named 'requests'
No module named 'requests'
No module named 'requests'
No module named 'requests'
No module named 'requests'
No module named 'requests'
No module named 'toml'
No module named 'toml'
No module named 'toml'
No module named 'urllib3'
Traceback (most recent call last):
  File "/home/ubuntu/my_virtualenv/bin/pip", line 5, in <module>
    from pip._internal.cli.main import main
  File "/home/ubuntu/my_virtualenv/lib/python3.10/site-packages/pip/_internal/cli/main.py", line 10, in <module>
    from pip._internal.cli.autocompletion import autocomplete
  File "/home/ubuntu/my_virtualenv/lib/python3.10/site-packages/pip/_internal/cli/autocompletion.py", line 9, in <module>
    from pip._internal.cli.main_parser import create_main_parser
  File "/home/ubuntu/my_virtualenv/lib/python3.10/site-packages/pip/_internal/cli/main_parser.py", line 7, in <module>
    from pip._internal.cli import cmdoptions
  File "/home/ubuntu/my_virtualenv/lib/python3.10/site-packages/pip/_internal/cli/cmdoptions.py", line 24, in <module>
    from pip._internal.exceptions import CommandError
  File "/home/ubuntu/my_virtualenv/lib/python3.10/site-packages/pip/_internal/exceptions.py", line 10, in <module>
    from pip._vendor.six import iteritems
ModuleNotFoundError: No module named 'pip._vendor.six'
$
harupy commented 1 year ago

Can you print the value of WHEEL_DIR?

fantauzzi commented 1 year ago

Can you print the value of WHEEL_DIR?

The environment variable WHEEL_DIR is not set, neither in the shell nor inside __init__.py (I printed it right after print(e)).

harupy commented 1 year ago

What does your _vendor/__init__.py look like?

fantauzzi commented 1 year ago

What does your _vendor/__init__.py look like?

This is pip/_vendor/__init__.py

"""
pip._vendor is for vendoring dependencies of pip to prevent needing pip to
depend on something external.

Files inside of pip._vendor should be considered immutable and should only be
updated to versions from upstream.
"""
from __future__ import absolute_import

import glob
import os.path
import sys

# Downstream redistributors which have debundled our dependencies should also
# patch this value to be true. This will trigger the additional patching
# to cause things like "six" to be available as pip.
DEBUNDLED = True

# By default, look in this directory for a bunch of .whl files which we will
# add to the beginning of sys.path before attempting to import anything. This
# is done to support downstream re-distributors like Debian and Fedora who
# wish to create their own Wheels for our dependencies to aid in debundling.
prefix = getattr(sys, "base_prefix", sys.prefix)
if prefix.startswith('/usr/lib/pypy'):
    prefix = '/usr'
WHEEL_DIR = os.path.abspath(os.path.join(prefix, 'share', 'python-wheels'))

# Define a small helper function to alias our vendored modules to the real ones
# if the vendored ones do not exist. This idea of this was taken from
# https://github.com/kennethreitz/requests/pull/2567.
def vendored(modulename):
    vendored_name = "{0}.{1}".format(__name__, modulename)

    try:
        __import__(modulename, globals(), locals(), level=0)
    except ImportError as e:
        # We can just silently allow import failures to pass here. If we
        # got to this point it means that ``import pip._vendor.whatever``
        # failed and so did ``import whatever``. Since we're importing this
        # upfront in an attempt to alias imports, not erroring here will
        # just mean we get a regular import error whenever pip *actually*
        # tries to import one of these modules to use it, which actually
        # gives us a better error message than we would have otherwise
        # gotten.
        print(f"WHEEL_DIR is {print(os.environ.get('WHEEL_DIR', None))}")
        print(e)
    else:
        sys.modules[vendored_name] = sys.modules[modulename]
        base, head = vendored_name.rsplit(".", 1)
        setattr(sys.modules[base], head, sys.modules[modulename])

# If we're operating in a debundled setup, then we want to go ahead and trigger
# the aliasing of our vendored libraries as well as looking for wheels to add
# to our sys.path. This will cause all of this code to be a no-op typically
# however downstream redistributors can enable it in a consistent way across
# all platforms.
if DEBUNDLED:
    # Actually look inside of WHEEL_DIR to find .whl files and add them to the
    # front of our sys.path.
    sys.path[:] = glob.glob(os.path.join(WHEEL_DIR, "*.whl")) + sys.path

    # Actually alias all of our vendored dependencies.
    vendored("appdirs")
    vendored("cachecontrol")
    vendored("colorama")
    vendored("contextlib2")
    vendored("distlib")
    vendored("distro")
    vendored("html5lib")
    vendored("six")
    vendored("six.moves")
    vendored("six.moves.urllib")
    vendored("six.moves.urllib.parse")
    vendored("packaging")
    vendored("packaging.version")
    vendored("packaging.specifiers")
    vendored("pep517")
    vendored("pkg_resources")
    vendored("progress")
    vendored("retrying")
    vendored("requests")
    vendored("requests.exceptions")
    vendored("requests.packages")
    vendored("requests.packages.urllib3")
    vendored("requests.packages.urllib3._collections")
    vendored("requests.packages.urllib3.connection")
    vendored("requests.packages.urllib3.connectionpool")
    vendored("requests.packages.urllib3.contrib")
    vendored("requests.packages.urllib3.contrib.ntlmpool")
    vendored("requests.packages.urllib3.contrib.pyopenssl")
    vendored("requests.packages.urllib3.exceptions")
    vendored("requests.packages.urllib3.fields")
    vendored("requests.packages.urllib3.filepost")
    vendored("requests.packages.urllib3.packages")
    try:
        vendored("requests.packages.urllib3.packages.ordered_dict")
        vendored("requests.packages.urllib3.packages.six")
    except ImportError:
        # Debian already unbundles these from requests.
        pass
    vendored("requests.packages.urllib3.packages.ssl_match_hostname")
    vendored("requests.packages.urllib3.packages.ssl_match_hostname."
             "_implementation")
    vendored("requests.packages.urllib3.poolmanager")
    vendored("requests.packages.urllib3.request")
    vendored("requests.packages.urllib3.response")
    vendored("requests.packages.urllib3.util")
    vendored("requests.packages.urllib3.util.connection")
    vendored("requests.packages.urllib3.util.request")
    vendored("requests.packages.urllib3.util.response")
    vendored("requests.packages.urllib3.util.retry")
    vendored("requests.packages.urllib3.util.ssl_")
    vendored("requests.packages.urllib3.util.timeout")
    vendored("requests.packages.urllib3.util.url")
    vendored("toml")
    vendored("toml.encoder")
    vendored("toml.decoder")
    vendored("urllib3")
harupy commented 1 year ago

Can you run print(WHEEL_DIR)?

harupy commented 1 year ago

In _vendor/__init__.py, you see WHEEL_DIR, right?

fantauzzi commented 1 year ago

In _vendor/__init__.py, you see WHEEL_DIR, right?

Got it. WHEEL_DIR is /home/ubuntu/.pyenv/versions/3.10.7/share/python-wheels
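
For context, WHEEL_DIR in this file is a module-level variable derived from sys.base_prefix, not an environment variable, which is why os.environ.get('WHEEL_DIR') returned nothing. Below is a rough sketch of the same derivation, assuming the logic in the __init__.py listing above; debundled_wheel_dir is a hypothetical helper, and posixpath is used only because the paths in this thread are POSIX paths.

```python
import posixpath


def debundled_wheel_dir(base_prefix):
    """Mirror how the debundled pip/_vendor/__init__.py derives WHEEL_DIR."""
    prefix = base_prefix
    if prefix.startswith("/usr/lib/pypy"):
        prefix = "/usr"
    return posixpath.abspath(posixpath.join(prefix, "share", "python-wheels"))


# A system interpreter whose base prefix is /usr
print(debundled_wheel_dir("/usr"))
# A virtualenv built on the pyenv interpreter used in this thread
print(debundled_wheel_dir("/home/ubuntu/.pyenv/versions/3.10.7"))
```

Inside a virtualenv, sys.base_prefix points at the base interpreter's installation, so a virtualenv created from a pyenv-built Python looks for wheels under the pyenv version directory rather than under /usr/share/python-wheels.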

harupy commented 1 year ago

What is the output of ls /home/ubuntu/.pyenv/versions/3.10.7/share/python-wheels?

fantauzzi commented 1 year ago

What is the output of ls /home/ubuntu/.pyenv/versions/3.10.7/share/python-wheels?

That directory doesn't exist. This is the content of the parent directory:

$ ls -laF /home/ubuntu/.pyenv/versions/3.10.7/share
total 12
drwxr-xr-x 3 ubuntu ubuntu 4096 Apr 26 05:01 ./
drwxr-xr-x 7 ubuntu ubuntu 4096 Apr 26 05:02 ../
drwxr-xr-x 3 ubuntu ubuntu 4096 Apr 26 05:01 man/
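
A missing python-wheels directory would explain why every vendored import fails: the DEBUNDLED branch in the listing above globs that directory for .whl files, and a glob over a nonexistent directory simply returns an empty list, so nothing is added to sys.path. A small sketch of that behaviour follows; wheels_to_prepend is a hypothetical helper, the temporary directory stands in for the real prefix, and sorted is added only to make the output deterministic.

```python
import glob
import os
import tempfile


def wheels_to_prepend(wheel_dir):
    """Return the .whl paths a debundled pip would put at the front of sys.path."""
    return sorted(glob.glob(os.path.join(wheel_dir, "*.whl")))


with tempfile.TemporaryDirectory() as tmp:
    wheel_dir = os.path.join(tmp, "share", "python-wheels")

    # The directory does not exist yet: glob returns an empty list, no error.
    print(wheels_to_prepend(wheel_dir))

    # Once the directory holds a wheel, it is picked up.
    os.makedirs(wheel_dir)
    open(os.path.join(wheel_dir, "six-1.14.0-py2.py3-none-any.whl"), "w").close()
    print([os.path.basename(p) for p in wheels_to_prepend(wheel_dir)])
```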
harupy commented 1 year ago

There is no six. I'm not sure why six is missing.

harupy commented 1 year ago

Maybe reinstalling pyenv might help.

fantauzzi commented 1 year ago

Maybe reinstalling pyenv might help.

I have done that several times, including today. Each time I started a fresh compute instance on Lambda Cloud and then reinstalled pyenv and mlflow.

harupy commented 1 year ago

This looks like a pyenv issue. Can you file an issue in https://github.com/pyenv/pyenv?

fantauzzi commented 1 year ago

It seems to me the problem may be related to virtualenv alone. On the setup that reproduces the issue, even with no pyenv installed, I already see this: although the pip installation I am running contains _vendor/six.py, the environment I created with virtualenv does not.

ubuntu@192-9-133-158:~$ which python
/usr/bin/python
ubuntu@192-9-133-158:~$ which pip
/home/ubuntu/.local/bin/pip
ubuntu@192-9-133-158:~$ which virtualenv
/usr/bin/virtualenv
ubuntu@192-9-133-158:~$ python --version
Python 3.8.10
ubuntu@192-9-133-158:~$ pip --version
pip 22.3 from /home/ubuntu/.local/lib/python3.8/site-packages/pip (python 3.8)
ubuntu@192-9-133-158:~$ ll /home/ubuntu/.local/lib/python3.8/site-packages/pip/_vendor/six.py 
-rw-rw-r-- 1 ubuntu ubuntu 34549 Oct 26 22:37 /home/ubuntu/.local/lib/python3.8/site-packages/pip/_vendor/six.py
ubuntu@192-9-133-158:~$ 
ubuntu@192-9-133-158:~$ virtualenv --python /usr/bin/python ~/enva
created virtual environment CPython3.8.10.final.0-64 in 166ms
  creator CPython3Posix(dest=/home/ubuntu/enva, clear=False, global=False)
  seeder FromAppData(download=False, pip=latest, setuptools=latest, wheel=latest, pkg_resources=latest, via=copy, app_data_dir=/home/ubuntu/.local/share/virtualenv/seed-app-data/v1.0.1.debian.1)
  activators BashActivator,CShellActivator,FishActivator,PowerShellActivator,PythonActivator,XonshActivator
ubuntu@192-9-133-158:~$ . ~/enva/bin/activate
(enva) ubuntu@192-9-133-158:~$ which pip
/home/ubuntu/enva/bin/pip
(enva) ubuntu@192-9-133-158:~$ which python
/home/ubuntu/enva/bin/python
(enva) ubuntu@192-9-133-158:~$ pip --version
pip 20.0.2 from /home/ubuntu/enva/lib/python3.8/site-packages/pip (python 3.8)
(enva) ubuntu@192-9-133-158:~$ ll /home/ubuntu/enva/lib/python3.8/site-packages/pip/_vendor
total 20
drwxrwxr-x 3 ubuntu ubuntu 4096 Apr 26 11:04 ./
drwxrwxr-x 5 ubuntu ubuntu 4096 Apr 26 11:04 ../
-rw-rw-r-- 1 ubuntu ubuntu 4975 Apr 26 11:02 __init__.py
drwxrwxr-x 2 ubuntu ubuntu 4096 Apr 26 11:04 __pycache__/

The line seeder FromAppData(download=False, pip=latest, setuptools=latest, wheel=latest, pkg_resources=latest, via=copy, app_data_dir=/home/ubuntu/.local/share/virtualenv/seed-app-data/v1.0.1.debian.1) references an app_data_dir that did not exist beforehand; virtualenv created it itself.

On the PC where mlflow works correctly I get this instead

⮕ virtualenv --python /usr/bin/python3 ~/enva
created virtual environment CPython3.10.6.final.0-64 in 69ms
  creator CPython3Posix(dest=/home/fanta/enva, clear=False, no_vcs_ignore=False, global=False)
  seeder FromAppData(download=False, pip=bundle, setuptools=bundle, wheel=bundle, via=copy, app_data_dir=/home/fanta/.local/share/virtualenv)
    added seed packages: pip==22.0.2, setuptools==59.6.0, wheel==0.37.1
  activators BashActivator,CShellActivator,FishActivator,NushellActivator,PowerShellActivator,PythonActivator
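
To compare the two machines quickly, one can check whether an environment's pip/_vendor directory holds anything besides __init__.py, as in the listing from the failing setup above. This is only a sketch under that assumption: vendor_dir_is_empty is a hypothetical helper, and the temporary directory stands in for a real site-packages path such as the one under ~/enva.

```python
import os
import tempfile


def vendor_dir_is_empty(vendor_dir):
    """True if pip/_vendor holds nothing besides __init__.py and __pycache__."""
    ignored = {"__init__.py", "__pycache__"}
    return not (set(os.listdir(vendor_dir)) - ignored)


with tempfile.TemporaryDirectory() as tmp:
    vendor = os.path.join(tmp, "pip", "_vendor")
    os.makedirs(os.path.join(vendor, "__pycache__"))
    open(os.path.join(vendor, "__init__.py"), "w").close()
    print(vendor_dir_is_empty(vendor))  # layout seen in the failing virtualenv

    open(os.path.join(vendor, "six.py"), "w").close()
    print(vendor_dir_is_empty(vendor))  # a bundled pip ships its dependencies here
```

An empty _vendor directory suggests the seeded pip is a debundled build that relies on the python-wheels directory being present.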
harupy commented 1 year ago

Got it, then can you report that in https://github.com/pypa/virtualenv?

harupy commented 1 year ago

Maybe something is wrong with Ubuntu instances that Lambda Cloud provides. Can you reproduce this issue using Docker?