Editable installs are broken after `__editable__` and `__path_hook__` changes

abhinavsingh commented 2 years ago

Description

I am starting to observe broken editable installs since past week.

Specifically, editable install now creates a __editable__.<pkg>.pth and __editable___<pkg>_finder.py files. These paths are returned as part of namespace.__path__, which has broken a lot of tooling.

pdoc3 went broke. Skipping __editable__ package seems to fix (hack) it.
- See https://github.com/pdoc3/pdoc/pull/408
We also noticed, namespace packages that installs a CLI are now also broken, because while the binary is under .venv/bin, it is unable to import the modules since it was installed using -e.
- Modifying PYTHONPATH is the only option.
We also noticed, due to these changes even VSCode/Pylance is now broken, as it is unable to resolve paths to editable installed packages.
- Adding "python.analysis.extraPaths" to .vscode/settings.json is the way out

I was unable to find any search result on Google for __editable__ and __path_hook__. So, I am unsure what I am up against here.

Expected behavior

Installed CLI from namespace package must work fine without modification of PYTHONPATH
namespace.__path__ must return actual path instead of __editable__.* which seems to break a lot of existing tooling

pip version

22.2.2

Python version

3.10.5

OS

MacOS 12.5

How to Reproduce

Create a namespace package that installs CLI (entry point)
Install the package with -e
Installed CLI is broken due to missing PYTHONPATH

Our projects are using pyproject.toml and setup.cfg.

Output

`ImportError: cannot import name 'entry_point' from 'namespace.cli' (unknown location)`

Code of Conduct

[X] I agree to follow the PSF Code of Conduct.

pfmoore commented 2 years ago

This is likely because setuptools released a new mechanism for implementing editable installs, as part of their PEP600 support. The pdoc3 project is probably not yet aware of the new mechanism, and will need updating.

There isn’t really anything for pip to do here, so I’m closing this issue. If it turns out that there is a pip problem here, feel free to reopen it.

abhinavsingh commented 2 years ago

@pfmoore This is likely going to create a lot of havoc. Editable install is how people operate in Python community. And if all of a sudden, entire tooling around editable install is going to break, then likely we must be planning better around it.

My 2 cents. May be, instead of enforcing the new mechanism as default, this could have been an opt in to start with. Instead of fighting this out, we decided to simply pin setuptools "setuptools <= 62.6.0". While this has put us out of misery, we are now stuck in the past. FWIW, new mechanism has decided to return a Path which returns in __path_hook__ as part of namespace. While that file actually doesn't exist on the disk.

pradyunsg commented 2 years ago

Please reach out to the setuptools maintainers. While various Python packaging tools do interoperate, the behaviour changes you’re seeing come from setuptools.

The maintainers of pip don’t control / maintain setuptools, which is where I recommend you reach out.

pradyunsg commented 2 years ago

I can actually transfer the issue, so… here you go. :)

potiuk commented 2 years ago

It looks like we started to have the same problem in Airflow.

pip install -e . after checking out basically any version (including main) and running any Airlfow cli command results in

Traceback (most recent call last):
  File "/home/jarek/.pyenv/versions/airflow-test-cli3/bin/airflow", line 8, in <module>
    sys.exit(main())
  File "/home/jarek/code/airflow/airflow/__main__.py", line 38, in main
    args.func(args)
  File "/home/jarek/code/airflow/airflow/cli/cli_parser.py", line 50, in command
    func = import_string(import_path)
  File "/home/jarek/code/airflow/airflow/utils/module_loading.py", line 32, in import_string
    module = import_module(module_path)
  File "/home/jarek/.pyenv/versions/3.9.13/lib/python3.9/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 850, in exec_module
  File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
  File "/home/jarek/code/airflow/airflow/cli/commands/standalone_command.py", line 35, in <module>
    from airflow.www.app import cached_app
  File "/home/jarek/code/airflow/airflow/www/app.py", line 43, in <module>
    from airflow.www.extensions.init_views import (
  File "/home/jarek/code/airflow/airflow/www/extensions/init_views.py", line 28, in <module>
    from airflow.www.views import lazy_add_provider_discovered_options_to_connection_form
  File "/home/jarek/code/airflow/airflow/www/views.py", line 685, in <module>
    class AirflowBaseView(BaseView):
  File "/home/jarek/code/airflow/airflow/www/views.py", line 699, in AirflowBaseView
    extra_args['sqlite_warning'] = settings.engine.dialect.name == 'sqlite'
AttributeError: module 'airflow.settings' has no attribute 'engine'

This started to happen few days ago and adding Airlfow's source to PYTHONPATH solves the problem.

I did some bisecting and pin-pointed it to version 64.0.0 of setuptools. Version 63.4.3 worked perfectly fine, 64.0.0 breaks it. The reason seems to be as described in the issue. I've added those lines to our pyproject.toml and they solved the problem:

[build-system]
requires = ['setuptools==63.4.3']
build-backend = "setuptools.build_meta"

See: https://github.com/apache/airflow/pull/25848

It is super-easy to reproduce now:

1) checkout main from https://github.com/apache/airflow 2) create and activate a new virtualenv 3) run pip install -e . 4) run airflow standalone

This works perfectly well (because of the 63.4.3 limit)

5) change the line in pyproject.toml to:

requires = ['setuptools==64.0.0']

6) create and activate a new virtualenv 7) run pip install -e . 8) run airflow standalone

You see the stack-trace above

potiuk commented 2 years ago

(The error is also present in all other 64. and 65 versions BTW).

abravalheri commented 2 years ago

Hi @abhinavsingh,

Could you provide a minimal reproducer explaining more in details the issues you are having? (Maybe it is worthy to split multiple issues in multiple reproducers).

I tried to investigate what you are describing with the following reproducer, but I was not able to find the problematic behaviour you are referring to... (Maybe I understood something wrong? Or maybe the reproducer is missing something? Or -- best case scenario -- the problem might have already gotten solved in the latest release?).

# docker run --rm -it python:3.10.6 bash

cd /tmp && rm -rf /tmp/proj_root
mkdir -p /tmp/proj_root
mkdir -p /tmp/proj_root/namespace/cli
echo "def run(): print('hello world')" > /tmp/proj_root/namespace/cli/__init__.py
cat <<EOF > /tmp/proj_root/pyproject.toml
[build-system]
requires = ["setuptools>=65.1.0"]
build-backend = "setuptools.build_meta"

[project]
name = "namespace.cli"
version = "42"

[project.scripts]
namespace-cli = "namespace.cli:run"
EOF
cd /tmp/proj_root
python3.10 -m venv .venv
.venv/bin/python -m pip install -U pip
.venv/bin/python -m pip install -e .
cd /var
/tmp/proj_root/.venv/bin/python -c 'import namespace; print(f"{namespace.__path__=}")'
# ==> namespace.__path__=_NamespacePath(['/tmp/proj_root/namespace'])
/tmp/proj_root/.venv/bin/namespace-cli
# ==> hello world

As you can see above, the reproducer show that whenever possible setuptools will add the corresponding path entry to __path__. Cases when this does not occur are likely associated with configurations by which users explicitly opt out from certain packages or there is simply no directory in the file system for the package to be associated with...

There are a few other remarks I would like to make:

In Python, path entries do not need to be existing locations in the file system, this is mentioned in Python docs:

Path entries need not be limited to file system locations. They can refer to URLs, database queries, or any other location that can be specified as a string. ... Most path entries name locations in the file system, but they need not be limited to this. ... Entries in sys.path can name directories on the file system, zip files, and potentially other “locations” (see the site module) that should be searched for modules, such as URLs, or database queries.

We can also demonstrate this with the following snippet:
```
# docker run --rm -it python:3.10.6 bash
python3.10 -c 'import sys; print(sys.path)'
# ==> ['', '/usr/local/lib/python310.zip', '/usr/local/lib/python3.10', '/usr/local/lib/python3.10/lib-dynload', '/usr/local/lib/python3.10/site-packages']
ls -la /usr/local/lib/python310.zip
# ==> ls: cannot access '/usr/local/lib/python310.zip': No such file or directory
```
Most static analysis tools (including the ones used in code editors) have limitations in terms of aspects of the Python language they cannot handle. Import hooks (which have been part of the Python language specification for years, and are listed in PEP 660 as a perfectly viable implementation for editable installs), are notably one of these limitations. This is reported in https://github.com/pypa/setuptools/issues/3518. Setuptools understand this limitation, and I am personally happy to contribute to an effort of improving this. However we do need someone to champion a design/architecture/standard that is well received in both packaging and static analysis community. I recommend anyone interested in pushing this forward to engage in the mailing list discussion, or on the thread on discourse.
I would also recommend setuptools' docs on editable installs. There might be some useful information there.

abravalheri commented 2 years ago

@potiuk I suspect airflow's use case is different.

By having a look on the setup.py, it seems that airflow implements a custom develop command.

Setuptools's implementation of PEP 660 does not use the develop command[^1]. Instead, setuptools now rely in a complete different command. Please have a look on the docs for a few possible options on how to customise the editable installation.

By running:

# docker run --rm -it python:3.10.6 bash

apt update && apt install -y git build-essential
cd /tmp
git clone https://github.com/apache/airflow
cd /tmp/airflow
sed -i 's/==63.4.3/>=65.1.0/g' pyproject.toml
python -m venv .venv
.venv/bin/python -m pip install -U pip
.venv/bin/python -m pip install -e .
cd /var
/tmp/airflow/.venv/bin/python -c 'import airflow.settings; print(dir(airflow.settings))'
# ==> ['AIRFLOW_HOME', 'AIRFLOW_MOVED_TABLE_PREFIX', 'ALLOW_FUTURE_EXEC_DATES', 'CAN_FORK', 'CHECK_SLAS', 'COMPRESS_SERIALIZED_DAGS', 'Callable', 'DAEMON_UMASK', 'DAGS_FOLDER', 'DASHBOARD_UIALERTS', 'DEFAULT_ENGINE_ARGS', 'DONOT_MODIFY_HANDLERS', 'EXECUTE_TASKS_NEW_PYTHON_INTERPRETER', 'Engine', 'GUNICORN_WORKER_READY_PREFIX', 'HEADER', 'HIDE_SENSITIVE_VAR_CONN_FIELDS', 'IS_K8S_OR_K8SCELERY_EXECUTOR', 'KILOBYTE', 'LAZY_LOAD_PLUGINS', 'LAZY_LOAD_PROVIDERS', 'LOGGING_CLASS_PATH', 'LOGGING_LEVEL', 'LOG_FORMAT', 'List', 'MASK_SECRETS_IN_LOGS', 'MEGABYTE', 'MIN_SERIALIZED_DAG_FETCH_INTERVAL', 'MIN_SERIALIZED_DAG_UPDATE_INTERVAL', 'NullPool', 'Optional', 'PLUGINS_FOLDER', 'SASession', 'SIMPLE_LOG_FORMAT', 'SQL_ALCHEMY_CONN', 'STATE_COLORS', 'TIMEZONE', 'TYPE_CHECKING', 'USE_JOB_SCHEDULE', 'Union', 'WEBSERVER_CONFIG', 'WEB_COLORS', '__annotations__', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', '_get_rich_console', 'atexit', 'conf', 'configure_action_logging', 'configure_adapters', 'configure_logging', 'configure_orm', 'configure_vars', 'create_engine', 'custom_show_warning', 'dag_policy', 'dispose_orm', 'exc', 'executor_constants', 'functools', 'get_airflow_context_vars', 'get_dagbag_import_timeout', 'get_session_lifetime_config', 'import_local_settings', 'initialize', 'json', 'log', 'logging', 'original_show_warning', 'os', 'pendulum', 'pod_mutation_hook', 'prepare_engine_args', 'prepare_syspath', 'reconfigure_orm', 'replace_showwarning', 'scoped_session', 'sessionmaker', 'setup_event_handlers', 'sqlalchemy', 'sys', 'task_instance_mutation_hook', 'task_policy', 'tz', 'validate_session', 'warnings']

I can see that the package is indeed installed in the editable mode, however the engine attribute does not seem to be initialised. Maybe this is related to the customisations you guys are doing in the develop command, but that are no longer invoked when you run pip install -e .?

[^1]: The PEP 660 standard is incompatible with the develop command. In the later, setuptools writes files directly to the installation directories, but the PEP 660 standard requires setuptools to handle a .whl file to pip, so pip can take care of creating files.

potiuk commented 2 years ago

By having a look on the setup.py, it seems that airflow implements a custom develop command.

Thanks for the pointer @abravalheri . I will take a closer look and see if I can work it out. I was kinda suspecting that we could have been doing something that PEP 660 does not really like :). I will keep you posted.

potiuk commented 2 years ago

Time to do some more deep PEP reading

ofek commented 2 years ago

@potiuk Hey! Airflow's setup.py looks quite complex which intrigues me. Would you be interested in me opening a PR for a PoC switch to Hatchling (which will fix your issue)?

abravalheri commented 2 years ago

(Copying some opinion from the PyPA discord for eventual readers of this thread).

setup.py implementations tend to grow complex with time. Any backend that supports extensions can be used to refactor (which includes setuptools). Documentation on custom build steps can be found at https://setuptools.pypa.io/en/latest/userguide/extension.html.

potiuk commented 2 years ago

@potiuk Hey! Airflow's setup.py looks quite complex which intrigues me. Would you be interested in me opening a PR for a PoC switch to Hatchling (which will fix your issue)?

@ofek Airlfow is an ASF project and community makes decisions not me personally. If you want to make a POC and have some good arguments, you will have to start dicsussion at the airflow devlist https://airflow.apache.org/community/ . But you have to - in general - have a good reasoning and justification for such a change in build system. just fixing an error that we can fix without switching to another build mechanism is not good enough reason.

lonetwin commented 2 years ago

Just a note here to say pkg_utlts.iter_modules() from the stdlib is also affected by this. It no longer returns the editable pacakges like it used to. Is this something that would have to be fixed in the stdlib ?

Edit: I'm sorry, ignore this. It was an editable namespace pacakge and those weren't returned by earlier as well, so not related to the issue reported here.

abravalheri commented 2 years ago

Hi @lonetwin, pkgutil.iter_modules does not cover the same spectrum as the importlib machinery: it is much more limited.

Based on this comment I would say that (so far) the stdlib developers have no intention the increase the scope or add features to pkgutil. The phrase used was: pkgutil is on its way out as soon as people have the time to deprecate it.

If you absolutely need to use pkgutil you can try to opt into a different editable installation mode (e.g. pip install -e . --config-settings editable_mode=strict) -- I haven't tested it myself, so I am not sure it will work.

pradyunsg commented 2 years ago

In the spirit of over-communicating: I'm unsubscribing from this since I don't think there's anything for me to do here. To the maintainers, please feel free to @-mention me in case there's something here that's on pip's end.

abhinavsingh commented 2 years ago

@abravalheri Apologies, haven't checked back on GitHub in weeks. Missed your message altogether. A point to note in our case is that, we have all our packages under a namespace. As others have figured, v64 is an issue. At our end, I had to pin setup tools to v62 to let our company workflows pass. In our setup, we simply go by paths returned by namespace.__path__. But ever since v64, we realise, __path__ contains entries which are actually not real file paths. They needs to be resolved using latest mechanism adopted by setup tools. The day I ran into it, I wasn't even able to find any formal documentation around it.

I am not on my work system today. But, I'll give you a reproducible scenario as soon as I get to my system. We surely don't want to pin ourself to v62 forever.

kptkin commented 1 year ago

Hey @abravalheri ,

Our repo has the same issue with editable mode (for us as well the issue appears in version setuptools==64). We have an "interesting" case, that one of the artifacts our app creates is a folder with the same name as our package (i know :/).

To reproduce you can run this:

# docker run --rm -it python:3.10.6 bash

cd /tmp
git clone https://github.com/wandb/wandb.git
cd /tmp/wandb
git checkout a21d2622dab04ace91d76ba05b1631fff9c891e
python -m venv .venv
.venv/bin/python -m pip install -U pip
.venv/bin/python -m pip install -e .

cd /var
#first time running the script
/tmp/wandb/.venv/bin/python -c "import wandb; run = wandb.init(mode='offline'); run.finish()" 
#second time running the script
/tmp/wandb/.venv/bin/python -c "import wandb; run = wandb.init(mode='offline'); run.finish()"

note that after the first execution of our script we create a folder with the same name as our project (so the first execution works). However, the second execution of the script fails, since the importer "finds" the package in the current working directory i.e. the newly created folder.

Thanks for your help!

remram44 commented 1 year ago

I'm running into a related issue, where after installing -e ./foo (containing ./foo/setup.py, ./foo/foo.py), somehow import foo gives me a namespace package for ./foo rather than the module ./foo/foo.py. This is a change in behavior that breaks various CI and seems extremely weird; why isn't an installed package preferred over an implicit namespace package?

I can pin setuptools 65 but this seems like a bug on your side.

ofek commented 1 year ago

Have you tried Hatchling?

abhinavsingh commented 1 year ago

@abravalheri I apologise for missing out on it. I checked our codebase today and we are still using "setuptools <= 62.6.0", which is not good in the long run (setuptools is now on v67). Likely I simply missed the notifications but today's activity has brought me back here.

I really want to get past this pinned setuptools issue in our codebase. Will investigate back into it over the coming weekend and come up with a minimal example to reproduce the issue at your end.

pankajkoti commented 1 year ago

hi team, just checking my luck, by any chance, do we have an update on this issue? I am still facing issues with editable installs in a local virtualenv for Apache Airflow.

abravalheri commented 1 year ago

Hi @pankajkoti to keep investigating this issue we need a reproducer of the original problem.

I reported some thoughts in https://github.com/pypa/setuptools/issues/3548#issuecomment-1221366695 (please note the remarks about entries in sys.path).

abravalheri commented 1 year ago

note that after the first execution of our script we create a folder with the same name as our project (so the first execution works). However, the second execution of the script fails, since the importer "finds" the package in the current working directory i.e. the newly created folder.

Hi @kptkin, in our docs we list the limitation of coinciding names of file entries in the working directory. If you need to support that use case you can try other installation modes, e.g. pip install -e . --config-settings editable_mode=strict.

kptkin commented 1 year ago

note that after the first execution of our script we create a folder with the same name as our project (so the first execution works). However, the second execution of the script fails, since the importer "finds" the package in the current working directory i.e. the newly created folder.

Hi @kptkin, in our docs we list the limitation of coinciding names of file entries in the working directory. If you need to support that use case you can try other installation modes, e.g. pip install -e . --config-settings editable_mode=strict.

Ah thanks for the reply! Yeah, I found this exact flag in the docs, good to have a verification that this is the recommended approach and it won't be something that is/should be fixed.

marcospgp commented 8 months ago

Just ran into this issue after updating to python 3.12, should I just revert to 3.11 and wait or is there a different approach I should take?

HexDecimal commented 8 months ago

If you're affected by this issue to the point where even editable_mode=strict won't work then the Legacy Behavior can still be enabled as a temporary workaround:

pip install -e . --config-settings editable_mode=compat

This is not a long term solution. If any of your tools requires editable_mode= to work correctly then it should probably be reported as an issue to that tools repo.

marcospgp commented 8 months ago

@HexDecimal What if it's our tool having the issue? How should it be updated to conform to editable installs in python 3.12?

HexDecimal commented 8 months ago

@HexDecimal What if it's our tool having the issue? How should it be updated to conform to editable installs in python 3.12?

This issue is not caused by your Python version. It is caused by installing setuptools packages in editable mode using the latest version of setuptools.

I've only experienced this issue from the perspective of a tool user (VSCode/Pylance) so I'm not sure what to do on the development side.

rachtsingh commented 3 months ago

For what it's worth, installing packages using pip install --editable . was adding significant (in this case 0.225s) lag to Python startup time. Happy to provide more information / a reproducer if that's useful (time spent was split across importlib.abc, subprocess, ast, and json in that order). I was a little disappointed that starting Python now required launching a new subprocess, but maybe I misunderstood (or misunderstand why that's necessary).

abravalheri commented 3 months ago

@rachtsingh performance degradation using editable mode can be caused by a myriad of elements many of them not under the control of setuptools (e.g. depending on how the particular package you are installing is implemented, which is the layout of that package, how many items you have in the .pth file, etc...).

Setuptools follows the implementation options described in https://peps.python.org/pep-0660/#what-to-put-in-the-wheel, and in general the performance hits associated with these implementation options are considered acceptable (the purpose of the editable mode is to help in development time and not meant to be used in production).

If you manage to investigate more and isolate a specific behaviour in setuptools that you think can be optimised, PRs would be very welcome.

dwt commented 2 months ago

I'm seeing an issue that might be related to this, for reproduction I've created this Dockerfile

Relevant output:

#21 0.513 _NamespacePath(['/home/zope/venv/lib/python3.12/site-packages/Products', 'Products', '__editable__.ZMS-5.2.0.finder.__path_hook__', '/home/zope/venv/src/products-zms-skins/Products', '/home/zope/venv/src/zms/Products'])
## [...] output from iter_modules which is missing `Products.zms`
#21 0.513 _NamespacePath(['/home/zope/venv/lib/python3.12/site-packages/Products', 'Products', '__editable__.ZMS-5.2.0.finder.__path_hook__'])
#21 0.513 __REGISTRY__['confdict'] None
#21 0.513 ['/home/zope/venv/src/zms/Products/zms']

To me this shows that after installation the namespace Package Products is initialized i think correctly, i.e. it contains the ZMS package, but after the first time pkgutil.iter_modules() is called that is gone.

So, the package Products.zms is not part of the output, even though it is correctly installed and importable. Why is that? Am I missing something about how installation with setuptools is supposed to work?

Further if I install Zope (as of yet a package that still does not have a pyproject.toml) in editable mode beforehand everything works as expected. I think because this forces a legacy mode in pip/setuptools?

Still this smells fishy enough that I would like some advice if this is a problem in setuptools, or python - or just in my code (still the most likely scenario I think).

abravalheri commented 2 months ago

@dwt pkgutil.iter_modules() docs say:

Note: Only works for a finder which defines an iter_modules() method. This interface is non-standard, so the module also provides implementations for importlib.machinery.FileFinder and zipimport.zipimporter.

Maybe it is related to that?

dwt commented 1 month ago

@abravalheri How do you read that notice in the docs? At first I read this as: "the module pkgutil.iter_modules() also provides implementation for importlib.machinery.FileFinder and zipimport.zipimporter". This could make some sense if it actually meant that only import lib provides the machinery to iterate over modules?

That leaves behind the question: How am I supposed to iterate over submodules of a namespace package if they are installed with setuptools in the new mode? pkgutil.iter_modules() seems to be what the python packaging documentation seems to recommend, so I gathered it should work?

abravalheri commented 1 month ago

My interpretation of that passage in the Python docs is: pkgutil.iter_modules() relies on the implementation of a non-standard iter_modules() method. Then it special cases the implementation to handle importlib.machinery.FileFinder and zipimport.zipimporter even when these don't implement iter_modules().

How am I supposed to iterate over submodules of a namespace package if they are installed with setuptools in the new mode?

This might be a good question for CPython?

If you would like to propose a PR in setuptools to implement this non standard iter_modules(), that would be an option to consider (although I would like to avoid increasing the complexity of the editable finder -- which is already very high).

Another possibility is to always use src-layouts for the projects that need to be discovered that way, OR install them using --config-settings editable_mode=strict/compat.

The good part ("glass half-full" point of view) is that editable installs are not really meant to be used in production, only for development. So in production you should not reach these code paths as packages are installed regularly.

dwt commented 1 month ago

I assumed that setuptools has a some test cases that show that editable installs work. Can someone point to me to where they are? I think a pull request that adds an editable install for a namespace package and then follows the recommended way to iterate it seems like a sensible first step?

dwt commented 1 month ago

My interpretation of that passage in the Python docs is: pkgutil.iter_modules() relies on the implementation of a non-standard iter_modules() method. Then it special cases the implementation to handle importlib.machinery.FileFinder and zipimport.zipimporter even when these don't implement iter_modules().

How am I supposed to iterate over submodules of a namespace package if they are installed with setuptools in the new mode?

This might be a good question for CPython?

Uhm... shouldn't the question be to setuptools? I mean, the documented way is right there, linked above?

If not, where would I ask this? Also, it may be way stronger a question if you or one of the setuptools contributors asks this, as I may just be ignored.

If you would like to propose a PR in setuptools to implement this non standard iter_modules(), that would be an option to consider (although I would like to avoid increasing the complexity of the editable finder -- which is already very high).

I would like to help review this, but I feel unable to judge wether that is the best way forward?

Another possibility is to always use src-layouts for the projects that need to be discovered that way, OR install them using --config-settings editable_mode=strict/compat.

I thought that my project does have the source layout. How would I check this?

The good part ("glass half-full" point of view) is that editable installs are not really meant to be used in production, only for development. So in production you should not reach these code paths as packages are installed regularly.

While that is true, they are also essential to (my) development workflow, either local or in docker containers, to install the module under development into a virtualenv and work on it there.

abravalheri commented 1 month ago

Uhm... shouldn't the question be to setuptools?

pkgutil is not maintained in the setuptools repo, so if you have a question on how that works, probably you are better speaking with the maintainers of that module. The "importlib machinery" in Python import is equally not maintained by setuptoools.

Setuptools itself also does not expose any APIs (or at least not that have not been deprecated) for iterating over namespaces.

So maybe you are looking for a feature that does not exist. If pkgutil.iter_modules implementation relies entirely on non-standard methods and special-casing, it is probably best to not take it as fully reliable/bulletproof. It likely works "to some extent" but does not fully interoperate with other importlib machinery standards, being provided "as is".

The best discovery mechanism known to work with third-party packages is entry-points, although even that has its limitations with editable installs (i.e. newly added entry-point need a full re-install).

I thought that my project does have the source layout. How would I check this?

This is setuptools' description of what a src-layout entails: https://setuptools.pypa.io/en/latest/userguide/package_discovery.html#src-layout.

Note that if you "remap" or "customise" package directories via the package-dir configuration in setuptools, the repository no longer will be considered under a src-layout.

When installed in editable mode, a src-layout project will add a simple and static .pth file into the site-packages directory, simply augmenting sys.path and leaving Python's "native import machinery" to do the rest. This provides the best interoperability. In theory you could also use strict editable installs, but I don't think that will work well with docker containers because it uses links.

While that is true, they are also essential to (my) development workflow, either local or in docker containers, to install the module under development into a virtualenv and work on it there.

That is a fair comment, but it is also fair to recognise that editable installations are not a silver bullet and have many problems and limitations. This is made explicit in the setuptools docs.

I would like to help review this, but I feel unable to judge wether that is the best way forward?

In my option, the best way forward if you would like support for pkgutil.iter_modules to be implemented would be to create a PR (ideally not a complex change). A good place to start would be having a look on the docs about Python's importlib machinery and PEP 660. After that it is nice to have a look on https://setuptools.pypa.io/en/latest/userguide/development_mode.html and https://setuptools.pypa.io/en/latest/userguide/extension.html. Finally you can read through the implementation in https://github.com/pypa/setuptools/blob/main/setuptools/command/editable_wheel.py where the changes would need to be made.

Please note however that this territory is "muddy" and that there are not many answers out there for the problems of editable installs and namespace packages:

Note that it was also very difficult to reach a general agreement in terms of standards for editable installs (to be honest, I don't think the community never reached one), and one of the main issues is that there is no solution that solves all problems:

In general, the understanding is that editable installs are inherently "flawed", because it is currently impossible to have something to behave at the same time as if it was installed and in development at the same time, without constantly reinstalling it. Since the mileage may vary according to project layouts, usage of namespaces, etc..., users may have to choose the best installation procedure and/or project layout and compromise in a case by case basis.

dwt commented 1 month ago

That... sounds like a lot of work. I was hoping that we could agree that this is a backwards incompatibility issue. I.e. in the past editable installs did work even for namespace packages with setuptools, and thus this should continue to work.

If I understand you correctly, adding pull request here for itermodules() could solve that particular problem?

Just so I'm not seeing things: pkgutil is part of the python standard library, right? So that should make it the authority on how to iterate over submodules?

In this case, I cannot choose a different way to install, as the Zope-Maintainers (in their infinite wisdom) have decided that they require namespace packages for extensions. :-/

abravalheri commented 1 month ago

That... sounds like a lot of work. I was hoping that we could agree that this is a backwards incompatibility issue. I.e. in the past editable installs did work even for namespace packages with setuptools, and thus this should continue to work.

I don't think we can apply the concept of backwards incompatibility in this scenario because the brutal change in behaviour is motivated by a PEP, which basically made the exact previous implementation no longer viable. Also the previous behaviour and API still exists in setuptools, unchanged and bug-by-bug backwards compatible (although deprecated). Installers no longer call that API by default, though.

Just so I'm not seeing things: pkgutil is part of the python standard library, right? So that should make it the authority on how to iterate over submodules?

I don't think there is a standard about iterating over submodules. The pkgutil documentation clearly states its "non-standardness" and explicitly says it "only works" in some scenarios.

In this case, I cannot choose a different way to install, as the Zope-Maintainers (in their infinite wisdom) have decided that they require namespace packages for extensions. :-/

If you are writing a package, you can still chose its directory structure right? So you can use a src-layout in that scenario. Also you are the one installing the packages, right? So I suppose you can also choose with arguments to pass to pip install (which can include --config-settings editable_mode=...). That might be your escape hatch.

pfmoore commented 1 month ago

Please note however that this territory is "muddy" and that there are not many answers out there for the problems of editable installs and namespace packages:

To be completely clear here, it is 100% possible to implement the "legacy" mechanism that setuptools used in the past for editable installs. I believe this mechanism is now available in setuptools via the editable_mode=compat config setting.

The difficulties I have seen raised with editable mode have typically been either limitations of alternative implementation methods, bugs that were also present in the legacy mechanism, or feature requests for capabilities that were mot present in the legacy mode. While it would be nice to have these improvements, I don't believe the community has ever managed to come up with a consistent, clearly specified, and implementable of what such an "improved editable mode" would look like.

If the problems reported here are ones that did not exist before setuptools added PEP 660 support, and they can't be fixed by specifying editable_mode=compat, then I would say that they are a regression in the compatibility mode for editable installs. In all other cases, they are feature requests that may or may not be possible without changes to the core Python import machinery and/or the editable install specification.

The current state of the art is that all editable install mechanisms have limitations, as @abravalheri says. Setuptools gives you a choice of which limitations you prefer (most other build backends just pick one implementation technique and accept its limitations) but that's about the best you can hope for as things stand.

@dwt does editable_mode=compat fix your issue?

abravalheri commented 1 month ago

To be completely clear here, it is 100% possible to implement the "legacy" mechanism that setuptools used.

Hi @pfmoore, to be frank I think that compat is merely "compatible" (optimistically speaking) with the legacy mechanism that setuptools used in the past, but not 100% identical (Setuptools used complicated things like .egg-link and importlib hacks and it apparently involved reading/modifying stuff in site-packages). I don't think that the exact old approach is viable to implement given the current standard due to the decoupling of editable wheels and where they are going to be installed.

pfmoore commented 1 month ago

I think that compat is merely "compatible" (optimistically speaking) with the legacy mechanism that setuptools used in the past, but not 100% identical

OK, I wasn't aware of that. I'm disappointed that it was never raised in the PEP 660 discussions - we repeatedly characterised the existing mechanism as "just add a .pth file" and no-one contradicted that description. But that's water under the bridge now, the discussion was what informed the final specification and the various flaws in that discussion aren't something I have the appetite for revisiting (unless someone is willing to write a new PEP and manage a new discussion that way).

I'll retract the statement that editable_mode=compat not working like legacy editable installs is a regression. But it remains the closest we have to "how things used to work", and should probably still be the first thing to check if something has stopped working.

abravalheri commented 1 month ago

With the risk of sounding exaustive, I also would like to point out another thing that may have remained unsaid in this recent discussion, but is covered in the docs:

Legacy namespaces are not guaranteed to work in the current implementation of editable installs in setuptools.

This change is intentional, may be backwards incompatible, but I don't think there is much to do about it due to the reasons listed in the previous comment.

Support for namespace packages has been deprecated by a couple of years now and users are prompted to switch to native namespaces (and for better compatibility users are advised to use the src-layout with them).

abravalheri commented 1 month ago

I'm disappointed that it was never raised in the PEP 660 discussions - we repeatedly characterised the existing mechanism as \"just add a .pth file\" and no-one contradicted that description.

@pfmoore, that is also something that I learned/guessed after trying to "reverse engineer" the old installation when implementing PEP 660. To be honest, probably only the person first coming up with the .egg-link approach would be able to know that. The discussions were done in good faith and based on the best available knowledge.

We may revisit editable in the future based on the new knowledge that was gained via feedback, but that should be done in a "let's look forward" way. We should probably wait for input from core devs on namespaces (and other issues open in the cpython repo) first, though.

pfmoore commented 1 month ago

We may revisit editable in the future based on the new knowledge that was gained via feedback, but that should be done in a "let's look forward" way. We should probably wait for input from core devs on namespaces (and other issues open in the cpython repo) first, though.

+1. My personal view is that what we need to do in order to improve editable support is:

Get an accepted definition of what precisely is meant by an "editable install".
Ensure (via discussion, PRs, and where necessary, PEPs) that the core Python import system is able to support all the capabilities needed for that definition of an editable install (this includes things like importlib.metadata, and anything needed for typing).
Create a mechanism that implements this definition. This could be done by each backend individually, or by a common project like editables. That depends on how successful we are getting consensus on the definition in (1)...

When I accepted PEP 660, my view was that for (1) "what you get creating a .pth file pointing to a src distribution" was the community's accepted minimum spec for an editable install, and (2) and (3) followed from that (with the additional benefit that backends like setuptools could experiment with producing more capable editable installs). Where we are now, it seems like we have established that there is a desire for a better "minimum standard" and we're starting to get an idea of what core Python capabilities that will need. So we're making progress - slowly, like with everything in packaging, but steadily.

dwt commented 1 month ago

This discussion about what to do in the wider python ecosystem is probably necessary, but I would like to come back to my initial observation once more to make sure we are not missing a simple bug.

What I see there is that it seems that Products.__path__ (Product is the namespace package in this case) seems to be correct after python startup. It contains some extra items that seem to be correlated to how setuptools integrates itself, but also the real names of all the packages that are installed. That seems like it would work for tools like Zope, which just iterate that __path__ directly and filter out anything that doesn't exist in the filesystem.

However, after calling list(pkgutil.iter_modules(Products.__path__)) once, some of these entries are gone.

This seems to be part of the problem and something I might be able to contribute a fix towards.

But to do that, I need to understand: Is that actually a problem? Is there a reason why it would be 'correct' to remove these extra entries? If so, why?

dwt commented 3 weeks ago

ping? Could any of the setuptools maintainer give me a heads here please?

abravalheri commented 3 weeks ago

@dwt is there any PR you would like to propose/implement?

As previously noted we don't control the implementation of pkgutil, so the best we can do is to comply with the docs/specifications of the Python import system as defined in https://docs.python.org/3/reference/import.html, which indeed I believe we follow.

Otherwise please consider using a different installation mode (e.g. --config-settings editable_mode=strict or --config-settings editable_mode=compat) or migrating to a src-layout or not using editable installs at all, as they are not guaranteed to always work as stated in the docs.

pypa / setuptools