pypa / setuptools

Official project repository for the Setuptools build system
https://pypi.org/project/setuptools/
MIT License
2.48k stars 1.18k forks source link

Due to limitation, static analysis don't support import hook used in editable install v64+ #3518

Open JacobHayes opened 2 years ago

JacobHayes commented 2 years ago

setuptools version

setuptools==64.0.2

Python version

3.10

OS

macOS, Linux

Additional environment information

No response

Description

After installing editable packages (with a pyproject.toml) with setuptools >=64, mypy is no longer able to find the py.typed files when checking a codebase importing/referencing the editable installed package.

This was opened as a mypy issue in https://github.com/python/mypy/issues/13392, but they recommended we open an issue here. I'm guessing there will be some further discussion on both sides, but just getting the ball rolling.

Expected behavior

Editable installs are still discoverable by static type checkers.

How to Reproduce

See the README in https://github.com/JacobHayes/editable-install-hooks-repro

Output

$ mypy --namespace-packages --explicit-package-bases src
src/org/pkg2/__init__.py:1: error: Cannot find implementation or library stub for module named "org.pkg1"
src/org/pkg2/__init__.py:1: note: See https://mypy.readthedocs.io/en/stable/running_mypy.html#missing-imports
Found 1 error in 1 file (checked 1 source file)
$ pylint src/
************* Module org.pkg2
src/org/pkg2/__init__.py:2:0: E0611: No name 'pkg1' in module 'org' (no-name-in-module)

-----------------------------------
Your code has been rated at 0.00/10
abravalheri commented 2 years ago

Hi @JacobHayes thank you very much for bringing this topic up.

A different way of describing the issue would be stating that tools doing static analysis do not support import hooks (which is a very fair limitation, it makes total sense). Please note that this is not specific to setuptools. For example other backend using the editables might also rely on import hooks. I would also not classify it as a bug.

The links you shared in your example repository are very relevant (specially the email thread). In addition to that I would also mention:

Looking on a way forward on how to handle these circumstances, I believe that first we need to design a solution having inputs and buy-in from both sides (packaging and static analysis).

One of the solutions proposed in the email thread (a .json file with metadata pointing out the locations for each package) seems to be a good start point. I am very happy to support a similar solution that would enhance the experience with import hooks.

To pursue this topic further in the setuptools side we would need consensus on such mechanism. This should probably come as a packaging PEP.

I don't know if the setuptools issue tracker is the right forum for proceeding with this discussion, since there is no clear action point for setuptools to take. I recommend anyone interested in such a feature to engage in the mailing list discussion, or on the thread on discourse. We probably need a champion to push it forward :)

For the time being, if you need to integrate type checking and editable installations, you can try out the strict mode, or if you own the project to swap to a src-layout.

JacobHayes commented 2 years ago

Why move forward with the install hooks if it was known there was no consensus yet? setuptools (v64 and used ~everywhere) seems quite different than editables (v0.3 and used in a couple spots / where?).

try out the strict mode

I'll test this out within the next couple days. Is there any reason this isn't the default, to reduce the number of breaking workflows by default?

if you own the project to swap to a src-layout.

Can you elaborate on this? I think the example repro repo I linked is in src layout, but still does not work. Is there a part I'm missing there?

--

Thanks for your work on setuptools, I know packages like this in particular are in a hard spot in regards to feature evolution.

abravalheri commented 2 years ago

Why move forward with the install hooks if it was known there was no consensus yet? setuptools (v64 and used ~everywhere) seems quite different than editables (v0.3 and used in a couple spots / where?).

There is consensus in terms of packaging. This is expressed in terms of PEP 660, which was debated over and over again in the community. The PEP's text recognizes import hooks as perfectly valid implementation strategy. Import hooks have been part of Python's feature list for a long time. There are even other built-in behaviours (e.g. zip imports) that rely on these dynamics. Just because they are features that static analysis tools have problems to deal with, it does not mean that they are invalid.

As an anecdote: hasattr() is an extremely useful feature of Python and until the other day, code analysers/type checkers had problems to deal with it. It does not mean that everyone should stop using it...

The lack of consensus that I describe is in what Python's package ecosystem can do to make it easier for static analysis tools to deal such kind of implementation.

Is there any reason this isn't the default, to reduce the number of breaking workflows by default?

The strict mode potentially is not what users expect to find by default. When using editable installs, users expect to be able to create new files, or delete old ones and have them automatically picked up without requiring a new installation. The strict mode does not offer that capability.

To offer a "minimal surprise" experience, the default editable installation tries to be a bit more "lenient"...

Can you elaborate on this? I think the example repro repo I linked is in src layout, but still does not work. Is there a part I'm missing there?

It is very definitely almost what I would expect for a src-layout... However, there is a very subtle customization that you are passing to setuptools, which requires special ad-hoc handling.

When you type find_namespace_packages(include=["org.*"], ...), you are effectively instructing setuptools to take any sub-package of org (e.g. org.pkg1 or org.pkg2) from your project directory and put that in the distribution, but not the org package itself (org does not match org.*). This makes setuptools' life harder, since it has to guarantee that the immediate contents of the org folder other than subpackages like pkg1 or pkg2 don't end up in the distribution by mistake. The way setuptools replicates this behaviour in the editable installation is by using import hooks (or file links in the case of the strict mode).

The following change may make what you want possible:

diff --git i/pkg1/setup.py w/pkg1/setup.py
index 406504e..5097881 100644
--- i/pkg1/setup.py
+++ w/pkg1/setup.py
@@ -3,7 +3,7 @@ from setuptools import find_namespace_packages, setup
 setup(
     name="org-pkg1",
     version="0.0.1",
-    packages=find_namespace_packages(include=["org.*"], where="src"),
+    packages=find_namespace_packages(include=["org*"], where="src"),
     package_dir={"": "src"},
     package_data={"org.pkg1": ["py.typed"]},
     python_requires=">=3.9",
diff --git i/pkg2/setup.py w/pkg2/setup.py
index 5ee4648..a48c94f 100644
--- i/pkg2/setup.py
+++ w/pkg2/setup.py
@@ -3,7 +3,7 @@ from setuptools import find_namespace_packages, setup
 setup(
     name="org-pkg2",
     version="0.0.1",
-    packages=find_namespace_packages(include=["org.*"], where="src"),
+    packages=find_namespace_packages(include=["org*"], where="src"),
     package_dir={"": "src"},
     package_data={"org.pkg2": ["py.typed"]},
     python_requires=">=3.9",

(or just removing include, or just removing packages= entirely and relying on auto-discovery, or using include=["org", "org.*"]).

ofek commented 2 years ago

For example other backend using the editables might also rely on import hooks. I would also not classify it as a bug.

Hatch does not by default for this reason https://hatch.pypa.io/latest/config/build/#dev-mode

abravalheri commented 2 years ago

Hatch does not by default for this reason https://hatch.pypa.io/latest/config/build/#dev-mode

If that is the reason, does it mean that you would also be supportive of the solution proposed in the email thread (a JSON file that would support static analysis tools to find the modules handled by import hooks)?

ofek commented 2 years ago

Sure 🙂

DanielNoord commented 2 years ago

We're running into similar issues with pylint although I'm not sure it is exactly the same issue. We do support import hooks and try to resolve those so it might be better to open a standalone issue for our problem. If preferred, please let me know.

A small summary: pylint uses astroid to generate the ast tree of python files with some additional nice-to-haves to help use perform the analysis. astroid tries to mimc Python import behaviour with its interpreter._import module: https://github.com/PyCQA/astroid/tree/d033fe2f4c575d45a1706c52988dbcdf50645c5c/astroid/interpreter/_import On setuptools<63 we were able to discover editable installs through sys.path and looking for the __init__.py file. The code that is responsible for this is: https://github.com/PyCQA/astroid/blob/d033fe2f4c575d45a1706c52988dbcdf50645c5c/astroid/interpreter/_import/spec.py#L135-L145

This find the package and returns a ModuleSpec which resembles a importlib.ModuleSpec.

On setuptools>=64 this no longer works. I assume it is because there is a difference in the effect pip install -e has on sys.path but I'm not sure. Does anybody who is knowledgable about the new editable system see how we should resolve imports to editable packages?

The pylint issue that tracks this is https://github.com/PyCQA/pylint/issues/7306. Feel free to continue the discussion there if you want to keep this focused on the original issue.

abravalheri commented 2 years ago

Hi @DanielNoord thank you very much for reaching out.

it might be better to open a standalone issue for our problem. If preferred, please let me know.

Please feel free to open a new issue if you can isolate the problematic behaviour and you think it is different than the one discussed here.

We do support import hooks

Does it mean that pylint will run the import hooks installed by setuptools (e.g. _EditableFinder and _EditableNamespaceFinder) when trying to resolve imports?

On setuptools<63 we were able to discover editable installs through sys.path and looking for the __init__.py file. The code that is responsible for this is: https://github.com/PyCQA/astroid/blob/d033fe2f4c575d45a1706c52988dbcdf50645c5c/astroid/interpreter/_import/spec.py#L135-L145

This find the package and returns a ModuleSpec which resembles a importlib.ModuleSpec.

The link seems to point out that previously you were using a ImportlibFinder instead of the importlib.machinery.PathFinder provided in the standard library. This should work for any static .pth file in the site directory I suppose...

However going forward in version > 65 the .pth files added to the site directory can be dynamic (and install custom import hooks). I would expect that finders implemented by setuptools (e.g. _EditableFinder and _EditableNamespaceFinder) would be the ones to find the modules instead of pylint's ImportlibFinder (which is fine: according to the Python docs the finders would run in sequence until one of them finds the module...)

As you confirmed that pylint does support custom import hooks, my interpretation is that pylint runs the custom finders and used the ModuleSpec objects that setuptools is returning. Now I don't understand why this circumstance would be problematic. Could you please create a reproducer showing pylint calling the custom finders installed by setuptools but resulting in problems?

DanielNoord commented 2 years ago

We do support import hooks

Does it mean that pylint will run the import hooks installed by setuptools (e.g. _EditableFinder and _EditableNamespaceFinder) when trying to resolve imports?

I think we technically could, but are not doing so right now. Do these import hooks still work on the newer versions? That might be something to explore for us.

However going forward in version > 65 the .pth files added to the site directory can be dynamic (and install custom import hooks). I would expect that finders implemented by setuptools (e.g. _EditableFinder and _EditableNamespaceFinder) would be the ones to find the modules instead of pylint's ImportlibFinder (which is fine: according to the Python docs the finders would run in sequence until one of them finds the module...)

Yeah, this is what we do as well. Although I don't think our support for custom import hooks is complete yet. I'll try to see if I can get it to work with the ones you are referring to.

Edit: Sorry for the confusion. Seems like we did not support custom import hooks, I was messing up some terms.

I have created a PR that does support them and correctly finds the new editable packages. However, since other linters are seemingly hesitant to add support for custom import hooks I'm wary of supporting them in pylint and astroid right away. I guess we need to do some analysis about the potential effect of this support, but I'm not sure how to do that most effectively.

Anyway, I guess for pylint this issues seems resolvable or at least our case doesn't add much more to the general problem static analysers are having.

abravalheri commented 2 years ago

Edit: Sorry for the confusion. Seems like we did not support custom import hooks, I was messing up some terms.

No problems, this is one of the most confusing aspect of Python 🤣

I suppose this means that this issue already captures the challenges with pylint interoperability and we don't need to create a new one.

I have created a PR that does support them and correctly finds the new editable packages. However, since other linters are seemingly hesitant to add support for custom import hooks I'm wary of supporting them in pylint and astroid right away.

In the mailing list pointed out by @JacobHayes, there is a suggestion that could work as a more long-term/stable approach (probably as a packaging standard/PEP). In summary the suggestion is to have a JSON file placed somewhere in the site-packages directory, containing static paths and metadata for modules in the editable installation.

Do you think that would be a better (or more acceptable) approach for pylint?

I have been a bit short of time lately, but I think I could work with that... I also think that other people in PyPA would be supportive of this approach.


Meanwhile, this information may be relevant: setuptools will try to use static .pth files as much as possible. Projects that use the so called src-layout should be available without the need of import hooks.

If users want to use flat-layout, they can circumvent the limitations by using the strict mode.

DanielNoord commented 2 years ago

Do you think that would be a better (or more acceptable) approach for pylint?

If that works for other tools it should for us as well. Pylint is notoriously bad with importing/discovering packages, especially when namespace packages are involved. Since we are wary of potential security risks we are always hesitant to change stuff, especially since some of the code was written 12 years ago by people we no longer know.

That said, pylint strives to replicate normal import behaviour so I think eventually we would always try and support custom import hooks and look at such a JSON file as a last resort.

hmc-cs-mdrissi commented 2 years ago

Can we specify more what .pth file lacks that json approach would support? Is the main one being able to import/include specific files and not whole directories? Main question is a new source of import information easier tooling change then a change to .pth file format?

abravalheri commented 2 years ago

Can we specify more what .pth file lacks that json approach would support?

In terms of packaging .pth files are considered a flawed approach when the package follows the so called flat-layout. For this kind of layout, a .pth file ends up "installing" things that are not supposed to be installed. In certain cases they also make things harder to debug, because the user might not realise certain files are not included by default in the distribution (setuptools offers a strict mode now that is very helpful on these scenarios and should interoperate with static analysis).

The proposed JSON file would list the corresponding paths for specific packages/modules, so static analysis tools could inspect them without resorting to dynamic code evaluation. This is not directly comparable to a .pth file, since the .pth file only adds entries to sys.path.

Main question is a new source of import information easier tooling change then a change to .pth file format?

I believe so. Changing how .pth files involves creating potentially non-backwards modifications to Python's importlib machinery. Adding a new source of information is something that can be sorted out in the tooling level, while preserving the backwards behaviour of .pth file.

debonte commented 1 year ago

Hatch does not by default for this reason https://hatch.pypa.io/latest/config/build/#dev-mode

pdm also requires users to explicitly opt-in to the import hook approach -- https://pdm.fming.dev/latest/pyproject/build/#editable-build-backend

Maybe setuptools should do something similar? Solving this issue "the right way" (ex. new file consumed by tools) is going to take a while (many months). Requiring users to opt-in could mitigate the problem in the meantime.

abravalheri commented 1 year ago

Maybe setuptools should do something similar?

When we analyse the problem from the optics of type-checking this seems reasonable.

However there are other things in play: there is a series of setuptools issues that cannot be fixed with the previous implementation. Moreover, import-hooks help us to that match more closely a regular installation, and this is also very helpful to identify bugs before packages are published.

The TLDR is that an explicit opt-in mitigates some problems but re-open other ones.

Please also note that there is already a series of compromises and trade-offs in place: whenever "safe" setuptools will not use import-hooks.

There is also quite a few scape hatches for the case users are more concerned with type-checking than the other packaging problems, so they can "opt-out".

gramster commented 1 year ago

It's worth noting that 'static analysis tools' also include the tools used to provide rich editing experiences in modern editors; in particular, type inference and thus auto-completion suggestions and hover information rely heavily on static analysis. This goes way beyond type checkers.

abravalheri commented 1 year ago

Hi @gramster, please note that this fact that you are describing is well-known and acknowledged. In my previous comments (e.g., https://github.com/pypa/setuptools/issues/3518#issuecomment-1214137772, https://github.com/pypa/setuptools/issues/3518#issuecomment-1214194979, etc...), I always take care to write "static analysis tools" instead of type-checking.

In the https://github.com/pypa/setuptools/issues/3518#issuecomment-1262076592 message I am deliberately addressing type-checking in particular, because there are other kinds of static analysis that (in some particular scenarios) that could incur in false positives/negatives if the previous implementation is used.

Just as an example/anecdote, in the case configurations that uses an incorrect packages=find_packages(...) statement, some top-level files/packages might be omitted from the wheel. Static analysis tools that rely on .pth files would not be able to identify missing files/packages.

debonte commented 1 year ago

there is already a series of compromises and trade-offs in place: whenever "safe" setuptools will not use import-hooks.

Right. I was really happy to see that in the code. Do you happen to have telemetry (or other data) that would tell us how many users are using each packaging strategy?

There is also quite a few scape hatches for the case users are more concerned with type-checking than the other packaging problems, so they can "opt-out".

I noticed the following language in the Legacy Behavior docs: If...you wish to replicate the legacy behavior, for the time being you can also perform the installation in the compat mode

What is the timeframe for removing compat mode?

abravalheri commented 1 year ago

Hi @debonte.

Do you happen to have telemetry (or other data) that would tell us how many users are using each packaging strategy?

I am afraid setuptools does not collect telemetry. You might be able to achieve some rough estimation via PyPI's datasets and inspector, although this method will not cover private packages or reflect 100% on the real world usage.

What is the timeframe for removing compat mode?

So far, the end of the year has been stablished as "grace-period" for the users to adapt to the PEP 660 implementation. But as commented in https://discuss.python.org/t/help-testing-pep-660-support-in-setuptools/16904/61, if anyone is strongly interested in pushing this further or keeping it permanently, I suggest contacting the other setuptools maintainers to gather support. I don't have strong feelings about this, but I also don't want to unilaterally decide to keep extra code paths in setuptools, which impose maintenance burden.

Please also note that users should be able to run static code analysis with the strict mode. My intention is also to keep SETUPTOOLS_ENABLE_FEATURES="legacy-editable" for as long as pip allows legacy installs.

hauntsaninja commented 1 year ago

Is there a way to achieve --config-settings editable_mode=compat via an environment variable?

cswartzvi commented 1 year ago

Hi @abravalheri,

So far, the end of the year has been stablished as "grace-period" for the users to adapt to the PEP 660 implementation. But as commented in https://discuss.python.org/t/help-testing-pep-660-support-in-setuptools/16904/61, if anyone is strongly interested in pushing this further or keeping it permanently, I suggest contacting the other setuptools maintainers to gather support. I don't have strong feelings about this, but I also don't want to unilaterally decide to keep extra code paths in setuptools, which impose maintenance burden.

Is there any update on the removal of compat mode? Has there been any decision on keeping it around permanently or has the "grace-period" simply been pushed back? Thanks!

abravalheri commented 1 year ago

Hi @cswartzvi it's been pushed back. I believe that compat is not the final solution for this problem, but I haven't had the time to investigate and propose something more permanent yet.

As a general concept I believe that it would be possible to come up with a static file (e.g. JSON) describing editable package locations that could at the same time feed static analysis tools and a MetaPathFinder/PathEntryFinder (I briefly mentioned it in https://github.com/pfmoore/editables/issues/21#issuecomment-1227059911). Of course, this needs a PoC implementation + a round of discussion in the community.

s-banach commented 1 year ago

I just came here to leave a comment on the language in https://setuptools.pypa.io/en/latest/userguide/development_mode.html

The way it's currently worded feels like "we either don't know or don't care about the issue with type checking". If the documentation is changed to say "compat is a temporary solution until we reach some agreement with the static type checkers on how to handle editable installs", people like me might not show up here to rant and rave.

cswartzvi commented 1 year ago

Thank you @abravalheri that is helpful - I will watch this space for updates.

Perhaps you can help me with a conceptual question (that also might be related to @s-banach's comment about the docs). What is the default editable install mode? I can see in the code that the default is called "lenient" but how does that differ from 'compat'? Forgive me if this is a naïve question, even though I have been using python for a long time this is still all very - confusing.