Open abravalheri opened 2 years ago
Hi @abravalheri
I actually think this is possibly simpler than you suggest - it looks basically like a simple case of having two different pkgA
modules on sys.path
- one the "real" module, the other the "accidental" namespace module created by using the package name as the name of the project directory. Which takes precendence is down to which one gets "found" first, which is essentially arbitrary.
I'd be tempted to say the simple solution here is "don't name your project directory the same as the package name in a monorepo". Does that fix the issue? If it doesn't, I'd be very curious as to why.
As regards this being a counter-argument for import hooks, I think it is (in the sense that it's a case where import hooks appear to have unexpected edge cases). I don't personally think that's a disaster, as I consider the .pth
approach as the "canonical" way of implementing editable installs, with other approaches still being experimental, and potentially having bugs. I know that the .pth
file approach also has limitations, but my view is that at least those limitations are well known, and known from experience to be tolerable.
I'd be happy to improve the hook-based approach, but I expect it will be a long time before I'd consider making it the recommended approach. I know I'm holding it to a higher standard than .pth
files, but I'm personally OK with that.
Thank you very much @pfmoore for the insights. I understand your view and the conclusions seem very reasonable.
At some point I want to experiment a bit more with importlib.abc.PathEntryFinder
rather than importlib.abc.MetaPathFinder
. I have the impression that PathEntryFinder
s are less subject to the namespace-related limitations than MetaPathFinder
s. Maybe I can use them to solve this other limitation too. I will keep you in the loop :)
Reopening as a reminder for myself. I do want to investigate this scenario more closely, and I think it's useful to have a discussion about import hooks and their limitations.
I think the key points for me at the moment are:
sys.path
locations, meaning that it's not easy to control shadowing in the case where the same name appears in more than one place in the import path. (This is exacerbated by the fact that implicit namespace packages mean that every directory is potentially a package).To be perfectly honest, I think that some of these issues are inherent in the design of the existing path hook mechanism[^1], and we're only hitting them now because outside of zipimport (which uses a path hook, not a meta hook) there's very little use of import hooks, so the edge cases haven't got exposed until now.
I wonder whether there is merit in looking at a mechanism similar to the existing .pth
file approach (but ideally implemented in the import machinery, not in site.py
) that allows declaratively exposing a much richer set of "overlays" on sys.path
. Something like the "virtual wheel" file idea from PEP 662, but implemented directly by the import machinery[^2], so that packaging tools just install the description file (and static analysis tools could use that file directly).
[^1]: And I say this as one of the authors of PEP 302 that introduced the model. [^2]: I say this with some reservations, as a lot of the reasons I rejected PEP 662 could still apply in that situation.
I wonder whether there is merit in looking at a mechanism similar to the existing .pth file approach (but ideally implemented in the import machinery, not in site.py) that allows declaratively exposing a much richer set of "overlays" on sys.path. Something like the "virtual wheel" file idea from PEP 662, but implemented directly by the import machinery2, so that packaging tools just install the description file (and static analysis tools could use that file directly).
The investigation that I had in mind was in this direction.
I was thinking about implementing a path hook that mimics the zipimport mechanism. I can imagine a __editable__.<package-name>.json
file being added to site-packages containing metadata to describe the contents of the package. This metadata could be used at the same time by an import hook and by static code analysis tools.
My hopes are that by using a PathEntryFinder
instead of a MetaPathFinder
we could surpass limitations 1 and 3. I was also hopping that the existence of a file containing a static description of the contents of the package would satisfy static analysis tools that cannot afford the dynamic nature of import hooks.
You may still have issues regarding (3), because the json
file would need to be added to sys.path
somewhere. To accurately match the "historical" behaviour here, maybe add an __editable__.<package-name>.pth
file containing the name of the .json
file alongside the json
file itself. Then the .pth
file will be processed "as normal" (i.e., "in the same way as the old setup.py develop
command worked") and add the .json
file to sys.path
.
But yes, I quite like this idea. And the fact that both you and I came up with something similar suggests that it has potential 🙂
I don't have much time at the moment to work on this, but yes, please, let me know how you get on (and if you need help, feel free to ask).
I've been looking at this issue again. I'm trying to understand what the actual problem is with namespace packages. The root of my confusion is that basically, a top level non-namespace package is (in effect) "owned" by a single project. So you can't have two projects "foo" and "bar" that both own the top-level name "foobar". Or rather, you can, but the packaging machinery doesn't protect you from any mess that you get in (you can install them both, but what you get is a mess with the two projects overwriting each other).
With that in mind, I'm embarrassed to admit that it's possible the "namespace issue" simply isn't an issue in practice, and I've been spreading FUD 🙁
I'm pretty sure it's possible to create a top-level namespace package using a path hook (I have a proof of concept, but it is very manual, so I won't post it here). That "solves" the problem of how to install, in editable mode, a namespace package. But that's not the way I see people typically discussing namespace packages - they seem to be looking more at something along the lines of installing a project mypkg
, where mypkg.plugins
is a namespace package that code in mypkg
will use. But that makes no sense to me, as it's not possible to create a project that installs something in mypkg.plugins
without "owning" the name mypkg
(assuming that mypkg
is a real package, not a namespace). So how do you distribute a plugin?
So in reality, I don't understand the actual use case for namespace packages in wheels.
@abravalheri I assume you have more experience with people actually wanting to use namespace packages in "real life". Can you explain to me what I've missed here?
Hi Paul, thank you very much for the project.
It was really important for me as a reference when implementing PEP 660 support in setuptools. I still plan to contribute to editables in terms of namespace handling once the details are flashed out (this may depend on the instructions we get in
https://github.com/python/cpython/issues/92054
).I am opening this issue to seek for advice/have a general discussion about using import hooks in editable installs.
Recently in the
setuptools
repository we got an issue reporting that the editable installation of a regular package (ie. non-namespace) don't behave in the same way as a normal installation. I understand thateditables
would also have the same problem. So I wanted to share with you this particular use case and check what are your thoughts.Problem description
Let's imagine a monorepo that contains multiple directories, each one of them corresponding to a Python project, like the following:
When the users perform a regular installation of one sub-directory, they are able to run scripts importing the installed package; even when the "current working directory (CWD)" is the root of the monorepo. This is demonstrated in the snippet bellow:
In normal conditions, "stuff" in the CWD should take precedence over "stuff" in site-packages. However, this rule does not seem to apply for namespace packages in CWD.
If I had to make a wild guess, I would say that the following is happening:
ModuleSpec
for the namespace in CWDModuleSpec
with a new (regular) one pointing to the directory with__init__.py
.When using a import hook/meta path finder, the behaviour differs, and the users are surprised with an error:
In this scenario the "accidental" namespace package created by having
""
onsys.path
takes precedence over the import hook. Since the objective of an editable install is to operate as close as possible to a regular install, this use case could be used as a counter argument for import hooks.