Open jaraco opened 1 month ago
In #35 and #36, I've drafted a couple of possible approaches.
I don't particularly understand the logic behind the "essential layout" - there's no pyproject.toml
shown in the layouts described, so I don't see the full picture here. But I think that's mostly irrelevant except as a motivating example.
The biggest issue with supporting (implicit) namespace packages is that they are documented as responding dynamically to changes to the filesystem content, in a way that import hooks cannot implement without replacing the whole filesystem loader from importlib (there's undocumented support code in importlib that provides essential functionality).
While it's possible to argue that omitting that dynamic behaviour is "good enough", I'm wary of getting sucked into such arguments - for many years, the .pth
file mechanism was "good enough", but now people say it's not, while not being able to precisely define what their new boundaries for "good enough" behaviour are[^1].
Answering your specific questions:
Is this use-case something that was considered and rejected?
Yes, because of the problem implementing the dynamic behaviour of implicit namespaces, as discussed above.
Do you have any specific goals that would prevent adding support to the Redirector (with or without support in EditableProject)?
The main one is "keep things simple". This project is not intended to be the last word in editable install support, it's intended as a simple, fairly straightforward implementation that can be easily used by build backends with a basic need for editable support.
A secondary answer is that I view the import hook approach as a "last resort", and strongly recommend using add_to_path
and add_to_subpackage
wherever possible. The map
method is very much "use at your own risk".
Is there another construct/approach that's meant to address this need?
The documentation has a whole section on supported use cases. I believe the section on "Project directory installed under an explicit package name" covers what you're talking about. If not, then I'm afraid I don't understand the specifics of your use case well enough.
I'm not flat-out rejecting this idea, but I'll need to understand why you need this specific solution before I'm willing to go down this route. In particular, I consider import hooks to be very much an "advanced" implementation technique, which should almost never be needed in practice - the stdlib approaches for manipulating sys.path
are (in my view) more than adequate for most realistic needs (assuming that people are looking for a practical solution, rather than some form of theoretical purity - which sadly doesn't always seem to be the case).
[^1]: Yes, I am bitter about that 🙁
The documentation has a whole section on supported use cases. I believe the section on "Project directory installed under an explicit package name" covers what you're talking about. If not, then I'm afraid I don't understand the specifics of your use case well enough.
I hadn't read the docs, but I read the implementation, and thus had already determined that the technique is non-viable for implicit namespace packages because of how it creates a __init__.py
(thereby breaking implicit namespace packages). If that namespace exists elsewhere on sys.path, adding an __init__.py
to one location will break discovery of the other location. To illustrate, consider the ns.pkgA
and ns.pkgB
packages in the implicit namespace ns
:
draft @ mkdir -p path1/ns/pkgA
draft @ mkdir -p path2/ns/pkgB
draft @ touch path1/ns/pkgA/__init__.py
draft @ touch path2/ns/pkgB/__init__.py
draft @ env PYTHONPATH=path1:path2 py -c "import ns.pkgA; import ns.pkgB" && echo 'works!'
works!
They're both present on different sys.path entries (as that's a key part of how these namespace packages work).
But now you want to editable-install pkgA using the add_to_subpackage
technique:
draft @ mkdir -p project/pkgA
draft @ touch project/pkgA/__init__.py
draft @ rm -r path1/ns/pkgA
draft @ cat >> path1/ns/__init__.py
__path__.append('project')
draft @ env PYTHONPATH=path1:path2 py -c "import ns.pkgA; import ns.pkgB" && echo 'works!'
Traceback (most recent call last):
File "<string>", line 1, in <module>
ModuleNotFoundError: No module named 'ns.pkgB'
Now, although ns.pkgA
still imports fine, ns.pkgB
is missing. When the PathFinder
found path1/ns/__init__.py
, it created a simple package and stopped searching the path for more packages in the namespace. Editable-installing pkgA caused pkgB to go missing.
The aforementioned example shows a problem with the namespace in two different Python paths, but another problem arises even with everything installed on the same Python path. If ns.pkgA
and ns.pkgB
are both editable-installed to the same site-packages, only one of their ns/__init__.py
files will win and the losing package will be missing.
No, the __init__.py
is added to ns/pkgA
, not to ns
.
>>> from editables import EditableProject
>>> my_project = EditableProject("foo", "/path/to/foo")
>>> my_project.add_to_subpackage("ns.pkgA", "src/pkgA")
>>> list(my_project.files())
[('foo.pth', ''), ('ns/pkgA/__init__.py', "__path__ = ['C:\\\\path\\\\to\\\\foo\\\\src\\\\pkgA']")]
Your test is testing how you think the method works, rather than how it actually does work...
Your test is testing how you think the method works, rather than how it actually does work...
Wow! Okay. I had considered that possibility, but I was 95% sure that would not have the intended effect, that you could only cause additional submodules and subpackages to be added to a package, but you couldn't replace the package being imported. My suspicion was reinforced by reading the docs that say
For this use case, the
project.add_to_subpackage
method is available. This method creates themy.namespace
package (by installing an__init__.py
file for it into site-packages) and gives that package a__path__
attribute pointing to the source directory to be installed under that package name.
The fact that __init__.py
is installed for my.namespace
made me think that the namespace was getting the __init__.py
, not a package (or module?) in that namespace.
Attempting to confirm this approach, it doesn't seem to have the intended effect. Any behavior in src/pkgA/__init__.py
is missing.
draft @ mkdir -p ns/pkgA
draft @ cat >> ns/pkgA/__init__.py
__path__ = ['/Users/jaraco/draft/src/pkgA']
draft @ mkdir -p src/pkgA
draft @ cat > src/pkgA/__init__.py
val = 42
draft @ py -c "import ns.pkgA; print(ns.pkgA.val)"
Traceback (most recent call last):
File "<string>", line 1, in <module>
AttributeError: module 'ns.pkgA' has no attribute 'val'
I confirmed that I am able to import other submodules of pkgA
, just not its root module __init__.py
:
draft [1] @ touch src/pkgA/foo.py
draft @ py -c "import ns.pkgA.foo" && echo 'works!'
works!
This is so tantalizingly close to what I need. I wonder if the add_to_subpackage
shouldn't add another line to the generated __init__.py
as (essentially) execfile(__path__[0] + '/__init__.py')
.
Is this ns.pkgA
example something you'd like EditableProject
to support, even if that means having to explicitly execute the root module?
Hmm, that's a good point. And the approach of reading and executing __init__.py
in the generated one is a plausible fix.
But what I really need here are some good, small, worked examples of why you want to do this. Not particularly in the context of your larger "essential layout", but just as a straightforward Python project. From my perspective, namespace packages over the years have always felt like they have suffered from a "solution looking for a problem" mindset - the best example of this is the dynamic path recalculation behaviour in the original PEP - the example in the PEP shows a situation where dynamic recalculation is needed and would work, but it doesn't give any real-world example of why you'd want to do this sort of runtime path manipulation. And in my experience, most packaging users never do need this sort of thing - sys.path
gets set up by the runtime, and never altered during program execution. So we have a complex, difficult-to-support mechanism, which in practice is never needed.
And in this case, the complexity is non-trivial - execfile
doesn't exist any more, so we need to read the file content (handling things like encoding declarations) and then exec
it (handling __future__
imports somehow), etc, etc. All of that is a source of potential bugs, and while it's true that we'll only get bug reports if the functionality is useful to people, I'd like to be sure, before we add the code, that there isn't an alternative of saying "why don't you do the following instead, which gives the same result?"
That means understanding why people want to do this in the first place. After all, the PyPA recommended project layout has the project files located in src
, and putting src
onto sys.path
by using a .pth
file "just works". So as far as I'm concerned, every use of editable installs that doesn't just use add_to_path
is at some level a hack to work around people not following the standard project layout recommendations. So it's important to understand why they aren't following the recommendation, and how far we can go with requiring them to change their project structure to better conform to the existing recommendations.
(Somewhat off-topic rant)
I find this pretty frustrating, because the general feeling in the packaging community is that we should be more willing to enforce an opinionated standard, with one "blessed" packaging tool that manages your project for you, like cargo
in the Rust community. And yet, that approach implies a standard project structure - which we already have in the src
layout - but only supporting that layout feels like a non-starter. I'm almost inclined to start a conversation on the Packaging Discourse, suggesting that if we, as a community, want to go down the route of standardising tools (as opposed to just interoperability protocols) then we should start by declaring the src
layout as the official PyPA project layout, and making a statement that tools are only required to support that layout. But it's borderline trolling, and I don't think the community needs another epic "strategy" debate right now...
In coherent-oss/coherent.build#2, I'm working on adding editable support for Coherent System projects that follow the essential layout. And while I recognize this project primarily aims to support the src layout, it would be nice to re-use the redirector functionality for projects using the essential layout.
At first blush, it doesn't appear to me as if the editables project can support (PEP 420) namespace packages in any layout. That is, if a project like
jaraco.functools
orcoherent.build
doesn't have any__init__.py
in the root namespace, it is incompatible with the redirector (because the target of a redirection needs a file location and there's no file location for a namespace package).Moreover, any redirector should probably support redirecting the inner module/package and not the namespace package so that the editable-installed package can exist alongside other packages sharing the same namespace (e.g.
jaraco.functools
editable-installed alongsidejaraco.context
installed on the file system).I believe it should be possible for the redirector to support redirecting namespace submodules. In particular, I'm imagining something like:
The redirector would then expose the parent
jaraco
module (if it doesn't already exist) and thejaraco.functools
module (from the indicated path).Maybe we'll want to add something to #24, but I wanted to start with a dedicated issue to discuss the possibilities and concerns.
Some questions:
Is this use-case something that was considered and rejected?
Do you have any specific goals that would prevent adding support to the Redirector (with or without support in EditableProject)?
Is there another construct/approach that's meant to address this need?