Closed ofek closed 1 year ago
As the error says, the map
method is only intended to add top-level modules. Even if that restriction were lifted, the import hook doesn't support namespace packages (see the docs, but this is basically a limitation of how namespace packages are implemented by Python).
I'd be willing to accept a PR that added support for this use case, if someone can find a way to make it work, but it's not one I plan on working on myself.
IS this an issue on this project or in general? There's a similar issue in scikit-build-core
that we want to address, and I am wondering if there's any pointers for this.
The first thing needed is for someone to work out how to implement an import hook that provides the requested functionality. That's a general issue - there aren't many people working on doing clever things with import hooks, so there's not a lot of expertise in this area to draw on. If you manage to address your issue in scikit-build-core
, your solution might help here, and vice versa.
Once it's demonstrated that it is possible to implement the functionality, there is a follow-up question for this project as to whether I want to add that capability. That will depend on the complexity and maintainability of the support code - my goal for this project is to provide lightweight support for key use cases, not to provide every possible variation on "editable installation" that people can think of. Other projects may make different choices.
There is also a third question, which is more of a "community consensus" question than something any individual project can answer, which is what actually is an editable install. The original form of the concept was literally nothing more than a .pth
file which added the project source directory to sys.path
, and that served everyone fine for years. Features like this one were never even possible under that model. In this project, add_to_path
method exposes this mechanism. Anything beyond that is essentially a "new feature request" and (IMO) needs to be justified with use cases and a clear understanding of the costs and benefits. But again, that's only my view as it relates to this project. Other projects and developers may have a different position.
I believe this is quite valuable to have in order to modernize projects into using importlib.resources
e.g. I have a configuration for hatchling
:
[tool.hatch]
build.include = [
"python/my_pkg",
"data",
]
[tool.hatch.build.sources]
"python/my_pkg" = "my_pkg"
"data" = "my_pkg/data"
Where in the code I have:
importlib.resources
root = importlib.resources.files("my_pkg.data")
The project is not python centric so I cannot impose a file restructuring for that. I imagine other people who would want to add python bindings to a project would be having a similar issue.
If there is any progress on scikit-build-core
on this point I will ping you about it. @henryiii has the general gist of how to implement this and I will be looking at this project for some inspiration.
Hello,
It seems that I have a similar need for this Issue.
I am currently working on creating a python project template, and I would like to use my current developing package for my testing.
To describe it, in my project, I have a src
folder containing my source code, and a tests
folder containing my test.
I am using the rewriting system of hatch from [tool.hatch.build.sources]
to rewrite the src
to match my package name, with multiple folder as company.package.tools
.
Now comes my issue, I am currently thinking about a system to manage my package for the testing part.
Indeed, I want to test my package by importing it as it will be named when deployed.
So instead of importing my package as src
, I thought about importing it as company.package.tools
.
I also want to edit my current package as it will be in development phases.
So, I try to import my package as editable and execute:
pip install -e .
But, here come this issue. I do not know if this use case could be a motivation to develop a possible solution. Also, I do not know if this issue can be handled by pytest, I need to search more information on that. But, if you have an alternative solution for that, I am interested :)
Thanks for the explanations. These seem very much related to the hatch feature of remapping directories (not surprising as the request was raised by the hatch author). Is that feature unique to hatch? If not, how do other build backends handle this issue? And equally, how did you handle it before general editable install support was available, when you could only get editable installs using setuptools (and the .pth
file approach)?
My reason for asking is because I want to make sure that any features I add to this library are generally useful, and correspond to features common across backends, rather than having to maintain functionality that's only of use to a single backend.
I think it would help with having less code duplication. I have just tested the equivalent setuptools
approach and it works, (maybe except for dynamically generated files)
[tool.setuptools.package-dir]
"my_pkg" = "python/my_pkg"
"my_pkg.data" = "data"
And on the implementation side it points the .pth
to a python script which contains all of the generated aliases of the files. For reference:
Of course that is specific to setuptools and we would need to either re-implement this feature on each build system or have a common interface for it, which this project would make sense to live in.
As for being used by multiple backends, scikit-build-core
also needs this feature, not just hatchling
.
OK, thanks. So setuptools implemented this independently. I agree it would be useful to have it centralised, but setuptools chose not to use this library, so we'll end up with independent implementations. I'm not entirely comfortable having to monitor the setuptools implementation and replicate any fixes they make as an ongoing exercise. But we can worry about that later, for now let's just work on seeing if this can be added here. I will note, though, that I'm not interested in this library competing with setuptools, so "working like setuptools" is explicitly not a requirement here[^1].
If someone wants to make a PR implementing this, I will consider it. But some points I should make up front:
.pth
approach and the existing import hook).I'd also like it to have tests. The existing tests aren't great, but the more complexity gets added, the more important good test coverage will be.
[^1]: It's arguable that's what the add_to_path
method did, as that implemented the old setuptools approach. But setuptools changed their approach when they implemented PEP 660, which is fine, but I don't plan on trying to track their changes.
[^2]: That is, for backends. It's up to the backend whether they want to let the user choose what mechanism to use.
[^3]: In particular, what doesn't work.
Thinking some more about this, I think the important point here is to define what people expect to get from an editable install in the first place. Otherwise the scope of this project is at best ill-defined, and I will get continually asked to decide whether features are in scope. I'll open up a new issue for discussion on this question, but for now the basic principles I'm working from are as follows:
setuptools develop
mechanism, which simply put the project directory on sys.path
using a .pth
file. So that should be seen as the baseline functionality here.Also, this project will only ever provide a mechanism. Backends can (and may need to) supplement that mechanism with implementation details of their own (for example, building a link farm if they want to use that approach). There's certainly an argument for a common library that implements full editable installs, but this isn't that library - I'm only interested in the import hook side of the problem.
Indeed I understand the motivation to keep it simple and self-contained, it's just that it would be nice to have a common project so we don't have to re-invent the wheel. Recently I have tried to implement the approach from mesonpy and I find that there are quite a lot of nuances that one has to tackle in order to make an editable install be functional (beyond piggybacking onto spec_from_file_location
), and it would be nice to have these nuances figured out in a common place.
Not all project layouts and workflows need to be supported.
Indeed, for purely python based projects these are unnecessary. But at the same time, wouldn't the built-in implementation also be sufficient? I am a bit confused as to what this project does on top of what's already available using .pth
redirection if it's not a more fine-grained control of the redirections, which subpackages seem to be part of that.
But indeed this feature is quire complex because you have to deal with the differences of module
, package
, extension
and namespace
, which is non-trivial and sparsely documented, and even the implementations out there seem to fail in edge-cases. I have something half-functional in https://github.com/scikit-build/scikit-build-core/pull/399, which should be equivalent to mesonpy's approach, but I can't figure out the edge-cases of it.
it's just that it would be nice to have a common project so we don't have to re-invent the wheel
Agreed. As long as it's a wheel that is needed 🙂
But at the same time, wouldn't the built-in implementation also be sufficient? I am a bit confused as to what this project does on top of what's already available using .pth redirection if it's not a more fine-grained control of the redirections, which subpackages seem to be part of that.
IMO, yes, the .pth
approach is almost always sufficient, especially now that the src
layout is the recommended approach for project structure. This project grew out of the demand for a prototype implementation of an "import hook" based approach that avoided the problem of import setup
picking up setup.py
in flat project layouts. The alternative approach had limitations of its own that I view as acceptable trade-offs for something that isn't needed for a src
layout.
As I said, I'm happy to look at extending the import hook machinery, but I want to keep it focused on genuine use cases, and part of that is insisting that proposals clarify precisely why the "src
layout plus .pth
file" approach is insufficient.
You've made the case here for needing a solution when you want to map what's in the src
directory to a subpackage of a namespace in the target interpreter (if I've understood your use case correctly), so as I said, a PR for that would be welcome. But that's a specific feature based on an actual use case, not a general principle that "the import hook method should be the norm". And in fact, it's possible that remapping might not even need an import hook. You should consider whether installing the namespace with a __path__
attribute that points to the src
directory might work just as well, with no import hook needed (I've not tried it, so that's just speculation - if I get some spare time, I might do some experiments).
As a side note, one significant disadvantage of import hooks is that because they work at runtime, they do not interact at all well with static type checkers and tools like IDE tooltips. This is a known issue, and it's been raised with the typing community, who don't have a solution other than standardising some form of (as yet unspecified) static representation of the mapping the hook implements.
Essentially, a user wants to ship src/pkg as namespace/plugins/pkg which is fine except for editable installations
I just did some testing. You can do this by installing a single namespace/plugins/__init__.py
file containing the single line
__path__ = ["/path/to/directory/src"]
I've lightly tested this and it seems to work. I could easily add this to editables
as a new function. It would not require an import hook to function, just a single file added to the list that need including in the editable wheel.
Try to run something like:
import importlib.resources
root = importlib.resources.files("my_pkg")
resource_root = root.join_path("resources")
for file in resource_root.iterdir():
print(f"{file.name}")
(Assuming the structure is src/my_pkg => my_pkg
, resources => my_pkg.resources
)
In this context resources is something shared between python and other system, e.g. systemd
files, html files, data files, etc.
Isn't that a bug in importlib.resources
? After all, __path__
is part of the core import machinery, so if importlib.resources
doesn't support it, that's their problem, surely?
Well, no because path is used for the module, which is all fine, but after you call files()
you get a Path
object which can only know the information of the local system path. That's the join_path
which on a Path
simply adds to the path, while in this case it need to be provided the information that there are other effective directories symlinked there.
Hang on. You want to map a directory containing no Python code to a subdirectory of a package? Why can't you put the resources
directory under src/my_pkg
in the first place? I don't plan on supporting arbitrary remappings like this - the scope here is exposing python modules and packages in the source as importable modules/packages in the target interpreter. Arbitrarily restructuring the directory layout of the source project is way beyond that. If a backend wants to provide that sort of functionality, it can implement it independently, of course, but it's not something I want to support in this library.
For the original request here, shipping src/pkg
as namespace/plugins/pkg
, the __path__
solution seems fine.
The resources is part of a broader project's files, e.g. a C library where the Python bindings are an optional support. The init.py can still be there in resources thus making it an effective package. The thing is though you cannot navigate through the packages via join_path after you've called files(). You can import each package and navigate within them, but not across them.
It might help to show example directory trees and their locations to illustrate what you're trying to do
In fact, @LecrisUT please provide a standalone example of a project that demonstrates what you want. If you provide a full, buildable example that installs (using a non-editable install) the way you want it to and which shows the issue that you want supported by editable installs, I'll look at whether this is something we can handle.
@ofek I may be mistaken, but I think that the use case @LecrisUT is describing is a little different from the feature you asked for originally on this issue. With that in mind, would the __path__
solution I proposed above be a suitable addition to satisfy your original feature request?
For the resources
directory case, another option is that when doing an editable install, the backend could write a symlink into the package directory pointing to resources
. I won't do that in this project because:
Yes, that would work for me!
One thing to note, when I come to implement this. If you want to graft directory X
into the import system as package foo.bar
, it should not contain an __init__.py
itself (the __init__.py
for foo.bar
is supplied by the editable project implementation, and just sets __path__
).
So the directory X
can contain modules (core.py
, gui.py
etc.) and packages (subdirectories with their own __init__.py
), but cannot have an __init__.py
of its own. Essentially, X needs to be a pure namespace package.
Is that sufficient for you, or is that limitation (which I will document) going to be a problem? I'll probably implement it anyway, as it's simple enough and works without an import hook, which is beneficial, but I don't want to misrepresent it if it won't actually address your use case.
Thinking about this:
Essentially, a user wants to ship
src/pkg
asnamespace/plugins/pkg
It should be fine if:
src
has no __init__.py
(which it shouldn't)pkg
in src
(or you're OK with anything else that is there being visible under namespace.plugins
as well)In that case, project.add_to_subpackage("namespace.plugins", "src")
will do the job (bikeshedding over the method name is OK, if you can think of something better 😉)
That example you gave, what would that particular call return?
('namespace/plugins/__init__.py', '__path__ = ["/absolute/path/to/src"]')
Technically, of course, add_to_subpackage
doesn't return anything, but it adds that tuple to what gets returned by files()
.
Yeah that would work in my case and I think would be simple to start using
Added in #28
I will do a new release in the next day or so. If you have a chance to test the new implementation before then, that would be great (but no pressure, I'm not going to hold up the release if you're busy).
Assuming the structure is
src/my_pkg
=>my_pkg
,resources
=>my_pkg.resources
I have updated the documentation to be explicit that in the call project.map(pkg_name, directory)
, pkg_name
must be a top-level (undotted) name, and directory
must be a Python package (or .py
file). The code already raises an exception if either of these two constraints aren't met, so it's simply a matter of being explicit in the docs.
I've also noted that the files specified by the library must be installed unchanged, and must not clash with any files that the caller writes in addition. This should cover the point that project.add_to_subpackage("namespace.plugins", "src")
cannot be used if the wheel already contains a namespace/plugins/__init__.py
file (which is something I don't support, but can't enforce).
The resources
mapping in the quoted layout fails on three counts:
map
is used)resources
directory isn't a Python package (I'm assuming) (a problem if map
is used)my_pkg
has an __init__.py
file from the first mapping (a problem if add_to_subpackage
is used)I don't have any immediate plans to support that use case. The user can add a symbolic link from src/mypkg/resources
to resources
, or hatch
can do this as part of setting up the editable install, but I consider it as out of scope here.
Closing as #28 is now merged.
Release 0.4 has now been uploaded with this included.
This might be impossible.
Essentially, a user wants to ship
src/pkg
asnamespace/plugins/pkg
which is fine except for editable installations as doingeditable_project.map('namespace.plugins.pkg', 'src/pkg')
produces: