Open paugier opened 3 years ago
Do you have any suggestions for how to do this? The difficulties are as follows:
The .hpy.so file should be loadable by different Python implementations (and in particular the arguments and return value of HPy_MODINIT should be ABI compatible).
import hpy; hpy.install_hy_universal_importer()
It would definitely be great not to have the .py stub files for universal modules.
The current plan is to ask CPython people if they can recommend or add a way to do this.
Is CPython prioritising extension module over .py files a guaranteed behaviour, or an implementation detail that gets exploited and relied on? I get this can be useful, but maybe it could still be not a good thing to do?
One method would be to use a *.pth file, which site executes import statements inside. Then it’ll automatically be loaded.
> Is CPython prioritising extension module over .py files a guaranteed behaviour, or an implementation detail that gets exploited and relied on? I get this can be useful, but maybe it could still be not a good thing to do?

It's the order in which the FileLoader classes are defined in _get_supported_file_loaders. This seems to not have changed in a long time; even Built-in Package Support in Python 1.5 mentions it.
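To make the ordering point concrete, here is a minimal sketch using the public importlib.machinery.FileFinder. The MarkerLoader class and the .fakeext suffix are made up for the demo (no real extension is compiled); the only thing shown is that the order of the loader details decides which file wins when two candidates share a module name.

```python
import importlib.machinery
import os
import tempfile


class MarkerLoader:
    """Hypothetical stand-in for an extension loader; it never runs the file."""

    def __init__(self, fullname, path):
        self.name = fullname
        self.path = path

    def create_module(self, spec):
        return None  # fall back to the default module object

    def exec_module(self, module):
        module.loaded_by = "MarkerLoader"


def resolve(loader_details):
    """Return the file chosen for 'foo' when foo.py and foo.fakeext coexist."""
    with tempfile.TemporaryDirectory() as directory:
        for suffix in (".py", ".fakeext"):
            with open(os.path.join(directory, "foo" + suffix), "w") as handle:
                handle.write("")
        finder = importlib.machinery.FileFinder(directory, *loader_details)
        spec = finder.find_spec("foo")
        return os.path.basename(spec.origin)


source = (importlib.machinery.SourceFileLoader, [".py"])
marker = (MarkerLoader, [".fakeext"])

print(resolve([marker, source]))  # "extension-like" suffix listed first
print(resolve([source, marker]))  # source suffix listed first
```

With the marker loader listed first the finder picks foo.fakeext, and with the source loader listed first it picks foo.py, which mirrors how the list returned by _get_supported_file_loaders drives the priority in CPython.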
> One method would be to use a *.pth file, which site executes import statements inside. Then it’ll automatically be loaded.

A good idea, although there is talk of deprecating import ... statements in .pth files. Putting something in site.py or sitecustomize.py could be another option. Perhaps each hpy.universal implementation could install such a .pth file and then built HPy Universal ABI modules could be loaded directly without needing a stub .py file.
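For readers unfamiliar with the .pth mechanism, a rough illustration follows. The file name demo_hook.pth and the sys._demo_pth_ran attribute are invented for the demo; a real hook would instead import a module that installs an importer. site.addsitedir is used here because it applies the same .pth processing that site-packages directories get at startup.

```python
import os
import site
import sys
import tempfile


def run_pth_demo():
    """Write a throwaway .pth file and let the site module process it."""
    saved_path = list(sys.path)
    with tempfile.TemporaryDirectory() as directory:
        pth_file = os.path.join(directory, "demo_hook.pth")
        with open(pth_file, "w", encoding="utf-8") as handle:
            # site executes lines that start with "import"; a real hook
            # would import a module that registers an importer here.
            handle.write("import sys; sys._demo_pth_ran = True\n")
        site.addsitedir(directory)  # processes every *.pth file in directory
    sys.path[:] = saved_path  # drop the temporary directory again
    return getattr(sys, "_demo_pth_ran", False)


print(run_pth_demo())
```

Note that this relies on exactly the executable import lines whose deprecation is mentioned above, so it should be read as a description of the current behaviour rather than a long-term recommendation.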
> It's the order in which the FileLoader classes are defined in _get_supported_file_loaders. This seems to not have changed in a long time, even Built-in Package Support in Python 1.5 mentions it.

Hmm, interesting. So the 1.5 documentation contains this line:

> Tip: the search order is determined by the list of suffixes returned by the function imp.get_suffixes().

imp is deprecated, and the current documentation says:

> Deprecated since version 3.3: Use the constants defined on importlib.machinery instead.

But importlib.machinery does not have an equivalent constant; the only thing close to it is all_suffixes(). And its value does not match the implemented import logic:

- _get_supported_file_loaders() lists extensions first, then .py and .pyc.
- all_suffixes() lists .py, .pyc, and extensions last.

So it seems like there is no inherited logical ordering to the extensions, and the fact that all_suffixes() went in without anyone wanting it to match the actual import logic suggests to me that the core devs don’t feel the order in which extensions are tried is a specified thing?
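The mismatch described above can be checked with a short sketch. It assumes CPython and reaches into the private importlib._bootstrap_external module, so the helper it calls may change between versions; kind and kinds_in_order are names made up for this example.

```python
import importlib.machinery as machinery
from importlib import _bootstrap_external  # private module, CPython detail


def kind(suffix):
    """Classify a file suffix using the public suffix constants."""
    if suffix in machinery.EXTENSION_SUFFIXES:
        return "extension"
    if suffix in machinery.SOURCE_SUFFIXES:
        return "source"
    if suffix in machinery.BYTECODE_SUFFIXES:
        return "bytecode"
    return "other"


def kinds_in_order(suffixes):
    """Collapse a suffix list to the order in which each kind first appears."""
    seen = []
    for suffix in suffixes:
        if kind(suffix) not in seen:
            seen.append(kind(suffix))
    return seen


# Order actually used by the path-based finder (private helper, may change).
loader_suffixes = [
    suffix
    for _loader, suffixes in _bootstrap_external._get_supported_file_loaders()
    for suffix in suffixes
]

print("import logic:", kinds_in_order(loader_suffixes))
print("all_suffixes:", kinds_in_order(machinery.all_suffixes()))
```

On a typical CPython build the first line puts extensions ahead of source and bytecode, while the second starts with source, which is the discrepancy being pointed out.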
> Is CPython prioritising extension module over .py files a guaranteed behaviour, or an implementation detail that gets exploited and relied on? I get this can be useful, but maybe it could still be not a good thing to do?

@uranusjr, I don't understand your point. Do you mean that HPy doesn't need to care about this because it is not well documented in importlib?
Prioritizing an extension over a .py file has been a consistent behavior since Python 1.5. This is a very simple and reasonable behavior. If someone adds an extension next to a .py file, it's because it's better to use the extension. This priority is used by tools like Cython and Pythran, and I guess internally by other packages. There is no reason to change that and I don't see why it would change in future Python versions. So I guess if HPy can't get this behavior, one can write a PEP to propose this breaking change and propose a nice alternative :-)
A strong point of HPy is that the whole transition seems doable because for most packages without handwritten C code (for example scikit-learn or scikit-image), switching to HPy will mostly mean changing a few lines in setup.py / pyproject.toml. If no nice solution is found for this issue, it won't be like that at all.
I wonder if it could be technically possible to provide a function hpy.make_universal_extensions_importable that can be called early in the init process of the packages so that universal extensions would be importable without stub .py files?
Yeah, I’m more or less trying to imply that a) this probably needs a clarification from CPython core devs, and b) if the behaviour is not intended to be relied on, this could be considered out of scope of HPy (clarification: this is most definitely only a personal opinion without any relation to HPy developers).
There are many ways to achieve the “extension over pure Python” goal without relying on file extension ordering, e.g. have an optional _mylib_speedup and wrap from _mylib_speedup import * in a try/except. This would definitely mean additional work for the transition, but adopting HPy already requires modifying your build scripts, so I’d argue that may not be a bad thing if this is not something to be relied on.
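A minimal sketch of that fallback pattern, written as a helper so that it runs anywhere. The load_backend name is made up for this example, and json only stands in for a hypothetical pure-Python implementation module; _mylib_speedup is assumed not to be installed.

```python
import importlib


def load_backend(fast_name, fallback_name):
    """Import the accelerated module if it is available, else the fallback."""
    try:
        return importlib.import_module(fast_name)
    except ImportError:
        return importlib.import_module(fallback_name)


# "_mylib_speedup" is not expected to exist here, so the fallback is used;
# "json" only stands in for a hypothetical pure-Python "_mylib_pure".
backend = load_backend("_mylib_speedup", "json")
print(backend.__name__)
```

The trade-off is that the accelerated module needs a different name from the pure-Python one, which is the extra transition work mentioned above.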
Note that if CPython says the behaviour is promised to be reliable, none of the above applies, and HPy should definitely support the use case (and CPython should probably fix importlib.machinery.all_suffixes()). My main point is someone should get that clarification first before trying to solve the problem.
> One method would be to use a *.pth file, which site executes import statements inside. Then it’ll automatically be loaded.

this is an interesting idea. The biggest pro is that by having a proper import hook, we can enable complex behavior (e.g. loading specific modules in debug mode depending on the value of an env variable and/or a config file).
The biggest cons are:
The cleanest solution would probably be a custom FileLoader class, right? How (un)likely is it to get that kind of support from the CPython developers?
thank you for asking this @paugier! I look forward to seeing what the official answer is. I suppose we can safely assume that the current behavior will never change, but it's better to have some official clarification.
As for this issue, I think that the only reasonable solution is to write a custom importer as @cklein suggests, and install it using the "pth hack", until we find a better solution and/or CPython provides a cleaner way to do it.
As usual, if anyone feels like working on that, contributions are welcome :)
For editable installs, meson-python does something which could be used for HPy:
- A Python file and a .pth file are added to the wheel: https://github.com/mesonbuild/meson-python/blob/041ed596e0c43ebb6076c38a0b59aa2e71e687b5/mesonpy/__init__.py#L504
- The .pth file just imports the .py file.
- The Python file is a copy of mesonpy/_editable.py + one call.
- This call adds to sys.meta_path a meta path finder which changes how extensions are imported.
HPy could also provide a Python API to ensure that its own MetaPathFinder is added.
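As a rough sketch of what such a finder could look like (this is not meson-python's or HPy's actual code): UniversalSuffixFinder, install and loader_factory are hypothetical names, and the step that really loads an HPy universal module is deliberately left to the loader_factory callable, since that API is not described in this thread.

```python
import importlib.abc
import importlib.util
import os
import sys


class UniversalSuffixFinder(importlib.abc.MetaPathFinder):
    """Find modules stored as <name><suffix>, e.g. foo.hpy.so (hypothetical)."""

    def __init__(self, suffix, loader_factory):
        self.suffix = suffix
        # loader_factory(fullname, path) must return an importlib loader;
        # how it loads the file is an assumption left to the caller.
        self.loader_factory = loader_factory

    def find_spec(self, fullname, path=None, target=None):
        tail = fullname.rpartition(".")[2]
        search_path = sys.path if path is None else path
        for entry in search_path:
            if not isinstance(entry, str):
                continue
            candidate = os.path.join(entry or os.getcwd(), tail + self.suffix)
            if os.path.isfile(candidate):
                loader = self.loader_factory(fullname, candidate)
                return importlib.util.spec_from_file_location(
                    fullname, candidate, loader=loader
                )
        return None  # let the next finder on sys.meta_path try


def install(finder):
    """Put the finder first so it wins over a .py file of the same name."""
    if finder not in sys.meta_path:
        sys.meta_path.insert(0, finder)
```

Because the finder is inserted at the front of sys.meta_path, a matching foo.hpy.so would be chosen before foo.py, which is the priority asked for in this issue. A .pth file or a package's __init__.py could then call install(...) once.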
Some Python compilers (for example Cython in pure Python mode and Pythran) rely on the priority given to an extension when importing a module. When importing foo, if foo.py and foo.so exist, foo.so is imported.

With some Python compilers, the developer writes Python code in a file foo.py and calls a Python compiler to generate an extension foo.so. Under the hood, a .c or .cpp file is created and compiled. In Python, importing foo uses the extension because Python prioritizes the extension.

This mechanism seems to be incompatible with the way HPy works for now (adding a Python file with the same name as the extension). Therefore, it would be better to find another way to make universal extensions importable. In particular, the standard behavior of prioritizing a universal extension over a Python file should be obtained.
I realize that this is not a high priority problem for HPy! But I think it is worth mentioning the issue.