python / importlib_metadata

Library to access metadata for Python packages
https://importlib-metadata.readthedocs.io
Apache License 2.0
122 stars 79 forks source link

Ensure stale empty python module directories don't break the build #489

Closed kanavin closed 3 days ago

kanavin commented 1 month ago

The Yocto project has been seeing mysterious build failures that were traced down to stale empty python module directories left behind by previous module versions: https://bugzilla.yoctoproject.org/show_bug.cgi?id=14816

We are carrying the following patch to address the issue (it both ignores the empty directories, and looks at directories in deterministic sorted order):

--- a/Lib/importlib/metadata/__init__.py
+++ b/Lib/importlib/metadata/__init__.py
@@ -710,7 +710,14 @@ class Lookup:
         self.infos = FreezableDefaultDict(list)
         self.eggs = FreezableDefaultDict(list)

-        for child in path.children():
+        for child in sorted(path.children()):
+            childpath = pathlib.Path(path.root, child)
+            try:
+                if childpath.is_dir() and not any(childpath.iterdir()):
+                    # Empty directories aren't interesting
+                    continue
+            except PermissionError:
+                continue
             low = child.lower()
             if low.endswith((".dist-info", ".egg-info")):
                 # rpartition is faster than splitext and suitable for this purpose.

We'd like to discuss with upstream if such a fix is appropriate and will be taken as a proper pull request.

This is a cross-post from https://github.com/python/cpython/issues/120492

FFY00 commented 1 month ago

Thanks for reaching out! Do you have any simple reproducible that triggers the issue?

In your use-case, are .dist-info or .egg-info directories left behind? If so, I think we could add some code to validate the metadata directories, but it also points out to another issue — why aren't these directories being cleaned up?

kanavin commented 1 month ago

I would need a bit of time to confirm a reproducible on my side, but I believe it happens when files from the previous module version are removed, but not the directory tree they were in. This is a peculiarity (or shortcoming if you will) of bitbake build system used in yocto which tracks files but not directories.

jaraco commented 4 weeks ago

The proposed patch makes four changes:

That feels suspiciously like it's addressing four different issues. If we're going to tackle this issue, we'll want to distill the motivations for each of those changes and determine if they're appropriate and desirable in general. Before we embark on that, let's look at the issue in the title.

Addressing the core issue (stale empty python metadata directories), that's been discussed already in #457, except there they're proposing to raise an error and not suppress the undesirable state.

In that issue, I explain that importlib_metadata attempts mainly to provide a low-level interface for the metadata that is present, meaning that if there is an empty metadata directory, importlib_metadata wants to present that as a package with no metadata, so that you as a consumer can inspect all of the packages that are seemingly present (even if they're somewhat or mostly corrupt). For that reason, I don't want to simply silently ignore this situation.

This project does expose a diagnose command to help users identify a corrupt environment py -m importlib.metadata.diagnose, though that command could use some love.

I'm trying to understand better what "break the build" means. What operations are failing and how?

Attempting to simulate the "empty dir" experience, I did the following:

 draft @ mkdir foo.dist-info
 draft @ py -q
>>> import importlib.metadata
>>> dist, = importlib.metadata.distributions(path=['.'])
>>> dist.metadata
<importlib.metadata._adapters.Message object at 0x2f7e4585a10>
>>> dist.metadata.keys()
[]
>>> dist.metadata.get('Version')
>>> dist.name
>>> bool(dist.metadata)
False

Oh! That last bit is interesting. Since there's no METADATA file, it seems one could readily use that as a signal to ignore that dist. e.g.

valid_dists = filter(operator.attrgetter('metadata'), importlib.metadata.distributions())

And since entry_points already does some filtering (for unique dists), it maybe should apply such filtering.

I should also note that this project is highly performance sensitive, so anything that adds extra stat calls during discovery is likely to be a deal-breaker unless carefully designed (but we can worry about that later after figuring out a good strategy for behavior).

I'm slightly surprised that dist.metadata returns an empty PackageMetadata and not a None value, given that the metadata file is in fact not present. I'd expect an empty PackageMetadata if an empty METADATA file was present.

Aah. It seems that email.message_from_string(None) returns an empty Message object. That's unfortunate, given that the typespec indicates a str is required. I'll file a separate issue for this concern.

Tell me more about the breakage you observe and how your patch addresses it, and we can strategize on what to do next.

kanavin commented 4 weeks ago

Tell me more about the breakage you observe and how your patch addresses it, and we can strategize on what to do next.

The failures we were getting with empty directories are shown here at the top: https://bugzilla.yoctoproject.org/show_bug.cgi?id=14816

Do they make sense to you? Can you trace to where the trouble starts?

jaraco commented 4 weeks ago

Do they make sense to you? Can you trace to where the trouble starts?

Yes, that helps. I know you'd referenced before and I was hoping you'd have already distilled the issue to the proximate cause in this package. I'm happy to take a look.

In there, I can see that setuptools is using nspektr and importlib_metadata to validate that the declared dependencies are installed. As it works through the dependency tree, it encounters the invalid distribution and fails to validate the version when it is None.

nspektr is calling importlib.metadata.distribution(name) where name is some packaging indicated by an entry point's extras.

So the cause is rooted in the way that Distribution.from_name returns the first dist it finds, ignoring subsequent ones that might be preferable.

Perhaps that routine should be updated to filter on only valid dists or maybe give preference to valid dists.

I also note that this use of nspektr was removed in https://github.com/pypa/setuptools/pull/3421 (Setuptools 63), so the bug may no longer be impacting Setuptools. Is it possible that the issue no longer affects the Yocto project because those older versions of Setuptools are no longer needed?

kanavin commented 4 weeks ago

I also note that this use of nspektr was removed in pypa/setuptools#3421 (Setuptools 63), so the bug may no longer be impacting Setuptools. Is it possible that the issue no longer affects the Yocto project because those older versions of Setuptools are no longer needed?

That seems to be the case. I went to older yocto and the issue was easily triggered there, but it doesn't seem to be happening in recent yocto (I created lots of empty dist-info dirs in site-packages/ with various versions to be sure).

So we'll drop the patch and see if this uncovers any other issues. In all likelihood it won't, but I'll report if something happens.

rossburton commented 4 days ago

I just discovered some build breakage in my Yocto that I bisected to this patch being removed from Yocto. The failing package is meson-python and it fails like this:

| DEBUG: Executing shell function do_compile
| * Getting build dependencies for wheel...
|
| Traceback (most recent call last):
|   File "/work/ross/build/tmp/work/aarch64-linux/python3-meson-python-native/0.16.0/recipe-sysroot-native/usr/lib/python3.12/site-packages/build/__main__.py", line 178, in _handle_build_error
|     yield
|   File "/work/ross/build/tmp/work/aarch64-linux/python3-meson-python-native/0.16.0/recipe-sysroot-native/usr/lib/python3.12/site-packages/build/__main__.py", line 429, in main
|     built = build_call(
|             ^^^^^^^^^^^
|   File "/work/ross/build/tmp/work/aarch64-linux/python3-meson-python-native/0.16.0/recipe-sysroot-native/usr/lib/python3.12/site-packages/build/__main__.py", line 238, in build_package
|     out = _build(isolation, srcdir, outdir, distribution, config_settings, skip_dependency_check, installer)
|           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|   File "/work/ross/build/tmp/work/aarch64-linux/python3-meson-python-native/0.16.0/recipe-sysroot-native/usr/lib/python3.12/site-packages/build/__main__.py", line 172, in _build
|     return _build_in_current_env(srcdir, outdir, distribution, config_settings, skip_dependency_check)
|            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|   File "/work/ross/build/tmp/work/aarch64-linux/python3-meson-python-native/0.16.0/recipe-sysroot-native/usr/lib/python3.12/site-packages/build/__main__.py", line 151, in _build_in_current_env
|     missing = builder.check_dependencies(distribution, config_settings or {})
|               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|   File "/work/ross/build/tmp/work/aarch64-linux/python3-meson-python-native/0.16.0/recipe-sysroot-native/usr/lib/python3.12/site-packages/build/_builder.py", line 235, in check_dependencies
|     return {u for d in dependencies for u in check_dependency(d)}
|            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|   File "/work/ross/build/tmp/work/aarch64-linux/python3-meson-python-native/0.16.0/recipe-sysroot-native/usr/lib/python3.12/site-packages/build/_util.py", line 53, in check_dependency
|     if req.specifier and not req.specifier.contains(dist.version, prereleases=True):
|                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|   File "/work/ross/build/tmp/work/aarch64-linux/python3-meson-python-native/0.16.0/recipe-sysroot-native/usr/lib/python3.12/site-packages/packaging/specifiers.py", line 906, in contains
|     item = Version(item)
|            ^^^^^^^^^^^^^
|   File "/work/ross/build/tmp/work/aarch64-linux/python3-meson-python-native/0.16.0/recipe-sysroot-native/usr/lib/python3.12/site-packages/packaging/version.py", line 200, in __init__
|     match = self._regex.search(version)
|             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
| TypeError: expected string or bytes-like object, got 'NoneType'
|
| ERROR expected string or bytes-like object, got 'NoneType'
rossburton commented 4 days ago

I added some print statements to identify what dependency it was probing:

/work/ross/build/tmp/work/aarch64-linux/python3-meson-python-native/0.16.0/recipe-sysroot-native/usr/lib/python3.12/site-packages/meson-1.5.0.dist-info

And what do we see in the build tree:

$ ls meson*
meson-1.4.0.dist-info:
COPYING  entry_points.txt  METADATA  RECORD  top_level.txt  WHEEL

meson-1.5.0.dist-info:

A stale empty meson-1.5.0.dist-info directory. In this case, it's the build package which is breaking.

kanavin commented 3 days ago

Yep, this is different and not involving setuptools. So we need to undo the patch removal, unless @jaraco can suggest a better option? Can you look at Ross's traceback please?

jaraco commented 3 days ago

Perhaps that routine should be updated to filter on only valid dists or maybe give preference to valid dists.

I'm now thinking for Distribution.from_name should give preference to valid dists. I'll implement that change.

jaraco commented 3 days ago

The latest version of importlib_metadata (8.1.0) implements the new behavior (preferring non-empty distributions). Please share feedback. Especially, let me know if it doesn't address the issue, if it introduces unexpected regressions, or if you could benefit from a backport to 7.x.

kanavin commented 2 days ago

The latest version of importlib_metadata (8.1.0) implements the new behavior (preferring non-empty distributions). Please share feedback. Especially, let me know if it doesn't address the issue, if it introduces unexpected regressions, or if you could benefit from a backport to 7.x.

I think the issues we're seeing are coming from the copy in cpython source tree: https://github.com/python/cpython/tree/main/Lib/importlib/metadata

I'm not sure how is that synced up with this repo, but we'd appreciate the fix showing up there one way or another in both cpython master and 3.12 branches.

jaraco commented 2 days ago

how is that synced up with this repo

I will periodically apply changes from here into CPython.

Currently, the changes to importlib_metadata are slated to go into CPython 3.14 and later. Of course bugfixes can be backported, but I wasn't thinking of this change as a bugfix as much as a new feature. It changes the designed behavior from finding the first (and presumed only) dist in the filesystem to preferring ones with metadata. It's adding new behavior that adds support for previously unsupported (corrupt) environments. That really doesn't feel like a bugfix.

One of the reasons I maintain the backport is to make changes like this available from the future, so users that require this behavior can adopt it now (for Python 3.8+) and then move to importlib.metadata where available.

I know how icky it is to have to carry a dependency and version-switched imports, but I also want to honor the regime we've created.

In light of that, would you still make the case that this is a bugfix?

kanavin commented 2 days ago

In light of that, would you still make the case that this is a bugfix?

Not necessarily. I'm just trying to understand how we can test the fix, if it won't show up in either 3.12 or 3.13. We can of course not test it now, and just carry the same patch all the way to 3.14.

jaraco commented 1 day ago

If you want to test the fix earlier, you'll need to adopt importlib_metadata (install the dependency and then import importlib_metadata in place of import importlib.metadata). You're also welcome to leave the patch and await Python 3.14.

kanavin commented 1 day ago

I think we'll wait. We integrate a complete system, so patching all components to import a separate dependency instead of core python library isn't an option.