python / importlib_metadata

Library to access metadata for Python packages
https://importlib-metadata.readthedocs.io
Apache License 2.0
123 stars 80 forks source link

version 6.0.0 change of making Distribution an abstract class breaks some projects #422

Closed Psycojoker closed 1 year ago

Psycojoker commented 1 year ago

Hello,

As stated in the changelog:

419: Declared Distribution as an abstract class, enforcing definition of abstract methods in instantiated subclasses. It's no longer possible to instantiate a Distribution or any subclasses unless they define the abstract methods.

Please comment in the issue if this change breaks any projects. This change will likely be rolled back if it causes significant disruption.

This change indeed breaks at least 2 projects (logilab.common and pytest-capture-deprecated-warnings). Both those projects are using importlib_metadata this way:

{x.metadata["Name"].lower(): x.metadata["Version"] for x in importlib_metadata.Distribution().discover()}

(this might not be the best way to use this lib)

While not having to use the 2 required methods:

TypeError: Can't instantiate abstract class Distribution with abstract methods locate_file, read_text

Those projects aren't very big in the python ecosystem and I'm the process of fixing them (by creating a subclass with the required methods) but since the changelog is asking for a report here it is.

Cheers and thx for your work :heart:

brechtm commented 1 year ago

This change also breaks rinohtype: brechtm/rinohtype#389. Things seems to work fine with the stdlib's importlib.metadata.

jaraco commented 1 year ago

I somehow managed to miss this report until just now. I'll take a look at this issue soon.

jaraco commented 1 year ago

Thanks both for your understanding while we try to make the implementation more correct.

This change was requested to be added to CPython as well, to land in Python 3.12, and I need to decide how dangerous this change is.

The options as I see it are:

Incorporate

Since the projects that were impacted have addressed the compatibility, I'm tempted to say it should be safe to introduce the incompatibility into the stdlib as well. That will have the effect of making these libraries (rinohtype, logilab, pytest-capture-deprecated-warnings) non-viable for newer Pythons and older releases of the libraries.

Deprecation Period

The second option is a lot of work with very little benefit (now that affected projects have adapted).

Roll back

This third option restores the prior expectation and re-opens the opportunity for improper behavior, leaving the code in a somewhat inconsistent state.

Drop ABC

The fourth option is somewhat attractive, given that such behavior was apparently desirable in at least three instances. Since the spec for read_text is to return None when a file is not found, that fits well. I'm less confident about locate_file, which unconditionally returns a Path object in the PathDistribution case. Probably it should raise an error if a path cannot be supplied, similar to what rinoh does.

jaraco commented 1 year ago

I'm the process of fixing them (by creating a subclass with the required methods)

I tried to look into these projects to see their source, but I couldn't find either of them. I did find logilab-common on PyPI, but the links to the project sources are broken.

I see now that the issue is that one shouldn't be constructing Distribution() objects at all. You can, however, use Distribution.discover() (the classmethod). That would be the preferred way to work around the issue.

jaraco commented 1 year ago

@brechtm I appreciate the toil this change has caused by not rolling back the change more promptly, so I'm sensitive to asking you for more input, but I'd really like your opinion - what would you like to see from this package before applying the change to Python 3.12? I'm leaning toward the first or last options. If we chose the first option, rinoh would need to continue to implement the not-implemented methods. If we go with the last option, rinoh would at some point be able to drop those concrete not-implemented methods.

jaraco commented 1 year ago

After mulling it overnight, I'm now feeling stronger toward the Drop ABC option, because there is at least one user (rinoh) that expects not to have to implement those methods.

jaraco commented 1 year ago

After drafting the implementation in #448, I'm less confident in that option. I'm continuing to waffle between that and just pushing the existing behavior upstream to stdlib.

I guess the real question is a strategic one - should Distribution be lenient to a partial implementation, or should it require subclasses to explicitly acknowledge their partial implementation by supplying NotImplemented methods?

When I phrase it that way, I do think it's better to structure it such that it's more constrained by default, because most subclasses are expected to need to implement these methods.

Let's try incorporating the 6.0 changes into Python 3.12 and see if that reveals any additional concerns or if we can keep the stricter implementation.

brechtm commented 1 year ago

Thanks for asking for input! I don't have a strong opinion about this specific issue since this is handled in rinohtype now (but not yet in the latest release, I think). Assuming you're using symantic versioning, I can't really complain about 6.0.0 being backwards-incompatible, either.

In general, I am a bit worried about maintaining compatibility with the different versions of importlib.metadata in the Python versions I want to support (all that haven't reached EOL) and all of the versions of importlib-metadata. I want to avoid setting version constraints as that might cause conflicts with other packages that do (e.g. Sphinx). To help with this, it would be good if future versions of importlib-metadata maintain backward compatibility as much as possible. In this case, that means making Distribution allow a partial implementation.

jaraco commented 1 year ago

In https://github.com/python/cpython/pull/103584#issuecomment-1511502410, I see that pip is also affected, indicating that option 1 may not be viable, so now I'm leaning toward option 2.

jaraco commented 1 year ago

In #451, I've drafted an implementation of option 2.