python / importlib_metadata

Library to access metadata for Python packages
https://importlib-metadata.readthedocs.io
Apache License 2.0
128 stars 82 forks source link

AttributeError changed to AssertionError for invalid identifiers #488

Open hroncok opened 6 months ago

hroncok commented 6 months ago

Since https://github.com/python/importlib_metadata/pull/449 trying to use missing extras raises AssertionErrors. I believe that invalid inputs should never raise AssertionErrors (neither should asserts be used to validate user input).

Observe on Python 3.12 (before this change):

>>> ep = importlib.metadata.EntryPoint(name='foo', value='invalid-identifier:foo', group='foo')
>>> ep.extras
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python3.12/importlib/metadata/__init__.py", line 222, in extras
    return re.findall(r'\w+', match.group('extras') or '')
                              ^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'group'

Now this exception message is bogus but at least this is an AttributeError (hence code reading this might assume the .extras attribute is missing).

However, on Python 3.13:

>>> ep = importlib.metadata.EntryPoint(name='foo', value='invalid-identifier:foo', group='foo')
>>> ep.extras
Traceback (most recent call last):
  File "<python-input-2>", line 1, in <module>
    ep.extras
  File "/usr/lib64/python3.13/importlib/metadata/__init__.py", line 198, in extras
    assert match is not None
           ^^^^^^^^^^^^^^^^^
AssertionError

Existing code (e.g. setuptools) assumes AttributeError in this case:

See https://github.com/pypa/setuptools/blob/544b332bd78d8d274597923e89b9bd7839f8a0f4/setuptools/_entry_points.py#L18-L25

I believe both behaviors are wrong, but the new one is "more wrong".

May I suggest that either .extras or .__init__ raises a specific exception when this happens?

jaraco commented 5 months ago

Thanks for the report. Without looking at the code, I agree the API should provide a documented expectation for erroneous inputs, especially if callers are relying on that interface to perform validation (as seems to be the case).

I suspect what was going on here is the code is basically expecting valid input and providing some best-effort checks to validate the input, but not expecting the exceptional case to be encountered except under broken callers.