python / importlib_metadata

Library to access metadata for Python packages
https://importlib-metadata.readthedocs.io
Apache License 2.0

.metadata is an empty message even for missing metadata #493

Open jaraco opened 3 months ago

jaraco commented 3 months ago

In https://github.com/python/importlib_metadata/issues/489#issuecomment-2195177710, I learned that Distribution.metadata will return an empty PackageMetadata even when there is no metadata file present.

Although not part of the recommended operating practice, it seems to be an occasional occurrence for an uninstalled package to leave a lingering metadata directory (*.dist-info) with nothing in it. Instead, I would expect dist.metadata to be None when no metadata file is present, and to return an empty PackageMetadata object when a metadata file is present but empty.
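A minimal sketch of the expected behavior described above, assuming a hypothetical helper named `load_metadata` that reads a `METADATA` file directly from a `.dist-info` directory. It is not the library's implementation; it only illustrates the distinction between "no file" and "empty file":

```python
import email
from email.message import Message
from pathlib import Path
from typing import Optional


def load_metadata(dist_info: Path) -> Optional[Message]:
    """Hypothetical helper: None if METADATA is absent, a message otherwise."""
    metadata_file = dist_info / 'METADATA'
    if not metadata_file.is_file():
        # No metadata file at all, e.g. a lingering empty *.dist-info directory.
        return None
    # A present-but-empty file parses to a message with no headers.
    return email.message_from_string(metadata_file.read_text(encoding='utf-8'))
```

With this shape, a caller can tell a missing file (`None`) apart from an empty one (a message whose `len()` is 0).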

It seems this expectation was previously missed because (a) this case hasn't previously been tested, (b) the typespec indicates that email.message_from_string accepts only a str, and (c) the implementation suppresses the type error (introduced in d3908983078a47acecf430115b2bcc2d104f54a6).

The underlying reason email.message_from_string accepts None is that io.StringIO accepts None, and io.StringIO(None) is indistinguishable from io.StringIO('') for reading.
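The following standard-library snippet illustrates that point. The variable names are only for illustration, and a type checker will flag the `None` argument even though it works at runtime:

```python
import email
import io

# io.StringIO(None) reads the same as io.StringIO('').
text_from_none = io.StringIO(None).read()
text_from_empty = io.StringIO('').read()

# Passing None parses as an empty message rather than raising at runtime.
message_from_none = email.message_from_string(None)  # type: ignore[arg-type]
header_count = len(message_from_none)
```

Here `text_from_none` and `text_from_empty` are both the empty string, and `header_count` is 0, so the result cannot be told apart from that of parsing an empty file.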

jaraco commented 3 months ago

In 597776e8e4, I started exploring an optional None return type from .metadata(), but that does violate other interfaces (just internally):

_________________________________________________________ importlib_metadata/__init__.py _________________________________________________________
465: error: Value of type "PackageMetadata | None" is not indexable  [index]
475: error: Value of type "PackageMetadata | None" is not indexable  [index]
972: error: Incompatible return value type (got "PackageMetadata | None", expected "PackageMetadata")  [return-value]
1041: error: Value of type "PackageMetadata | None" is not indexable  [index]
_______________________________________________________ importlib_metadata/compat/py39.py ________________________________________________________
23: error: Value of type "PackageMetadata | None" is not indexable  [index]
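
The "not indexable" errors come from callers that index the result directly. A sketch of the narrowing such callers would need if the return type became optional, using a hypothetical `name_of` function and a plain `email.message.Message` as a stand-in for PackageMetadata:

```python
from email.message import Message
from typing import Optional


def name_of(metadata: Optional[Message]) -> Optional[str]:
    """Hypothetical caller that tolerates missing metadata."""
    if metadata is None:
        # Without this guard, a type checker reports the value as not indexable.
        return None
    return metadata.get('Name')
```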

drfho commented 2 months ago

Dear @jaraco, I observed a systematic KeyError when starting the application server Zope (https://github.com/zopefoundation/Zope/). Some packages (importlib_metadata etc.) may not provide a metadata name, and thus the Zope server cannot start anymore.

https://github.com/python/importlib_metadata/blob/f3901686abc47853523f3b211873fc2b9e0c5ab5/importlib_metadata/__init__.py#L1036

To avoid the error, the critical line can be wrapped in a try block:

            try:
                pkg_to_dist[pkg].append(dist.metadata['Name'])
            except KeyError:
                # Skip distributions whose metadata has no Name field.
                pass

Alternatively, a default value will prevent it:

            pkg_to_dist[pkg].append(dist.metadata.get('Name', 'unknown'))

I suppose you have a better idea about that?

Thank you very much and best regards f

jaraco commented 2 months ago

@drfho, see #371 for the issue that describes why dist.metadata[missing] was changed from returning None to raising a KeyError (mainly for consistency with a typical Mapping). This change is unrelated to the issue at hand, so I'll be marking these two comments as off-topic. This issue is about whether dist.metadata should return None when there is no METADATA file in the .dist-info directory.

Your two ideas on how to approach the issue in Zope are both valid and it depends on your use-case. Note that on importlib_metadata<8 and importlib.metadata on Python 3.13 and earlier, dist.metadata['Name'] could still be None, but dist.metadata.get('name', 'unknown') will work consistently on all versions. I don't really have any better ideas.
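
As a small illustration of the default-value approach, here is a sketch that uses a plain `email.message.Message` as a stand-in for dist.metadata (an assumption made to keep the example self-contained; the real object is a PackageMetadata):

```python
from email.message import Message

# Stand-in for metadata that lacks a Name field.
metadata = Message()
fallback_name = metadata.get('name', 'unknown')

# Header lookup is case-insensitive, so 'name' also finds a 'Name' field.
metadata['Name'] = 'example'
found_name = metadata.get('name', 'unknown')
```

`fallback_name` is 'unknown' and `found_name` is 'example', with no dependence on whether indexing a missing key returns None or raises.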

If you have more questions regarding the behavior of .__getitem__ (i.e. dist.metadata[name]), please follow up in #371 or open a separate issue. Thanks.