Closed jherland closed 7 months ago
Hmm. I agree - that's a little annoying that the .pyc file is showing up in the RECORD file, but that sounds like a bug in the builder. If a project legitimately wishes to install a __pycache__
into site-packages, it will be importable, so it probably should show up as a top-level module installed by that distribution.
draft $ mkdir __pycache__
draft $ py -q
>>> import __pycache__
>>> __pycache__.__path__
_NamespacePath(['/Users/jaraco/draft/__pycache__', '/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/__pycache__', '/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/site-packages/__pycache__'])
You may be right that it makes sense to explicitly exclude __pycache__
, as it's commonly going to exist in sys.path
even when it wasn't installed by a package. But as long as it's explicitly mentioned in the RECORD, I'm tempted to say that importlib metadata should honor that record.
Is there a legitimate reason for the __pycache__
entry to be present in RECORD?
Since the directory does appear to be importable and since the package is explicitly reporting it as a top-level directory installed by the package, I'm going to say importlib_metadata is doing the right thing here and should be reflecting the presence of that top level name. I'd like to avoid special cases where possible.
Feel free to advocate for making a case for excluding that name. In particular, I'd like to know how the presence of this name causes harm (beyond a little surprise).
Your argument makes sense to me from the POV of importlib_metadata. I agree that it rather sounds like a bug in the builder.
The context I'm coming from is doing dependency checking where we need to be able to map from an import name we find in the user's code to a package name to be found in the user's declared dependencies (e.g. requirements.txt
, pyproject.toml
or similar). In this case, a package reporting __pycache__
as an "extra" import name is probably not going to create much of a problem for us, as we don't expect the user to import __pycache__
in their code. If this does end up becoming problematic, we can easily enough special-case it in our use of importlib_metadata.
Thanks for your consideration!
I doubt this has anything to do with the builder - pip generates pyc
files on install as per the wheel spec, which it appends to the RECORD
so that it will know to uninstall them. Because typing_extensions
is not a package but a single-file module, its __pycache__
is at the top level, which is why it is being returned by packages_distributions
.
Related to #442 (metadata directories appearing in packages_distributions).
Some packages include pre-compiled
.pyc
files inside__pycache__
directories. When notop_level.txt
is present, theRECORD
file will be used to infer top-level import names, and for some packages it will contain entries like this (example taken fromlib/python3.10/site-packages/typing_extensions-4.4.0.dist-info/RECORD
in my Poetry virtualenv):This entry is included in
files()
and is then further interpreted by_top_level_inferred()
as a__pycache__
import name being provided by this distribution/package.The code probably needs an exception for
__pycache__
directories. Maybe this is simple enough?