Closed carljm closed 1 month ago
The module names are listed at python/cpython@
main
/Modules/config.c.in#L35 -- currently they aremarshal
,_imp
,_ast
,_tokenize
,builtins
,sys
,gc
,_warnings
,_string
.
This isn't a complete list of the modules that are special-cased by the interpreter; I think there must be additional special-casing done elsewhere. A more complete list can be obtained using the sys.builtin_module_names
constant. Experimenting locally confirms that there are names included in that list which are not included in your list and are special-cased in the same way when it comes to module resolution.
We should add special-casing for these modules by writing a Python script to generate a Rust function similar to the one in https://github.com/astral-sh/ruff/blob/main/crates/ruff_python_stdlib/src/sys.rs, which can be queried to determine which modules are considered builtin on any given Python version.
It might be worth double-checking the list of paths with the paths that Pyright uses.
It might be worth double-checking the list of paths with the paths that Pyright uses.
From what I can tell, I'm not sure pyright ever respects first-party modules overriding stdlib modules. I can't find any relevant list of paths in the pyright source code; and from local experimentation, pyright appears to resolve random
to the stdlib module even if it's locally overridden (despite emitting a warning telling you that the local random.py
file is shadowing a stdlib module).
I have high confidence that the list of modules given by sys.builtin_module_names
is the correct one, however.
Yeah, sys.builtin_module_names
is the right complete list. It's the same special-casing (BuiltinImporter
), it's just that the file I linked is the input template to code generation that happens at CPython build time, resulting in an output config.c
module that has more built-in modules listed in the inittab. (This is because some built-in modules are conditionally built.)
Status update: https://github.com/astral-sh/ruff/commit/d8cf8ac2ef26bb630b43b095f61662173b2bac2f just added a hardcoded list directly into resolver.rs
, based on the builtin modules that I have locally on Python 3.12. We should still change this to be a generated Rust function similar to the one in https://github.com/astral-sh/ruff/blob/main/crates/ruff_python_stdlib/src/sys.rs.
Certain modules are imported by
BuiltinImporter
, which comes first insys.meta_path
by default (beforePathFinder
which implements the normal filesystem-based module resolution.) These modules will always take precedence over any module on the filesystem (i.e. first-party code can't ever shadow these modules, unless somebody starts rearrangingsys.meta_path
, which throws everything out the window and we can't reasonably support.)So our module resolver should special-case these modules to always resolve to typeshed, and never to user code. The module names are listed at https://github.com/python/cpython/blob/main/Modules/config.c.in#L35 -- currently they are
marshal
,_imp
,_ast
,_tokenize
,builtins
,sys
,gc
,_warnings
,_string
.(I also verified experimentally that I can't override these modules by creating e.g. a
gc.py
orsys.py
orbuiltins.py
file.)