Open JP-Ellis-KPMG opened 2 years ago
BTW, there are official rules for what's considered a re-exporting of symbols, which is followed by mypy
and pyright
for example: https://github.com/python/typing/blob/main/docs/source/libraries.rst#library-interface-public-and-private-symbols
This applies not only to __init__
though, so not sure if I should create a separate issue or not.
There was also some parallel discussion in gitter.
@pawamoy said:
Hey :) indeed currently Griffe only marks as "exported" the names listed in
__all__
. When checking if an object is considered public, these "exports" are then used to tell whether an object is explicitly exported. If there are no exports (__all__
is undefined), Griffe falls back on an implicit check, and the mere presence of an object is enough (given it is not private, i.e. it doesn't start with_
).I initially decided not to consider "from x import a as a" as explicitly exported, because:
- I don't like it :D
- It's ugly :D
- More seriously, we have one obvious way:
__all__
.- "as" imports makes you noqa: F401 the imports, while
__all__
is understood. Maybe linters can support "as" imports thoughIn any case, I didn't think of everything, so this can be revisited / improved. I don't obviously agree with Mypy's rules though. Mostly, I want to have something that makes sense, without confusing users (too much?) :)
So I have mainly 3 reasons why I think this should be supported despite your reasons:
mypy
thing, it is part of the typing PEP484 (even if it is mentioned only for stub files), this is an official thing, ignoring it doesn't make it go away.__all__
: from x import *
. For __init__.py
this makes a lot of sense in many cases, having to explicitly import and re-export every symbol in every sub-module is not only pure bloat, it is also error-prone, as if you create more symbols in a sub-module, you need to remember to go to the __init__.py
and re-export it. Unless I'm missing some technique to deal with this case more gracefully, in which case I would love to hear about it, if not as a workaround until from x import *
is supported by mkdocstrings ;)About other linters not supporting it, yes, I agree it is annoying, but because of the reasons above, I think they should be, and I'm willing to open issues in flake8
and pylint
if they can't deal with this properly.
Thanks for all the info!
Just to play devil's advocate a bit (don't read it as "I really don't want to support as
/*
imports", it's just to bring nuance):
https://github.com/python/typing/blob/main/docs/source/libraries.rst#library-interface-public-and-private-symbols Type checkers can use the following rules to determine which symbols are visible outside of the package.
Emphasis on "can". To me it sounds like type checkers have a choice in how they consider an object to be public/exported or not. If they want, they can follow these few rules described in the document, but apparently they don't have to.
This explicitly says it's for stub files. The only mention of non-stub files is:
Just like in normal Python files, submodules automatically become exported attributes of their parent module when imported.
So to me, this again is not an authoritative document to describe how to determine whether an object is public/exported or not. I don't think such a document exist yet.
Anyway. Even if we don't have specs, we have conventions, and from a import x as x
and from b import *
seem to be rather common conventions to mark x
and all public objects from b
as public. So let's try and support them :slightly_smiling_face:
About your third point, there is a way: if you define __all__
in your submodules, you can import it to extend a parent module __all__
list. Example:
package/
__init__.py
module.py
# module.py
__all__ = ["X", "Y", "Z"]
...
# __init__.py
# __all__ was by the way specifically made for wildcard imports
from module import *
# we could also avoid using the wildcard import,
# but I understand it's less convenient: from module import X, Y, Z
# import module's __all__!
from module import __all__ as _module_all
# use it to extend your parent __all__
__all__ = ["A", "B", "C", *_module_all]
# other operations are supported like:
# __all__ += _module_all
About the error-proneness of having to remember to explicitly import and add to __all__
, I'd argue that automatically importing and re-exporting everything with wildcard imports is even more error-prone, and much less clean from a public API point of view. I've seen monsters while working on Griffe, believe me. Explicit is better than implicit :blush:?
Anyway, I think wildcard imports are OK when combined with __all__
.
One more argument in favor of using __all__
when using wildcard imports.
If you use wildcard imports without __all__
to restrict the set of objects made available in the importing module, you will expose more objects than what the conventions suggest. The convention is only respected by static analysis tools, not runtime interpreters!
package/
__init__.py
module.py
import third_party # by convention, not exported
class A: # by convention, exported
...
# __init__.py
# by convention, only supposed to export A,
# but at runtime, third_party is exposed too!
from module import *
Anyway. Even if we don't have specs, we have conventions, and from a import x as x and from b import * seem to be rather common conventions to mark x and all public objects from b as public. So let's try and support them π
:tada:
About your third point, there is a way: if you define
__all__
in your submodules, you can import it to extend a parent module__all__
list.
Yeah, I thought of that, but that just means now I need to move all the duplication to the submodules, not that I get rid of the duplication. I can already mark symbols private in submodules by prefixing them with _
, a very widely-used convention (and also supported by all major type-checkers and linters).
If I can't get rid of the duplication, I rather have it in the __init__.py
file which is much more meta and avoid noise in the files that actually have the useful code.
But thanks for the suggestion!
About the error-proneness of having to remember to explicitly import and add to all, I'd argue that automatically importing and re-exporting everything with wildcard imports is even more error-prone, and much less clean from a public API point of view. I've seen monsters while working on Griffe, believe me. Explicit is better than implicit π?
I agree and can see that for monster legacy projects doing this is madness, but for a small greenfield project trying to use modern Python features, I think for __init__.py
files, it makes a lot of sense and makes things much simpler, again, just mark anything you don't want to export in a module as private and that's it, that is also better than having everything public anyway. If something is not intended to be public it shouldn't be in the first place :)
If you use wildcard imports without
__all__
to restrict the set of objects made available in the importing module, you will expose more objects than what the conventions suggest. The convention is only respected by static analysis tools, not runtime interpreters!
True. For the cases where you use a type-checker, you code will never run in runtime, so that issue should never happen too. This mostly applies to all static checks performed by a type-checker, unless you add runtime ininstance()
checks for every type hint you use, your code will still explode at runtime, so even when I see your point, when using type checkers I see this as a non-issue (maybe in terms of performance, I'm not sure if doing import *
could have some implication.
In any case, if this feature is controversial, or could be useful only to a subset of users, maybe it can be added behind an option to let users decide what's best for them.
Ha, this is not an easy topic. So many things to consider here.
Let me try to start again. First I'll describe what Griffe currently does. Then I'll share a few observations.
Griffe has a concept of "exported member". A module member is considered:
__all__
__all__
is undefined, the member is available at runtime (for example not guarded behind if TYPE_CHECKING
), and the member is an alias (defined elsewhere), except if it's a submodule, unless it was also imported in the current moduleA class member is considered implicitly exported, the same way as module members are considered implicitly exported. It is never considered explicitly exported because we don't/can't use __all__
in class bodies.
This "exported" marker is then used in a few places, 3 of of them being of interest to us here.
__all__
is defined in the wildcard imported module, add all objects referenced in the imported module's __all__
to the importing moduleI see a first issue here: members starting with _
are considered implicitly exported. This is wrong: actual wildcard imports will not import these objects at runtime.
Between the same old and same new module, if a member disappeared, and it was either a submodule, or it was exported (explicitly or implicitly), we consider its disappearance a breaking change. Again, it's wrong here to consider that a private object's disappearance makes a breaking change. I didn't spot this issue before because the code that checks breaking changes has an earlier condition on the object names: it skips private objects anyway.
Griffe objects have an is_public()
method that tell if an object is public or not. An object is:
public
attribute is true_
) and it's not importedGriffe gained this method later on, to be used by mkdocstrings-python.
A few observations.
I think Griffe's internals/API is a bit confusing. "Exported" is usually used to say "public" (we export an object to make it public), but in Griffe it's rather used to say "the object is exposed in regard to the wildcard import runtime mechanism". So maybe I should rename "exported" to "wildcard exposed".
Then we should fix the issue mentioned above about private members being considered implicitly exposed. And since wildcard imports are not configurable, we would drop the distinction between explicitly and implicitly exposed. An object is either exposed, or it's not.
Finally the code that checks breaking changes should use the "public" API instead of the "exposed" one.
Now to go back to the subject at hand: whether to support the two conventions from a import x as x
and from a import *
. These conventions ask us to consider x
as public, and all objects imported from a (following regular wildcard import mechanisms) public too, respectively.
Thanks to the above analysis, and the changes we will make, I think it is safe to support them :+1:
from a import x as x
was previously checked by, lets call it the Breaking Changes Detector (BCD), because it was "implicitly exported". Now it will keep being checked because it will be considered public :white_check_mark: from a import _x as _x
(previously worked for wrong reasons) :white_check_mark: from a import *
, the object expansion stays the same (though fixed not to include private members when __all__
is undefined), and the BCD just applies its logic to all expanded objects :white_check_mark: My comment above about convention vs runtime was just me being confused, or the official docs being unclear. "Imported by wildcard" and "considered public" are not the same thing, and we most probably don't ask type checkers to use the "public" logic when expanding wildcard imports.
Just a final note: Griffe is not a type-checker, so type-checker recommendations do not always apply to us. In this case it seems like we can apply them safely :+1:
Thanks for the very in-depth analysis! Indeed it seems like there is a lot to consider, but I'm glad it seems like these conventions fit into the current Griffe model. In particular I'm not sure I understand the difference between public, exported and exposed are this context.
Also now I'm curious about what this BCD you are talking about is! Is is a tool that one can use to check if some changes break the public interface? Or is it just some internal implementation detail from Griffe that's not exposed to "regular users"?
Just a final note: Griffe is not a type-checker, so type-checker recommendations do not always apply to us. In this case it seems like we can apply them safely π
I understand technically this is true, but if your documentation doesn't match what the type checker thinks, this documentation won't be all that useful. :innocent:
I understand technically this is true, but if your documentation doesn't match what the type checker things, this documentation won't be all that useful. π
Of course :smile: (Just to give one example, type-checkers will apparently only read stubs, while we have to read both regular modules and their stubs, then merge their contents :slightly_smiling_face:)
Also now I'm curious about what this BCD you are talking about is! Is is a tool that one can use to check if some changes break the public interface? Or is it just some internal implementation detail from Griffe that's not exposed to "regular users"?
Glad you asked :sunglasses: It's definitely exposed to users, see our docs here: https://mkdocstrings.github.io/griffe/checking/ :relaxed:
This is amazing, thanks a bunch!
Just for completeness, here is another more or less official mention to how public and private symbols should work for libraries:
Problem
The
__init__.py
is sometimes used to re-export functions and classes defined within sub-modules, generally for visibility within the library and also with the auto-completion. It would be nice if the documentation showed this as well.Solution
Ideally, I would like an option that shows explicit imports/re-exports located within
__init__.py
. Based on other linting conventions, the way this is done is to be explicit with anas
even if it is tautological, though may not always be.As for the implementation, I like the way Rust documentation shows re-exports in a section dedicated for re-exports. This makes it clear that the class or function is defined elsewhere but has been made available within the current namespace. Here's an example from the popular Clap library:
Alternatives
I tried looking at the options available, but could not identify anything that would provide the functionality I need.
Another alternative would be to have the
gen-files
script handle this, but the implementation does not appear to be easy.Extra
The ability to document re-exports I think could be implemented with the following options if there might be a need for that, though I struggle to see the use-case for any of them other than
explicit-init
.explicit-init
: This would only document explicit re-exports (i.e., ones that use theas
notation) within__init__.py
files.explicit-all
: This would document explicit re-exports, but within any Python file.implicit-init
: This would document all imports (whether they use the explicitas
notation or not), but only within__init__.py
implicit-all
: As withimplicit-init
, but for all files.