Handling of "if TYPE_CHECKING:" blocks

mthuurne commented 4 years ago

It is sometimes necessary to have code in a module (or more rarely, in a class) that is only there for the benefit of a static code analyzer and is skipped at runtime. In particular, there is the constant typing.TYPE_CHECKING that type checkers can evaluate to True, while its runtime value is always False. This is supported by mypy and possibly other analyzers.

A common purpose of if TYPE_CHECKING: blocks is to contain imports that are only necessary for the purpose of type annotations. These might be circular imports at runtime. When these types are used to annotate arguments or return values, they are relevant to pydoctor as well: either to extract type information from annotations or because of L{name} tags inside the docstring, where pydoctor has to be able to look up the fully qualified version of name.

In other cases, if TYPE_CHECKING: blocks are used to perform trickery to make mypy accept code that is difficult to analyze. In these cases, pydoctor can be better off analysing the runtime version of the code instead. For details, read the comments in this Klein PR.

A related issue is that pydoctor doesn't handle duplicate definitions particularly elegantly, see #33. For example, if both the if and else branch define the same name, one of them will have a name with ␣0 (space + zero) appended and that patched name not only shows up visibly in the generated documentation, it also breaks hyperlinks to the corresponding pages, since it not only escapes the link itself (for example, klein.resource%200.html), but it also escapes the file name: it generates a file named klein.resource%200.html instead of klein.resource 0.html.

On the one hand, it would be nice to do the right thing in as many cases as possible. On the other hand, perfect code analysis is impossible and trying too hard to handle complex code will consume a lot of effort and complicate pydoctor's code. So we have to find a balance somewhere that makes pydoctor perform in a simple and predictable way that we can communicate to the end user and that they can adapt their code to.

tristanlatr commented 2 years ago

What about we introduce the following new option:

--eval-if="klein:typing.TYPE_CHECKING; False"

We would parse the string mod:TYPE_CHECKING;False into mod and TYPE_CHECKING;False, then use ast to parse the python code into a two statements, convert the first statement to string with astor, check that the second statement is a boolean. Record this into a dict[str, bool] somewhere, and at the time of visiting the ast.If node, convert the ast.If.test node to string and check whether it matches any of the custom values. This would allow us to write something like:

--eval-if="**:sys.version_info < (3,0); False"

For now, only False value would be actually useful, but with time, we might have a more powerful inference system and this design would allow us to write:

--eval-if="**:sys.version_info; (3,9)"

Which would be very useful to consider the whole project under a certain python version.

But having the feature to compare as string and store only booleans is still a good step forward. It will fix this issue for sure.

The question is does unparsing all ast.If.test expressions in a big project like Twisted is ok in terms of performance?

tristanlatr commented 2 years ago

Or maybe these configuration should only be accessible by overriding a system class variable at first. Since this will be an experimental feature that could be changed down the road.

class System(model.System):
    eval_if = {"**":{"typing.TYPE_CHECKING":False}}

tristanlatr commented 2 years ago

So we have to find a balance somewhere that makes pydoctor perform in a simple and predictable way that we can communicate to the end user and that they can adapt their code to.

What about the following rationale? @mthuurne

All statements in the 'orelse' block of If nodes and statements in the except handlers of Try nodes that would override a name already defined in the main 'body' (or Try 'orelse' or 'finalbody') are ignored.

Meaning that in the context of the code below, 'ssl' would resolve to 'twisted.internet.ssl':

try:
    from twisted.internet import ssl as _ssl
except ImportError:
    ssl = None
else:
    ssl = _ssl

See #589 for tests.

Then, we can teak TYPE_CHECKING on a case by case basis with eval_if.

tristanlatr commented 1 year ago

The workaround, which should be clearly documented is to use if not TYPE_CHECKING: condition when pydoctor should analyze the runtime version of the code. I believe this will not confuse mypy, if it does, a bug report should probably be open .

tristanlatr commented 11 months ago

Is this the only reason klein’s API docs are generated with Sphinx instead of pydoctor @glyph ?

glyph commented 11 months ago

Is this the only reason klein’s API docs are generated with Sphinx instead of pydoctor @glyph ?

I honestly didn't even realize that Klein's API docs were being generated by Sphinx :). I didn't set that up.

Happy to switch to Pydoctor if someone can do the PR.

tristanlatr commented 11 months ago

Correction: I’m not sure Klein’s API docs are even generated at all. Can’t find a link to the docs. @wsanchez

glyph commented 11 months ago

Correction: I’m not sure Klein’s API docs are even generated at all. Can’t find a link to the docs. @wsanchez

OK, in that case, let's just set up pydoctor. 90% sure that Klein's docstrings are epytext. At least, the ones I've written are :)

wsanchez commented 11 months ago

Klein uses epytext and the apidocs tox environment I added in klein#315 uses pydoctor.

I never understood Sphinx, so I never understood how the docs and docs-auto environments really work, or how docs get to RTD; never tinkered with that stuff significantly.

tristanlatr commented 11 months ago

I’ll open a PR to integrate pydoctor with the Klein’s Sphinx build.

twisted / pydoctor

Handling of "if TYPE_CHECKING:" blocks #174