python-trio / trio

Trio – a friendly Python library for async concurrency and I/O
https://trio.readthedocs.io
Other
6.12k stars 335 forks source link

Use less magic in constructing our public API exports, to make trio more intelligible to static analysis #542

Open ksamuel opened 6 years ago

ksamuel commented 6 years ago

Despite the well defined __all__, several important tools (e.g: pylint, mypy, jedi) cannot understand the code structure of trio. They then report error in the user code.

E.G:

On the following snippet, pylint will report trio not having the run() member, jedi won't allow code completion, vscode intellisense won't let you go to definition and mypy won't report any error if I try to add +1 after run()

import trio
...
trio.run(parent)

I understand this is inconvenient, as those tools should be able to pick up the __all__ declaration and use that instead of requiring your to type it all again, but they don't. Since they are the most popular in the Python ecosystem, and are used by a lot of editors (vi, sublime text, vscode, etc) but also in tasks (tox, travis, git hooks, etc), it's important that they work.

For now, that means putting # noqa a bit everywhere or disabling a lot of warning, which is not practical.

parity3 commented 6 years ago

FYI, there has been some work on this in the past. Maybe there could be a program to read import modules, introspect and spit out a new set of sources that are more static?

python make-nice-for.py --tool=pylint --output-path=new-trio-sources/

njsmith commented 6 years ago

Yeah, we should probably give up on most of this cleverness and switch to doing the more boring approach.

There was some discussion of this in the chat today also, starting about here: https://gitter.im/python-trio/general?at=5b1842e099fa7f4c064e93a0

This is probably best handled as an incremental set of cleanups, since there are a lot of different subtle bits of trickery here. "Does this cleanup help pylint/mypy/pycharm/jedi understand what's going on" is probably a good metric to use to decide where to start...

Also relevant: #475, which I think is a change we probably should do, and will reduce the number of auto-generated wrapper functions in trio/_core/_run.py.

njsmith commented 6 years ago

This comment describes how to write a strong test for pylint being able to understand our public API: https://github.com/python-trio/trio/pull/594#issuecomment-415336167

Zac-HD commented 6 years ago

Closed by #594 :tada:

njsmith commented 6 years ago

I like your enthusiasm, but I'm afraid we're not done yet :-). #594 made the main trio namespace work with tab completion, but we still need to figure out what to do with the submodules, and think about whether we can get rid of any of the other runtime namespace manipulation that we do...

Zac-HD commented 6 years ago

Haha, I actually just went and updated my "add type hints" branch (https://github.com/python-trio/trio/compare/master...Zac-HD:type-hints)... it's still mostly about ignoring analysis failures, so I was definitely too enthusiastic :rofl:

njsmith commented 6 years ago

So if I understand the status correctly: the main trio, trio.testing, trio._core namespaces are now amenable to static analysis. Meaning: common tools can look at them and (a) generate a list of exported symbols, for tab completion, and (b) figure out where those symbols came from, so deeper analysis is possible if the place they're coming from is itself amenable to static analysis.

There are a few more public namespaces to look at:

Does that sounds like the current status?

Fuyukai commented 6 years ago

but there's no way

No way what?

Zac-HD commented 6 years ago

Yep, that's my understanding too.

trio.socket – this is messy right now. Probably we should replace most of the dynamic __all__ generation inside trio/_socket.py with a static list import list in trio/socket.py, similar to what we did for trio/__init__.py.

Sounds good to me. We might also need to make the namespace construction less magical, but we need static imports first anyway and can see what actually has to change after that.

The special case is constants like AF_INET, which more-or-less have to be dynamic. Maybe: move the dynamic re-export loop out of trio/_socket.py and into trio/socket.py; inside trio/_socket.py refer to these constants as _stdlib_socket.AF_INET etc. so mypy knows what's going on within that file anyway; and in trio/socket.py statically list some of the key constants like AF_INET, AF_INET6, etc.?

This also sounds good, though I'd be very cautious about statically listing dynamic constants - it might be the best way currently available, but we should leave some detailed comments about what's going on.

Maybe we should switch to static code generation?

Yes, but only if we have CI always check that there is no difference between the current output of the generator and the file tracked by git. This catches both stale files when the generator has been updated, and contributors trying to update the files by hand - both recipes for bugs down the line!

njsmith commented 6 years ago

@fuyukai good catch, thanks, I edited the post to add the missing words

Zac-HD commented 6 years ago

Just dropping a quick update on my type-hints branch: moving in the right direction, but plenty left to do.
Note that Mypy is only part of the scope of this issue!

Current Mypy output ``` trio/_path.py:183: error: "Type[Path]" has no attribute "absolute" trio/_path.py:187: error: "Type[_PathLike[Any]]" has no attribute "register" trio/_highlevel_open_tcp_stream.py:4: error: Module 'trio.socket' has no attribute 'getaddrinfo' trio/_highlevel_open_tcp_stream.py:4: error: Module 'trio.socket' has no attribute 'SOCK_STREAM' trio/_highlevel_open_tcp_stream.py:4: error: Module 'trio.socket' has no attribute 'socket' trio/_highlevel_open_unix_stream.py:3: error: Module 'trio.socket' has no attribute 'socket' trio/_highlevel_open_unix_stream.py:3: error: Module 'trio.socket' has no attribute 'SOCK_STREAM' trio/_highlevel_open_unix_stream.py:6: error: Module 'trio.socket' has no attribute 'AF_UNIX' trio/__init__.py:18: error: Module 'trio._core' has no attribute 'current_time' trio/__init__.py:98: error: Module has no attribute "__deprecated_attributes__" trio/testing/__init__.py:1: error: Module 'trio._core' has no attribute 'wait_all_tasks_blocked' trio/tests/test_util.py:90: error: Invalid base class trio/tests/test_socket.py:359: error: Module has no attribute "AF_INET" trio/tests/test_socket.py:360: error: Module has no attribute "AF_INET6" trio/tests/test_path.py:183: error: "Type[Path]" has no attribute "joinpath" trio/tests/test_highlevel_open_tcp_stream.py:6: error: Module 'trio.socket' has no attribute 'AF_INET' trio/tests/test_highlevel_open_tcp_stream.py:6: error: Module 'trio.socket' has no attribute 'AF_INET6' trio/tests/test_highlevel_open_tcp_stream.py:6: error: Module 'trio.socket' has no attribute 'SOCK_STREAM' trio/tests/test_highlevel_open_tcp_stream.py:6: error: Module 'trio.socket' has no attribute 'IPPROTO_TCP' trio/tests/test_highlevel_open_tcp_stream.py:101: error: Name 'trio.socket.SocketType' is not defined trio/tests/test_highlevel_open_tcp_listeners.py:165: error: Name 'tsocket.SocketType' is not defined trio/tests/module_with_deprecations.py:12: error: Module has no attribute "regular" trio/_subprocess/unix_pipes.py:60: error: "memoryview" has no attribute "__enter__"; maybe "__iter__"? trio/_subprocess/unix_pipes.py:60: error: "memoryview" has no attribute "__exit__" trio/_subprocess/unix_pipes.py:78: error: Module has no attribute "wait_writable" trio/_subprocess/unix_pipes.py:109: error: Module has no attribute "wait_readable" trio/_subprocess/linux_waitpid.py:52: error: Module has no attribute "spawn_system_task" trio/tests/test_highlevel_ssl_helpers.py:8: error: Module 'trio.socket' has no attribute 'AF_INET' trio/tests/test_highlevel_ssl_helpers.py:8: error: Module 'trio.socket' has no attribute 'SOCK_STREAM' trio/tests/test_highlevel_ssl_helpers.py:8: error: Module 'trio.socket' has no attribute 'IPPROTO_TCP' ```

So once those things are sorted out, we can run Mypy in CI! Then:

  1. remove as many type: ignore comments as possible (forcing the Any type hides errors elsewhere)
  2. ship the py.typed file, so downstream code can be checked for consistency with Trio's type hints
  3. add non-trivial type hints to Trio; profit.
Zac-HD commented 6 years ago

Gah, misclick - sorry!

Zac-HD commented 6 years ago

@njsmith says: Staircase review: [#656] should have fixed tab completion for trio.testing and trio._core... it'd be nice to extend the jedi/pylint tests to make sure that stays true :-)

Extending the tests from #594 to cover the changes in #656 would be a nice, self-contained contribution.

njsmith commented 6 years ago

Opened #699 to discuss the sub-issue of how to handle socket module constants.

njsmith commented 5 years ago

Updated todo list:

I think that's it for module-level exports. Then there are some classes that use dynamic shenanigans on the attribute level:

jmfrank63 commented 5 years ago

Hi I think I at least nominally covered the list. So if someone would like to take a look they are #751 #752 and #753 The tests seem to go fine, but we have to keep track of the symbols twice, once in the try/except import and second on the dynamic import. But since they are in one module next to each other that might be acceptable. We might add a comment for this.

njsmith commented 5 years ago

805 has landed! I think this means all our top-level stuff is now statically understandable, and all that's left are a few classes with dynamic attributes.

Can any of our static analysis experts here give us an update on what you think the next step should be?

jmfrank63 commented 4 years ago

@njsmith Just browsed again through the thread, but could not make out any task left. _abc.py seems to have been converted already as well as _ssl.py. Python is a great library, I would like to clean up if anything is left here. Otherwise I think I am turning to the docs and try to write some meaningful examples, looks like there are plenty of docs issues out there. I am not strong at this currently but I guess I'll grow taking on this. If you could point me to something that is either burning or considered a good issue I'd be glad.

njsmith commented 4 years ago

@jmfrank63 I think the next steps on static analysis stuff are:

sarnold commented 3 years ago

Just leaving a note here because I was trying to figure out why the urwid tests wouldn't run. The upshot is an attribute error: AttributeError: module 'trio' has no attribute 'hazmat' breaks the tests with 0.18.0 but 0.17.0 works fine.

njsmith commented 3 years ago

@sarnold I don't think that's related to the static analysis that this thread is discussion -- it's just that trio.hazmat got renamed to trio.lowlevel back in v0.15, and in v0.18 we removed the backcompat support. v0.15/v0.16/v0.17 gave a deprecation warning everytime you accessed trio.hazmat, so I guess the urwid devs didn't notice the warning? Running tests with -Werror or similar can help with that.

pquentin commented 3 years ago

(And there's even a urwid PR: https://github.com/urwid/urwid/pull/439)

ntamas commented 3 years ago

The Trio loop was contributed to urwid by me; I guess that the urwid devs don't use Trio regularly. I have sent a PR months ago but it was not merged at the time of writing.

takluyver commented 3 years ago

I've just run into the distinction between SocketType and _SocketType - there seems to be no way to use it as a type annotation so that PyCharm is happy.

Is the distinction just to prevent people constructing SocketType()? Would something like this be acceptable:

class _SocketType(SocketType):
    def __init__(self, sock, _internal=False):
        if not _internal:
            raise TypeError(...)

Of course this makes it easier for someone to do what they're not meant to, but hopefully it's still clear that that's 'off label' usage and not to be relied on.

njsmith commented 3 years ago

Is the distinction just to prevent people constructing SocketType()? Would something like this be acceptable:

No, we actually have a more standard solution for that. IIRC the weird SocketType/_SocketType contortions were to support our built-in network "monkeypatching" system: https://trio.readthedocs.io/en/stable/reference-testing.html#trio.abc.SocketFactory

So the goals were:

On further thought, maybe an alternative would be:

I guess this would need a deprecation b/c it would break any existing code that's doing class MyFakeSocket(trio.socket.SocketType), but the fix is trivial and I doubt there are many people doing that, so it could probably be a pretty quick transition.