python-trio / trio

Trio – a friendly Python library for async concurrency and I/O
https://trio.readthedocs.io
Other
6.1k stars 332 forks source link

Mock network for testing #170

Open njsmith opened 7 years ago

njsmith commented 7 years ago

@glyph gave a great talk at PyCon this year that involved using a virtual (= in memory, in python) networking layer to build a virtual server to test a real client.

As far as the virtual networking part goes, we have some of this, e.g. #107 has some pretty solid in-memory implementations of the stream abstraction. But it would be neat to virtualize more of networking, e.g. so in a test I can have tell my real server code to listen on some-server.example.org:12345 and tell my real client code to connect to that and they magically get an in-memory connection between them.

Fixing #159 would reduce the amount of monkeypatching needed to do this, but OTOH I guess monkeypatching the whole trio.socket module is probably the simplest and most direct way to do this anyway... or we could hook in at the socket layer (have it check a special flag before allocating a new socket) or at the high-level networking layer (open_tcp_stream checks a special flag and then returns a FakeSocketStream etc.). Fundamentally there's going to be some global state because no-one will put up with passing around the whole library interface as an argument everywhere, literally every async library has some kind of contextual/global state they use to solve this problem, and I can't think why it would matter a huge amount whether that's from twisted.internet import reactor vs asyncio._get_running_loop() vs trio.socket.socket(). So I'm leaning towards not worrying about monkeypatching. (The one practical issue I can think of is if someone is trying to use trio in two threads simultaneously, then this will cause some problems because the monkeypatch would be global, not thread-local. Maybe we can make it thread-local somehow? Or maybe we just don't care, because there really isn't any good reason to run your test suite multi-threaded in Python.)

Oh, or here's a horrible wonderful idea: embed the fake network into the regular network namespace, so like if you try to bind to 257.1.1.1 or example.trio-fake-tld then the regular functions notice and return faked results (we could even encode test parameters into the name, like getaddrinfo("example.ipv6.trio-fake-tld") returns fake ipv6 addresses...). Of course this would be a bit of a problem for code that wants to like, use the ipaddress library to parse getaddrinfo results. There are the reserved ip address ranges, but that gets dicey because they should give errors in normal use... In practice the solution might be to stick to mostly intercepting things at the hostname level (e.g. open_tcp_stream doesn't even need to resolve anything when it sees a fake hostname), though we do need to have some answer when the user asks for getpeername. I guess we could treat all addresses as regular until someone invokes this functionality with a hostname, at which point some ip addresses become magical.

BUT there would also still very much need to be a magic flag to make sure all this is opt-in at the run loop level, to make sure it could never be accidentally or maliciously invoked in real code, to avoid potential security bugs. At which point I suppose that magic flag could just make all hostnames/addresses magical. Oh well, I said it was a horrible (wonderful) idea :-). The bit about having hostnames determine host properties might still be a good idea.

There's also a big open question about how closely this API should mimic a real network. At the very least it would have to provide the interfaces to do things like set TCP_NODELAY (even as a no-op), for compatibility with code made to run on a real network. But there are also more subtle issues, like, should we simulate the large-but-finite buffers that real sockets have? Our existing in-memory stream implementations have either infinite buffering or zero buffering, both of which are often useful for testing, but neither of which is a great match to how networks actually work... and of course there are also all the usual questions about what's kind of API to provide for manipulating the virtual network within a test.

I suspect that this is a big enough problem and with enough domain-specific open questions that this should be a separate special-purpose library? Though I guess if we want to hook the regular functions without monkeypatching then there will need to be some core API for that.

Prerequisite: We'll need run- or task-local storage (#2) to store the state of the virtual network.

njsmith commented 7 years ago

Some notes:

I think the way to do this is to provide a hook for the socket and socketpair constructors, and for getaddrinfo and getnameinfo, and put the bulk of the code in a separate library. (The other constructors like from_stdlib_socket would remain unhooked. This is probably important for subprocess support, as well as being the Right Thing I think.)

The getaddrinfo hook should probably only be called after we check the numeric fast path and go through idna encoding, so it would always receive the hostname as a bytestring. (It can also receive the hostname as a numeric IP, iff the service is a symbolic name.)

A useful fake network probably needs to include features like:

I'm not sure to what kind of behavior plugin API we want. Something like the memory_stream objects, with callbacks that fire to pump data? An active "switchboard" that gets notified of connection attempts etc and then decides what to do? Just a small set of canned behaviors + for actual interaction you can put up a fake server to talk to with whatever complex behavior you want? (This could even use an extended interface, e.g imagine a version accept that let you peek at the connection attempt and then decide what to do with it.)

njsmith commented 7 years ago

It would be neat to have a built in randomized/flakey-network mode; think like a 1 button hypothesis test suite for network protocol annoyances.

Possibly this might work like: run the code once to calibrate how much communication happens, and then rerun N times with random delays injected, or if more ambitious and requested, with network failures.

One might want to have ways to tune the kind and distribution of random delays (breaking up data by introducing a virtual delay after every byte is a good stress test, but might be too slow for larger tests). Some kind of hypothesis-like tuning towards "weird" patterns would also be good (e.g. adding tons of delays on one connection but not another, or on one direction but not another).

It would also be neat if there were a way to write down a pattern of injected faults, to extract repeatable regression test cases from a randomized run, and potentially to allow particular fault patterns or fault pattern generators to be written by hand.

njsmith commented 7 years ago

I added hooks!

253 implements the core hooks needed to enable this. Specifically, I decided to allow hooking the main socket constructor (when fileno= is not given), the is-this-a-trio-socket logic, and getaddrinfo/getnameinfo.

Design notes on socket hooks

An interesting problem is what to do with the other socket constructors – socketpair, socket(fileno=...), fromfd, etc. I ultimately decided that:

If it turns out that being able to hook these "real socket" constructors is useful, then in the future we could add a wrap_real_socket method to SocketFactory, that gets called after the trio socket object is constructed and can wrap it or whatever you want. I guess this would only need to hook from_stdlib_socket since everything else funnels through there.

Design notes on hostname resolver hooks

It just lets you hook getaddrinfo and getnameinfo; see #254 for some discussion.

Currently, we optimize out the case getaddrinfo("127.0.0.1", 80) (skip calling the custom resolver), but we do call the custom resolver for getaddrinfo("localhost", 80) and for getaddrinfo("127.0.0.1", "http"). It might be nice to handle some more cases in the generic code, e.g. guarantee that they never see symbolic service names (because we resolve them first). I guess we could make sure they never see AI_NUMERICHOST or AI_NUMERICSERV too. But for now this just seemed too complicated -- to resolve the service name ourselves we'd need to use getservbyname in a thread, but it isn't thread-safe, and also it wants an argument "tcp" or "udp", even though these are supposed to always give the same results, and ... ugh, whatever, I decided to punt for now, and custom resolvers just have to be prepared to deal with this stuff as well as they're able. We can always tighten it up later (there's no backcompat break if we tighten things up so that the custom hooks see more regular input).

Next step

So the next step here is to make a package providing a sweet fake network simulator that hooks into trio.

dmcooke commented 7 years ago

As a suggestion, add a trio.abc.SocketType ABC, which trio._socket._SocketType should inherit from, and for which SocketFactory instances should register their socket class with.

(Alternatively to requiring registration, call SocketType.register on the type of socket returned by the socket factory in trio.socket.socket, as that's the only socket constructor; ABC registration is cached, and should be fast.)

This would allow isinstance(sock, SocketType) and issubclass(socktype, SocketType) to work as before.

I've been using functools.singledispatch for various tasks; it relies on issubclass for determining which function to dispatch to (the choice is cached, keyed by the object type). You can't do subclass-testing with only is_trio_socket; I would end up having to write the above ABC, and require users of my library to use my socket function instead of trio's, to ensure the socket class is registered. ('My users' = 'me', so it's not a big deal :-))

Additionally, I've also been playing around with mypy for static type checking; it really only understands isinstance for dynamic type refinement, so in this case you want to be able to do isinstance(sock, SocketType).

njsmith commented 7 years ago

@dmcooke: Oh huh yeah, that's a good idea. Any interest in putting together a PR? is_trio_socket etc. aren't in a release yet, so if we do it soonish we can get this into 0.2.0 (see #299) and avoid some churn.

I agree it's not quite 100% obvious what kind of checking we should do for class registration. I guess there are three options (am I missing any?):

Right now we just trust the SocketFactory implementor to implement their is_trio_socket method correctly. It's not 100% obvious that this carries over to an ABC-based design though, b/c right now the very existence of the is_trio_socket method forces SocketFactory authors to think about this issue, whereas with the ABC-based design there's nothing to remind you if you get it wrong.

Implicitly registering classes makes me nervous though, just on vague general principles. I guess a concrete case where it could go wrong would be if, like, someone forgets a return, and then NoneType ends up registered as a SocketType object, and then this causes some weird effect later?

It's true that implicitly registering and checking are about the same speed, because the first thing in ABCMeta.register is an if issubclass(new_subclass, self): return. They're both a bit slow (~800 ns on my laptop, versus ~80 ns for a regular isinstance check). But this is still pretty fast in the grand scheme of things.

SocketStream and SocketListener do have explicit checks that the object they get is a trio socket, so that might be enough to catch anyone who messes this up.


Here's another possible design to consider: we could say that SocketFactory has to provide an is_socket_class method, and then do something like:

class _SocketTypeMeta(type):
    def __subclasscheck__(self, class_):
        return class_ is _InternalSocketType or current_sfactory.is_socket_class(class_)

    def __instancecheck__(self, obj):
        return self.__subclasscheck__(obj.__class__)

class SocketType(metaclass=_SocketTypeMeta):
    pass

I guess the main advantage of this is that it maintains the invariant that only the currently registered SocketFactory's objects count as sockets. (It's also slightly faster, like ~2x, but this probably doesn't matter much. And actually in real code fetching the current factory out of thread-local storage probably makes it slower -- I didn't implement that in my toy benchmark.) Is that a useful invariant? Generally I think the pattern is that people will register a SocketFactory exactly once at the beginning of a run and then leave it in place until the end of the run. And if they don't, for some reason, then there could be socket objects floating around from the old factory, and maybe they should continue to count as socket objects? OTOH they probably shouldn't in new runs. Meh. It probably doesn't matter that much either way.

Also, it lets me skip worrying about whether the class should get moved to trio.abc. (The name is a bit meaningless really... it's to match socket.SocketType which is documented to be the type of stdlib socket objects, but (a) probably most users don't know this, and (b) the documentation is a lie anyway – the actual type is socket.socket, so the real experts also don't know it.)

njsmith commented 7 years ago

On further thought... let's keep it simple and just do:

class SocketType:
    def __init__(self):
        raise TypeError("use trio.socket.socket() to create a SocketType object")

and then the way you mark your class as being a trio socket is you just write class MySocket(trio.socket.SocketType). It's not fancy but it gets the job done.

njsmith commented 7 years ago

Switching from is_trio_socket to isinstance(..., trio.socket.SocketType) is done in #326.

njsmith commented 6 years ago

This is a really excellent talk on using simulated networks etc. for robustness testing in a real system: https://www.youtube.com/watch?v=4fFDFbi3toc

asmodehn commented 6 years ago

Related to this topic, I am trying to mock sockets with mocket (see https://github.com/mindflayer/python-mocket/issues/74#issuecomment-410617959), but I cannot pass this typecheck : https://github.com/python-trio/trio/blob/master/trio/_core/__init__.py#L43

Shouldn't this part also use a isinstance() instead of type()== to at least be able to make use of the inheritance tree ? Is there already a supported way to mock sockets when using trio that I am just not aware of ?

njsmith commented 6 years ago

The reason for the exact type check is that Trio needs the socket that's passed in be a real socket that it can pass into low-level OS syscalls – merely implementing the Python-level socket API isn't enough. For example, if we used isinstance to check, then it would let people pass in a ssl.SSLSocket object, and that wouldn't work at all, because it quacks like a socket at the Python level but you can't expect to pass it into epoll and get something sensible out.

However, looking at the mocket issue thread, that's not your problem... that check makes sure that the object passed in matches socket.socket. You're passing in a mocket object... and mocket has monkeypatched socket.socket so that it refers the type of mocket objects, and that test passes :-). The actual problem is a little more subtle: a trio socket object has lots of methods that just wrap stdlib socket methods (e.g. await triosock.send(...) calls stdlibsock.send(...) with some boilerplate around it to integrate it into the trio I/O loop). Most of these methods are generated at import time using some tricky metaprogramming, via this function:

https://github.com/python-trio/trio/blob/eeafa1e34550b52f029e8c26cd78222120d9c8aa/trio/_socket.py#L614-L615

And that function actually looks up the socket.socket.send method at import time (that's the getattr call there) and saves the method object, so it doesn't have to look it up again later. So when you call await triosock.send(...) that always calls the real stdlib send method, and then that fails because the real stdlib send method has no idea what to do when it's called on a mocket object.

This is kind of tricky to fix... in theory we could switch to looking up the method on each call, instead of doing it once at the beginning, but that would add extra per-call overhead to some of the most performance-sensitive methods in trio, even for all the people who aren't trying to use mocket. Maybe the difference wouldn't be that bad?

Is there already a supported way to mock sockets when using trio that I am just not aware of ?

Trio does have a standard, supported API to integrate with mock socket libraries: see set_custom_hostname_resolver and set_custom_socket_factory (docs). However, this works at a slightly higher level than what mocket wants: trio's mocking API lets you set up to use a mock trio socket object (i.e., an object that has async methods), while mocket wants to define a mock stdlib socket object (i.e., an object with sync methods). The advantage of doing it trio's way is that it makes it easy to do things like introduce random network delays, and block on receive without tricky hacks. The downside is that while you can use this API to implement a socket mocking library, I don't think anyone has actually done that yet :-/

mehaase commented 5 years ago

Oh, or here's a horrible wonderful idea: embed the fake network into the regular network namespace, so like if you try to bind to 257.1.1.1 or example.trio-fake-tld then the regular functions notice and return faked results (we could even encode test parameters into the name, like getaddrinfo("example.ipv6.trio-fake-tld") returns fake ipv6 addresses...).

There is a TLD reserved for testing purposes: .test. This might be pretty surprising the first time somebody stumbles across it, but a nested domain like .trio.test would help clarify that it is not a normal domain.

There are the reserved ip address ranges, but that gets dicey because they should give errors in normal use...

Are you thinking of the RFC-5737 reserved blocks? I don't actually get any errors when I try using them (i.e. ping or nc to an address in the block), except for the expected timeout. Even though they are not assignable blocks, the Linux kernel still treats them like routable addresses. So I don't think Trio would break anything if it intercepted some of those addresses and treated them specially.

Although as you point out, there's no need to do this, and maybe it's better to intercept based on hostname alone. I guess this is a design decision that can be made by anybody who wants to implement a mock network library. Do you know of anybody actively working on such a library?

njsmith commented 5 years ago

@mehaase Yeah, don't read too much into my year-and-a-bit-ago initial thinking-out-loud :-). At that point I hadn't even figured out yet mock networking was something that be built into trio's regular APIs (in which case it would need to be triggered by some kind of magical arguments), or something that you had to turn on. Now we've settled on it being something you turn on, so that discussion is much less relevant.

I guess this is a design decision that can be made by anybody who wants to implement a mock network library.

Yeah, exactly.

Do you know of anybody actively working on such a library?

Not currently, no!

njsmith commented 5 years ago

Some existing popular mock libraries that might be useful for inspiration (or at least narrowing down use cases):

For these kinds of use cases, it sounds like all you really need is a way to wire up a virtual network, so you can run a virtual web server or something and speak HTTP at it. (getaddrinfo resolves hosts to random-but-consistent addresses, servers can bind to any address, connect hooks you up with whatever server is bound to that address, etc.) I guess that would already be useful, even without fancy stuff to simulate adverse network conditions.

njsmith commented 5 years ago

Note: as soon as this is working, the next thing people will run into is how to get TLS working over the virtual network. trustme gives a lot of the core tools you need, but ideally the UX would be that when you drop in your pytest fixture to enable virtual networking, it also configures that test to trust a fake temporary CA, and makes it easy for the mock servers to grab certificates using that CA. The main obstacle here is that there isn't any central way to configure CA trust in Python... I don't especially want to have to invent one, but we'll see...

njsmith commented 1 year ago

Here's a fake network crate for tokio; might have ideas we can steal: https://docs.rs/turmoil/