Cannot test TLS or TLS-adjacent platform features

davidben commented 5 years ago

Filing this issue per @foolip's request since this routinely comes up. I don't have permissions to add labels, so could someone add type:untestable to this?

WPT doesn't have good mechanisms for testing network-related web platform features that sit below the HTTP abstraction. In particular, TLS and TLS-adjacent features are untestable by WPT.

To test something, at minimum the server must enable it. wptserve uses Python with a dependency on the system environment. New features get into OpenSSL, then OpenSSL makes a feature release, then Python wraps the API and makes a feature release, then OS vendors pick up those releases, then OS vendors make a feature release, then old OS releases cycle out. This is a huge pipeline. wptserve still doesn't reliably have the 11-year-old TLS 1.2 available, despite TLS 1.0/1.1's planned removal.

But enabling it isn't enough. Testing also requires invalid values (invalid signatures, etc.) or controlling the timing of various events. For instance, 0-RTT is inherently a race condition. Depending on how preconnect and certificate verification resolve, 0-RTT may not happen because the optimization was irrelevant. To test it reliably in BoringSSL, our test harness waits for data before sending ServerHello, though the spec describes this as invalid because it can deadlock.
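To illustrate the "invalid values" point, here is a minimal sketch in Go of producing a certificate with a deliberately broken signature. It is not how BoringSSL's test harness or wptserve works; the helper names (`selfSignedDER`, `corruptLastByte`, `signatureValid`) and the subject name `example.test` are hypothetical, and only standard-library calls are used.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// selfSignedDER generates a throwaway ECDSA P-256 key and returns a
// self-signed certificate for it in DER form. (Hypothetical helper.)
func selfSignedDER() ([]byte, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: "example.test"}, // assumed name
		NotBefore:    time.Now().Add(-time.Hour),
		NotAfter:     time.Now().Add(time.Hour),
		KeyUsage:     x509.KeyUsageDigitalSignature,
	}
	return x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
}

// corruptLastByte returns a copy of der with one bit flipped in its final
// byte. The signature is the last field of a certificate, so this damages
// the signature while leaving the rest of the structure parseable.
func corruptLastByte(der []byte) []byte {
	bad := append([]byte(nil), der...)
	bad[len(bad)-1] ^= 0x01
	return bad
}

// signatureValid parses der and checks the certificate's signature against
// its own public key.
func signatureValid(der []byte) bool {
	cert, err := x509.ParseCertificate(der)
	if err != nil {
		return false
	}
	return cert.CheckSignature(cert.SignatureAlgorithm, cert.RawTBSCertificate, cert.Signature) == nil
}

func main() {
	der, err := selfSignedDER()
	if err != nil {
		panic(err)
	}
	fmt.Println("original verifies:", signatureValid(der))
	fmt.Println("corrupted verifies:", signatureValid(corruptLastByte(der)))
}
```

A test server would then present the corrupted certificate and the test would assert that the browser rejects the connection. Controlling timing (as with 0-RTT) needs hooks inside the handshake itself and is not covered by this sketch.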

TLS is a protocol and not just headers + data, so the kinds of behaviors we need are more complex. In BoringSSL, we have a custom TLS stack that we patch quirks into as needed. In general, to test a particular layer, you need to break it apart. Testing HTTP parsing requires custom HTTP serializations, etc.
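As a rough sketch of what "custom HTTP serializations" means in practice, the Go program below bypasses any HTTP library on the server side and writes hand-crafted bytes to the socket, here a response with two conflicting Content-Length headers. The helpers `serveRawOnce` and `fetchRaw` are hypothetical names for illustration, not part of wptserve.

```go
package main

import (
	"fmt"
	"io"
	"net"
)

// serveRawOnce listens on a loopback port, answers the first connection with
// exactly the bytes in raw (no HTTP library involved), then shuts down.
// It returns the address to connect to.
func serveRawOnce(raw []byte) (string, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	go func() {
		defer ln.Close()
		conn, err := ln.Accept()
		if err != nil {
			return
		}
		defer conn.Close()
		// Consume the (small) request before replying; its content is ignored.
		buf := make([]byte, 4096)
		conn.Read(buf)
		conn.Write(raw)
	}()
	return ln.Addr().String(), nil
}

// fetchRaw sends a minimal request over a plain TCP connection and returns
// every byte the server sent before closing the connection.
func fetchRaw(addr string) ([]byte, error) {
	conn, err := net.Dial("tcp", addr)
	if err != nil {
		return nil, err
	}
	defer conn.Close()
	if _, err := io.WriteString(conn, "GET / HTTP/1.1\r\nHost: test\r\n\r\n"); err != nil {
		return nil, err
	}
	return io.ReadAll(conn)
}

func main() {
	// Conflicting Content-Length values: a well-behaved HTTP serializer
	// would not produce this, which is why the bytes are written by hand.
	raw := []byte("HTTP/1.1 200 OK\r\nContent-Length: 5\r\nContent-Length: 7\r\n\r\nhello")
	addr, err := serveRawOnce(raw)
	if err != nil {
		panic(err)
	}
	got, err := fetchRaw(addr)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%q\n", got)
}
```

The same idea applies one layer down: testing TLS behavior needs a server that can emit handshake messages a production TLS library would refuse to generate.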

Python isn't a good vehicle for that. Network protocols need crypto. Python does not have good crypto support in the standard library. While one could implement it in pure Python, that wouldn't meet production performance and security needs, so no one does it outside of toy libraries. Instead, people wrap over native code, which brings in the system dependency issues. Python also lacks a good networking and concurrency story, though I hear it's gotten better with Python 3? (I'm mostly familiar with Python 2.)

I would advocate Go here. It is reasonably high-level (let's not write C++ test servers), has a concurrency story, and solid networking, crypto, and TLS support in the standard library. It is also performant enough that, where features are missing, implementing them in Go is plausible. Indeed the Go crypto and TLS implementations do not have system dependencies, so we have less of a pipeline problem.

foolip commented 5 years ago

Paging @web-platform-tests/wpt-core-team for thoughts. See also https://github.com/web-platform-tests/wpt/issues/8391, although I don't think we have a clear idea of how slow it is compared to other stacks, or why.

davidben commented 5 years ago

Oh, interesting. When I say Python is too slow for crypto, I usually mean that it's unsuitable for production uses. That may not directly affect WPT, but it has the second-order effect that the Python crypto ecosystem is largely native bindings, so WPT cannot draw on that.

But, yeah, if WPT is already slow even with native bindings for TLS, adding a pure-Python TLS with pure-Python crypto primitives will make things even worse. :-)

stephenmcgruer commented 5 years ago

I'm marking this priority:roadmap, because I think this is something we need to consider some sort of story for in 2020. The ultimate answer might be that TLS/TLS-adjacent platform features are out of our scope, or it might mean developing a better network stack for wpt, but we need to figure out what that answer is.

I believe @Hexcles mentioned having had historical thoughts on this, which they may already have written up elsewhere.

zcorpan commented 5 years ago

Switching away from Python for wptserve I think would also impact https://github.com/web-platform-tests/rfcs/pull/23 (cc @louaybassbouss)

jgraham commented 5 years ago

We had this conversation at TPAC relating to QUIC [1].

The summary of that discussion is that whilst people would like to move away from Python for various things (I'd love to rewrite some components in Rust, for example; the manifest update is a clear example of where we've reached the limits of pure-Python performance), there are significant deployment challenges with adding compiled components to the testsuite, particularly if they aren't in C/C++, since browser vendors don't uniformly support the same set of compilers (e.g. Gecko has great Rust support and afaik no support whatsoever for Go). So making this kind of change requires a very compelling story for how to distribute it so it can work on vendor infrastructure.

[1] https://www.w3.org/2019/09/17-testing-minutes.html#item05

gsnedders commented 4 years ago

Above and beyond @jgraham's comment above:

> WPT doesn't have good mechanisms for testing network-related web platform features that sit below the HTTP abstraction. In particular, TLS and TLS-adjacent features are untestable by WPT.

I think we need to have a clear definition of what we expect the scope for WPT to be.

Originally, it didn't include any network layer tests (okay, it relied on HTTP semantics in places, but certainly nothing lower level). Later we added tests for an API that was directly wedded to a protocol (WebSocket) by introducing a server for that. HTTP/2 has a very long and slow history here (needed for testing some things around Fetch), WebTransport and QUIC were raised at TPAC…

From my point of view, it's not clear that it necessarily makes sense to test anything above TCP/UDP all within one repo. As you note, we're very ill-equipped to test anything below the HTTP semantics layer, and to me it's not obvious it necessarily makes sense to include that all here. You need very different infrastructure to break apart the lower layers (and we absolutely should have shared tests there, it's just not clear to me that it makes sense to try and mold that into the infrastructure we already have here).

Of course, yes, to some degree certain error conditions from lower layers propagate up (especially with lower level APIs like WebTransport), but the thought of taking on a whole load of extra infrastructure code (given we'd presumably end up with everything above TCP/UDP in this repo) scares me given historically extra infra has been maintained by some sort of vague notion of a WPT infra team (which… doesn't actually exist) and it's not clear those currently in that team have the necessary expertise or time to maintain all that.

davidben commented 4 years ago

Error conditions is not an accurate characterization of the scope here. Random web platform features may cross the divide. For instance, the resource timing APIs need to have defined interaction with 0-RTT, which is where this most recently came up.

If the answer is that any web platform feature which interacts with modern features of the bits under HTTP is out of scope for WPT, that's a fine answer. It's how we've been operating all this time, treating the WPT aspects of launch processes as yet another translation failure when those processes are applied to network-related features. But this keeps coming up and @foolip requested I file a bug, so here is a bug. Hopefully we can get this written down once and we don't have to repeat this conversation all the time. :-)

gsnedders commented 4 years ago

FWIW, I'm not opposed to having it all in WPT, we just need to address who will maintain the requisite infrastructure and how we'll get that running across all browser CI systems.

jugglinmike commented 4 years ago

> I think we need to have a clear definition of what we expect the scope for WPT to be.
>
> Originally, [...]

Well said, @gsnedders

jgraham commented 4 years ago

I'm not so concerned about scope creep. I think the point of web-platform-tests is to have a shared testsuite for the features that are required to implement a web-compatible browser. There's no expectation that only includes things above the network layer; it's just that those were the lowest hanging fruit when the project started. Indeed one of the key design goals of wptserve was to allow writing tests which interacted with the HTTP layer in ways that wouldn't be possible in normal production servers. As the platform has exposed more of the lower layers of the stack it's only natural that we should have the same requirements there. Punting on the problem and encouraging people to build their own testsuite is bad; third party suites usually can't share infrastructure (e.g. two-way sync, wpt.fyi) and so often end up not providing the same level of ongoing quality assurance that a wpt-integrated suite might.

That said, I reiterate that whilst I don't think there are philosophical concerns with adding infrastructure to enable tests covering these features, there are more than enough practical challenges to make up the difference :)

foolip commented 4 years ago

Thanks @jgraham, that all aligns with how I see this. In principle I'd be happy for one test suite for everything that can affect web developers, but how to actually test TLS or TLS-adjacent features, I don't know.

web-platform-tests / wpt

Cannot test TLS or TLS-adjacent platform features #20159