encode / httpx

A next generation HTTP client for Python. 🦋
https://www.python-httpx.org/
BSD 3-Clause "New" or "Revised" License

HTTP/3 support. #275

Open jlaine opened 5 years ago

jlaine commented 5 years ago

It would be nice to start planning support for HTTP/3. aioquic provides a sans-I/O API for HTTP/3 similar to h2 which would make such an integration possible.

The main hurdle is that the connection model is very different from HTTP/1.1 and HTTP/2: we cannot simply establish a (reader, writer) pair over a TCP connection and pass this to the protocol layer. Between the HTTP/3 layer and the UDP transport, we have a QUIC layer which returns (datagram, addr) tuples indicating where datagrams need to be sent:

https://aioquic.readthedocs.io/en/latest/quic.html#aioquic.quic.connection.QuicConnection.datagrams_to_send

Note that the destination address may change during the connection's lifetime, for example in response to receiving a PreferredAddress from the server during the TLS handshake.
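To make the contrast with a (reader, writer) pair concrete, here is a minimal sketch of the pattern described above: a sans-I/O layer hands back (datagram, addr) tuples and the caller performs the UDP sends. `StubQuicLayer`, `queue`, `migrate` and `flush` are hypothetical names used only for illustration. Only the `datagrams_to_send` name and the tuple shape come from the aioquic documentation linked above, and the real method also takes a timestamp argument that is omitted here.

```python
import socket
from typing import List, Tuple

Address = Tuple[str, int]


class StubQuicLayer:
    """Hypothetical stand-in for a sans-I/O QUIC connection (not aioquic itself)."""

    def __init__(self, peer: Address) -> None:
        self._peer = peer
        self._outbox: List[Tuple[bytes, Address]] = []

    def queue(self, payload: bytes) -> None:
        # The protocol layer, not the caller, decides where each datagram goes.
        self._outbox.append((payload, self._peer))

    def migrate(self, new_peer: Address) -> None:
        # Models the destination changing mid-connection, for example after
        # the server advertises a preferred address.
        self._peer = new_peer

    def datagrams_to_send(self) -> List[Tuple[bytes, Address]]:
        # Hand over everything queued so far and reset the outbox.
        pending, self._outbox = self._outbox, []
        return pending


def flush(sock: socket.socket, layer: StubQuicLayer) -> int:
    """Send every pending datagram to the address the layer chose."""
    sent = 0
    for data, addr in layer.datagrams_to_send():
        sock.sendto(data, addr)
        sent += 1
    return sent
```

The point of the sketch is that the transport cannot assume a fixed peer: the address travels with each datagram, so a migration needs no change in the sending code.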

In order to provide a first demonstration, @tomchristie suggested I write a dispatcher which makes use of aioquic's HTTP/3 support. The code is still a bit rough, but it works:

https://github.com/aiortc/aioquic/blob/master/examples/httpx_client.py

This can be run as:

python examples/httpx_client.py https://cloudflare-quic.com/

See also https://github.com/encode/httpcore/issues/173

jlaine commented 5 years ago

I'm not too sure how to move this forward. Ideally I'd like to have a WIP pull request which we can rebase / amend until we're happy with the result. However I haven't really got a feel for how the code should integrate with httpx's connection management. Any hints on how to approach this problem?

sethmlarson commented 5 years ago

We'll need to add UDP support to our backend interface, that's definitely step 1.

After that we need to see how we want to implement client storage so we can remember things like Alt-Svc headers between requests.

Next would be landing an HTTP/3 specifier along with the H3Connection dispatcher. I'm not sure how compatible aioquic is with an SSLContext object (probably not at all?) so that will take some tweaking. And because HTTP/3 is known to not work in some routes we'll have to implement "fallback to TCP and forget the H3 alt svc" on the HTTPConnection object somehow.
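As a rough illustration of the "client storage" and "forget the H3 alt svc" ideas, the sketch below parses an Alt-Svc header value and keeps the `h3` entry per origin until it expires or is explicitly forgotten. `parse_alt_svc` and `AltSvcStore` are hypothetical names, not httpx APIs. The parser is deliberately simplified (it ignores quoting edge cases and every parameter except `ma`), and the 24 hour default freshness is taken from RFC 7838.

```python
import time
from typing import Callable, Dict, Optional, Tuple

Origin = Tuple[str, str, int]  # (scheme, host, port)


def parse_alt_svc(value: str) -> Dict[str, Tuple[str, int, int]]:
    """Parse an Alt-Svc value into {protocol: (host, port, max_age_seconds)}."""
    services: Dict[str, Tuple[str, int, int]] = {}
    if value.strip().lower() == "clear":
        return services
    for entry in value.split(","):
        if not entry.strip():
            continue
        parts = [part.strip() for part in entry.split(";")]
        protocol, _, authority = parts[0].partition("=")
        host, _, port = authority.strip('"').rpartition(":")
        max_age = 86400  # default freshness when `ma` is absent (24 hours)
        for param in parts[1:]:
            name, _, raw = param.partition("=")
            if name.strip().lower() == "ma":
                max_age = int(raw)
        services[protocol.strip()] = (host, int(port), max_age)
    return services


class AltSvcStore:
    """Hypothetical per-client memory of HTTP/3 alternatives."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._entries: Dict[Origin, Tuple[str, int, float]] = {}

    def remember(self, origin: Origin, header_value: str) -> None:
        h3 = parse_alt_svc(header_value).get("h3")
        if h3 is None:
            # Assumption: a header without an h3 entry replaces what we knew.
            self._entries.pop(origin, None)
            return
        host, port, max_age = h3
        # An empty host in the header means "same host as the origin".
        self._entries[origin] = (host or origin[1], port, self._clock() + max_age)

    def lookup(self, origin: Origin) -> Optional[Tuple[str, int]]:
        entry = self._entries.get(origin)
        if entry is None:
            return None
        host, port, expires = entry
        if self._clock() >= expires:
            del self._entries[origin]
            return None
        return host, port

    def forget(self, origin: Origin) -> None:
        # Called when a QUIC attempt fails and we fall back to TCP.
        self._entries.pop(origin, None)
```

A connection pool could call `lookup()` before connecting, attempt HTTP/3 when it returns an address, and call `forget()` if the UDP route turns out not to work.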

Anything I'm missing @encode/httpx-maintainers?

florimondmanca commented 5 years ago

I don't know enough about HTTP/3 to tell (for example, I'm not sure what the differences between QUIC and HTTP/3 and the associated aioquic APIs are?), but the approach highlighted by @sethmlarson sounds sensible to me.

One thing I'm sure of is that we'll need to chunk this into multiple PRs, otherwise we risk ending up with a 1000 LOC+ PR that's a pain for everyone to look at. That way we can more easily track progress too.

We'll need to add UDP support to our backend interface, that's definitely step 1.

Yup, and UDP support is definitely going to be a big chunk.

HTTPX is currently very much coupled to TCP. In particular, right now a given HTTPConnection doesn't know about other connections, but from what you've said it seems like a QUIC "connection" needs to know about the surrounding connections to do packet routing (though it might be a defining property of UDP?), correct? Anyway, there'll be some refactoring needed before supporting UDP to clarify what is TCP-specific and what is not.

After that we need to see how we want to implement client storage so we can remember things like Alt-Svc headers between requests.

I think this should be an improvement over a basic design, right? (Under the hypothesis that remembering Alt-Svc headers will only allow us to skip the protocol switching phase of establishing an HTTP/3 connection.)

So, in terms of planning, I tried to think about the various steps in a bit more detail. I could see something like:

  1. Rename BaseStream / Stream / ConcurrencyBackend.connect() to BaseTCPStream / TCPStream / ConcurrencyBackend.connect_tcp().
  2. Add UDP support. This will most likely include building a UDP version of TCPStream and the associated methods on ConcurrencyBackend. At this point, the resulting updated ConcurrencyBackend (with an asyncio implementation) can be unit-tested against a very basic asyncio UDP server.
  3. Add httpx.dispatch.http3.HTTP3Connection, an HTTP/3 implementation via aioquic. To reduce the scope of this step, we can set some constraints:
    • HTTPConnection.connect() should try to connect via HTTP/3 and create a UDPStream / H3Connection if and only if HTTP/3 is one of the given http_versions (e.g. Client(http_versions=["HTTP/3"])). This way, we don't need to think about Alt-Svc and the switch between TCP and UDP yet.
    • All QUIC connections can be stored on H3Connection directly (similar to how we have one h11.H1Connection on our H1Connection class), without thinking about sharing those connections yet (provided we need to share those at all?).
  4. Add Alt-Svc support: always connect via TCP first, and only if Alt-Svc is received then establish a new H3Connection. (Will this require modifying HTTP11Connection and HTTP2Connection so that they are able to process the Alt-Svc header in the response? Is raising an exception in HTTP11Connection.send() and HTTP2Connection.send() and catching it in Connection.send() a sensible enough way of signalling we should reconnect via UDP-HTTP/3?)
  5. Optimize connection making by remembering Alt-Svc headers.
  6. Implement sharing of QUIC connections between HTTPConnections, if relevant?
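Step 2 of the plan above could look roughly like the following asyncio-only sketch: a small UDP stream wrapper, a `connect_udp()` helper, and a basic echo server to test against. The class and function bodies are assumptions for illustration. The names `UDPStream` and `connect_udp` are borrowed from this thread, but nothing here reflects an actual httpx implementation.

```python
import asyncio
from typing import Tuple

Address = Tuple[str, int]


class _QueueProtocol(asyncio.DatagramProtocol):
    """Buffers incoming datagrams so they can be awaited one at a time."""

    def __init__(self) -> None:
        self.queue: "asyncio.Queue[Tuple[bytes, Address]]" = asyncio.Queue()

    def datagram_received(self, data: bytes, addr: Address) -> None:
        self.queue.put_nowait((data, addr))


class UDPStream:
    """Hypothetical UDP counterpart of a TCP stream wrapper."""

    def __init__(self, transport: asyncio.DatagramTransport, protocol: _QueueProtocol) -> None:
        self._transport = transport
        self._protocol = protocol

    async def send(self, data: bytes) -> None:
        # The socket is connected to a remote address, so no addr is passed.
        self._transport.sendto(data)

    async def receive(self, timeout: float = 2.0) -> Tuple[bytes, Address]:
        return await asyncio.wait_for(self._protocol.queue.get(), timeout)

    def close(self) -> None:
        self._transport.close()


async def connect_udp(host: str, port: int) -> UDPStream:
    loop = asyncio.get_running_loop()
    transport, protocol = await loop.create_datagram_endpoint(
        _QueueProtocol, remote_addr=(host, port)
    )
    return UDPStream(transport, protocol)


class _EchoProtocol(asyncio.DatagramProtocol):
    """The 'very basic asyncio UDP server': echoes every datagram back."""

    def connection_made(self, transport: asyncio.BaseTransport) -> None:
        self.transport = transport

    def datagram_received(self, data: bytes, addr: Address) -> None:
        self.transport.sendto(data, addr)


async def roundtrip(payload: bytes) -> bytes:
    loop = asyncio.get_running_loop()
    server, _ = await loop.create_datagram_endpoint(
        _EchoProtocol, local_addr=("127.0.0.1", 0)
    )
    host, port = server.get_extra_info("sockname")[:2]
    stream = await connect_udp(host, port)
    try:
        await stream.send(payload)
        data, _ = await stream.receive()
        return data
    finally:
        stream.close()
        server.close()
```

Running `asyncio.run(roundtrip(b"ping"))` sends one datagram to the local echo server and returns what came back, which is about the level of unit test the plan describes for this step.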

Some resources I've come across:

sethmlarson commented 5 years ago

Lots of great info @florimondmanca!! 🎉

I'm going to chime in here again at the idea of using AnyIO and adding our stream interface to that?

florimondmanca commented 5 years ago

I'm going to chime in here again at the idea of using AnyIO and adding our stream interface to that?

As in, using AnyIO (see #296) to provide a ConcurrencyBackend.connect_udp() implementation for all backends on top of their UDP sockets API? I'm not against the idea but it might be a bit too much for this particular issue. It could be a preliminary step but we haven't decided yet on whether we should switch over to AnyIO.

Edit: also, since AnyIO mandates the same "strict context management everywhere" approach as Trio, it's highly probable we'd need to refactor some internals to comply with that requirement. One item in particular is the Stream interface, which trio/AnyIO primarily expose as a context manager, although it's possible to .aclose() manually.
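The two styles of stream lifetime management mentioned above can be compared with a small stand-in class. `SketchStream` is hypothetical and uses only the standard library. It is not the Trio or AnyIO stream type, just an object that supports both `async with` and a manual `.aclose()`.

```python
import asyncio


class SketchStream:
    """Hypothetical stream that can be closed either way."""

    def __init__(self) -> None:
        self.closed = False

    async def aclose(self) -> None:
        self.closed = True

    async def __aenter__(self) -> "SketchStream":
        return self

    async def __aexit__(self, exc_type, exc, tb) -> None:
        await self.aclose()


async def scoped() -> bool:
    # Context-managed style: the stream cannot outlive the block.
    async with SketchStream() as stream:
        pass
    return stream.closed


async def manual() -> bool:
    # Manual style: the caller is responsible for reaching aclose().
    stream = SketchStream()
    try:
        pass
    finally:
        await stream.aclose()
    return stream.closed
```

Both functions end with the stream closed. The difference is that the first form ties the stream's lifetime to a lexical scope, which is harder to reconcile with connections that are kept open and reused across requests.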

jlaine commented 5 years ago

HTTPX is currently very much coupled to TCP. In particular, right now a given HTTPConnection doesn't know about other connections, but from what you've said it seems like a QUIC "connection" needs to know about the surrounding connections to do packet routing (though it might be a defining property of UDP?), correct? Anyway, there'll be some refactoring needed before supporting UDP to clarify what is TCP-specific and what is not.

I don't see why the different connections should have any kind of coupling. The most straightforward approach is going to be opening a distinct UDP socket for every QUIC/HTTP3 connection so there will be a one-to-one mapping between socket and QUIC/HTTP3 connection.

At a high level, an HTTP3 connection is going to be very similar to an HTTP2 connection: you can run any number of requests on top of your connection.
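A minimal sketch of that one-to-one model: each connection object owns its own UDP socket and hands out stream IDs for the requests multiplexed on top of it. `ConnectionSketch` and `open_request_stream` are hypothetical names. The only protocol fact relied on is that client-initiated bidirectional QUIC streams use IDs 0, 4, 8, and so on (RFC 9000). Binding to loopback just keeps the example self-contained.

```python
import socket
from typing import Tuple

Address = Tuple[str, int]


class ConnectionSketch:
    """Hypothetical QUIC/HTTP3 connection that owns exactly one UDP socket."""

    def __init__(self, peer: Address) -> None:
        self.peer = peer
        # One socket per connection: no routing between connections needed.
        self.sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        self.sock.bind(("127.0.0.1", 0))
        self._next_stream_id = 0

    @property
    def local_port(self) -> int:
        return self.sock.getsockname()[1]

    def open_request_stream(self) -> int:
        # Each request gets its own stream on the shared connection,
        # much like HTTP/2 multiplexing.
        stream_id = self._next_stream_id
        self._next_stream_id += 4
        return stream_id

    def close(self) -> None:
        self.sock.close()
```

Because the operating system already demultiplexes datagrams by local port, two such connections never need to know about each other.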

I think this should be an improvement over a basic design, right? (Under the hypothesis that remembering Alt-Svc headers will only allow us to skip the protocol switching phase of establishing an HTTP/3 connection.)

HTTP3 support has landed in cURL and more recently in Chrome/canary so I'd suggest looking into how they handle Alt-Svc.

Some resources I've come across:

You might want to add aioquic's demo HTTP client:

https://github.com/aiortc/aioquic/blob/master/examples/http3_client.py

cdeler commented 4 years ago

I tried to follow the guide written by @tomchristie above.

The first step was

Rename BaseStream / Stream / ConcurrencyBackend.connect() to BaseTCPStream / TCPStream / ConcurrencyBackend.connect_tcp().

It has been implemented in https://github.com/encode/httpx/pull/339. Having found the PR and its changes, I looked around and found that none of BaseTCPStream, TCPStream, or ConcurrencyBackend is present in the httpx/httpcore repos any more. Moreover, the file structure of the repo has changed significantly since the PR.

@florimondmanca, @tomchristie Do you have time to advise us on where the process should start? Should it be some sort of new connection in httpcore._async/httpcore._sync with the aioquic backend?

Update: Perhaps the first step might be to add "open_quic_connection" to the existing backends?

tomchristie commented 4 years ago

@cdeler Well, it's about as complex as you could get for a contribution, but if you're up for it then I can certainly put in the guidance for a sensible way to approach it.

The keyword here, as with anything like this, will be incremental. 😀

At that point we've got a stub behaviour for detecting HTTP/3 support, which we can start to iterate on.

cdeler commented 4 years ago

The keyword here, as with anything like this, will be incremental. 😀

it's smart behaviour to make incremental changes, I'm happy to try doing that

tomchristie commented 4 years ago

Slight update - let's scratch the second part of that, actually I think we'll want to use the HTTP/2 frame-type ALTSVC to detect if we should upgrade or not.

We'll use that because we don't want to wait for response headers before we decide on if we should upgrade. The ALTSVC frame can be sent during the opening handshake, before the request itself is made. We'll probably need to end up doing some investigation into exactly when implementations choose to send this frame, before we're able to proceed from there.
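For reference, the ALTSVC frame is defined in RFC 7838 as HTTP/2 frame type 0xa, with a payload made of a 16-bit origin length, the origin, and the Alt-Svc field value. The sketch below builds and parses such a frame from raw bytes using only `struct`. `AltSvcFrame`, `build_altsvc_frame` and `parse_altsvc_frame` are hypothetical helper names, and this is not how any particular HTTP/2 library exposes the frame.

```python
import struct
from typing import NamedTuple

ALTSVC_FRAME_TYPE = 0x0A  # RFC 7838, section 4
FRAME_HEADER = "!BHBBI"   # 24-bit length (split 8+16), type, flags, stream id


class AltSvcFrame(NamedTuple):
    stream_id: int
    origin: str
    field_value: str


def build_altsvc_frame(stream_id: int, origin: str, field_value: str) -> bytes:
    origin_bytes = origin.encode("ascii")
    payload = struct.pack("!H", len(origin_bytes)) + origin_bytes
    payload += field_value.encode("ascii")
    length = len(payload)
    header = struct.pack(
        FRAME_HEADER,
        length >> 16,
        length & 0xFFFF,
        ALTSVC_FRAME_TYPE,
        0,  # ALTSVC defines no flags
        stream_id & 0x7FFFFFFF,  # top bit is reserved
    )
    return header + payload


def parse_altsvc_frame(frame: bytes) -> AltSvcFrame:
    if len(frame) < 9:
        raise ValueError("truncated frame header")
    high, low, frame_type, _flags, stream_id = struct.unpack(FRAME_HEADER, frame[:9])
    length = (high << 16) | low
    if frame_type != ALTSVC_FRAME_TYPE:
        raise ValueError("not an ALTSVC frame")
    payload = frame[9:9 + length]
    if len(payload) != length or length < 2:
        raise ValueError("truncated ALTSVC payload")
    (origin_len,) = struct.unpack("!H", payload[:2])
    if 2 + origin_len > length:
        raise ValueError("origin length exceeds payload")
    origin = payload[2:2 + origin_len].decode("ascii")
    field_value = payload[2 + origin_len:].decode("ascii")
    return AltSvcFrame(stream_id & 0x7FFFFFFF, origin, field_value)
```

The field value has the same syntax as the Alt-Svc response header, so whatever parses the header can be reused for the frame.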

tomchristie commented 4 years ago

It's looking to me like httpx should never end up making an HTTP/3 request on an initial outgoing request, because either:

So I think the best we'll be able to do is storing altsvc information whenever it comes through, and potentially making subsequent requests over HTTP/3 using that information.

cdeler commented 4 years ago

@tomchristie

Have you seen this example in the aioquic repo?

florimondmanca commented 4 years ago

@cdeler Sure, it was posted by Jeremy (OP) in the issue description. :-) Obviously looks outdated by now since some bits of HTTPX API have changed, but I'm sure it's been discussed earlier in this thread?

sla-te commented 3 years ago

Any way to make this work again? Currently getting NotImplementedError

jlaine commented 3 years ago

I've submitted https://github.com/aiortc/aioquic/pull/203 to update the httpx demo for recent httpcore versions.

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

unixfox commented 2 years ago

Bump still important.

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

unixfox commented 2 years ago

Bump, still relevant.

ChronoBrake commented 2 years ago

Any updates?

crypt0miester commented 2 years ago

bump

sla-te commented 2 years ago

+

tomchristie commented 2 years ago

So, I'm absolutely in favour of someday having support for HTTP/3, yup.

But also... "Any updates?", "bump", "+" aren't valuable in any way.

More interesting would be illustrations of genuine motivations for wanting HTTP/3 support. What context are you using it in & why do you want a client that supports it? I'm interested in cases that go beyond "I want it because it exists".

unixfox commented 2 years ago

More interesting would be illustrations of genuine motivations for wanting HTTP/3 support. What context are you using it in & why do you want a client that supports it? I'm interested in cases that go beyond "I want it because it exists".

Well at SearXNG we would want it in order to decrease the loading time when fetching some data from a website that support HTTP3. This would make the experience much snappier for the users of SearXNG.

crypt0miester commented 2 years ago

So, I'm absolutely in favour of someday having support for HTTP/3, yup.

But also... "Any updates?", "bump", "+" aren't valuable in any way.

More interesting would be illustrations of genuine motivations for wanting HTTP/3 support. What context are you using it in & why do you want a client that supports it? I'm interested in cases that go beyond "I want it because it exists".

I am a solana dev. they just announced that they are using quic as an option.

I am a contributor to the solana python library. I also build stuff on top of that.

I tried using aioquic. it works, but the latency is worse than using a regular http/1.1 using httpx.

~1.4s on aioquic, and ~0.7s on httpx. (for two concurrent requests, from initial request time until closing the connection.)

I don't know if the reason is because aioquic has to bind to the raw ip socket first.

tomchristie commented 2 years ago

I tried using aioquic. it works, but the latency is worse than using a regular http/1.1 using httpx.

This doesn't surprise me. My expectation is that in Python, using HTTP/3 will generally be slower, because of the increased complexity.

I can get that it might be useful for us to include for general debugging/tooling purposes, tho.

crypt0miester commented 2 years ago

I tried using aioquic. it works, but the latency is worse than using a regular http/1.1 using httpx.

This doesn't surprise me. My expectation is that in Python, using HTTP/3 will generally be slower, because of the increased complexity.

I can get that it might be useful for us to include for general debugging/tooling purposes, tho.

once a connection has been established though, it works amazingly. bulk requests is excellent using quic.

jlaine commented 2 years ago

I tried using aioquic. it works, but the latency is worse than using a regular http/1.1 using httpx.

This doesn't surprise me. My expectation is that in Python, using HTTP/3 will generally be slower, because of the increased complexity.

I can get that it might be useful for us to include for general debugging/tooling purposes, tho.

This does surprise me, measurements I made show good connection setup time and throughput against most servers.

crypt0miester commented 2 years ago

I tried using aioquic. it works, but the latency is worse than using a regular http/1.1 using httpx.

This doesn't surprise me. My expectation is that in Python, using HTTP/3 will generally be slower, because of the increased complexity.

I can get that it might be useful for us to include for general debugging/tooling purposes, tho.

This does surprise me, measurements I made show good connection setup time and throughput against most servers.

were your servers local or on the cloud handling hundreds, possibly thousands of requests per second sir? the post requests made is to a solana rpc.

perhaps I am doing something wrong.

jlaine commented 2 years ago

were your servers local or on the cloud handling hundreds, possibly thousands of requests per second sir? the post requests made is to a solana rpc.

perhaps I am doing something wrong.

The servers were distant. The results I mention are runs of aioquic's interop suite against a variety of servers, and the acceptance criterion is that the time to download 5MB and 10MB files over HTTP/3 must be no more than 10% over the time for HTTP/1.1 or HTTP/2 (using httpx). I'm not sure the server's load is relevant here as we are talking about client performance.
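The 10% criterion is simple arithmetic, sketched below. `overhead` and `within_budget` are hypothetical helper names and are not part of aioquic's interop suite.

```python
def overhead(candidate_seconds: float, baseline_seconds: float) -> float:
    """Relative slowdown of the candidate versus the baseline (0.10 == 10%)."""
    if baseline_seconds <= 0:
        raise ValueError("baseline must be positive")
    return (candidate_seconds - baseline_seconds) / baseline_seconds


def within_budget(
    candidate_seconds: float, baseline_seconds: float, tolerance: float = 0.10
) -> bool:
    """True if the candidate is at most `tolerance` slower than the baseline."""
    return overhead(candidate_seconds, baseline_seconds) <= tolerance
```

Applied to the figures reported earlier in the thread (about 1.4s over aioquic against about 0.7s over HTTP/1.1), the overhead is roughly 100%, far outside a 10% budget. That is consistent with those measurements being dominated by something other than steady-state transfer time, such as connection setup.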

I don't think you're doing anything wrong, there are definitely some reports of initial connection time outliers using aioquic, which can kill the measured performance. I haven't managed to debug these cases yet but I do believe they are fixable.

tomchristie commented 2 years ago

This does surprise me, measurements I made show good connection setup time and throughput against most servers.

That's a positive snippet of info.

I should be more precise in what I mean to say here. I wouldn't assume that HTTP/3 for httpx would necessarily be faster. It'll have different performance characteristics, which may be positive in some cases, and negative in others. That's certainly what I've seen with HTTP/1.1 -> HTTP/2.

crypt0miester commented 2 years ago

thanks team for your efforts.

may I ask if we should expect something? or shall I continue with aioquic?

tomchristie commented 2 years ago

You'll see updates here if/when someone starts working on integrating HTTP/3 support into httpx.

I expect it'll happen sometime. I'm up for supervising a pull request to get the support in, if someone wants to take it on.

zanieb commented 2 years ago

I'm interested in trying to move this forward, but I'm still a beginner in these code bases.

The summary at https://github.com/encode/httpx/issues/275#issuecomment-531349234 seems the most thorough.

Am I correct in understanding that the next step is to detect Alt-Svc in responses and upgrade to HTTP 3? per

Add Alt-Svc support: always connect via TCP first, and only if Alt-Svc is received then establish a new H3Connection

It seems like you had been originally thinking that there should be support for a http3 boolean but it seems your thinking has aligned more with the above approach now?

tomchristie commented 2 years ago

It seems like you had been originally thinking that there should be support for a http3 boolean but it seems your thinking has aligned more with the above approach now?

Nope. We'd have an http3 boolean and only switch over if it's enabled, and if there's an Alt-Svc response.
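The rule stated above (switch only when the flag is on and the server advertises HTTP/3) can be written as a small decision function. `choose_protocol` is a hypothetical name and the header handling is simplified: it only checks whether an `h3` entry is present in the Alt-Svc value.

```python
from typing import Optional


def choose_protocol(http3_enabled: bool, alt_svc: Optional[str]) -> str:
    """Return "h3" only if HTTP/3 is enabled AND Alt-Svc advertises it."""
    if not http3_enabled or not alt_svc:
        return "tcp"
    for entry in alt_svc.split(","):
        # Each entry looks like: protocol="host:port"; param=value
        protocol = entry.split(";", 1)[0].partition("=")[0].strip()
        if protocol == "h3":
            return "h3"
    return "tcp"
```

Under this rule the first request to an origin always goes over TCP, since no Alt-Svc information has been seen yet.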

However, I'd probably suggest starting this off from a different direction...

Start by testing out @jlaine's httpx client example. - https://github.com/aiortc/aioquic/blob/main/examples/httpx_client.py - is it still working and up to date? I could give some further guidance once we know that's all good and current.

zanieb commented 2 years ago

It was not working, but it is now in https://github.com/aiortc/aioquic/pull/333 or https://github.com/aiortc/aioquic/pull/314 (whether it's truly all good will take a more thorough suite of tests).

zanieb commented 2 years ago

https://github.com/aiortc/aioquic/pull/314 is approved and merged, so we have a working example to work from if you want to continue to provide guidance.

karpetrosyan commented 11 months ago

I have opened a pull request that adds HTTP3 support. If you are interested, I will appreciate any suggestions and reviews.

PR: https://github.com/encode/httpcore/pull/829