python-web-sig / wsgi-ng

Working group for wsgi-ng

Upgrading sessions (Websockets, BOSH, Icecast2 ...) #2

Closed unbit closed 9 years ago

unbit commented 9 years ago

A bunch of new standards have popped up in recent years that require the HTTP session to be 'upgraded' to transport non-HTTP data. For example, websockets use the TCP connection opened by the HTTP request to transport raw framed data.

In the same way, Icecast2 will receive streamed audio/video content from the client during a SOURCE/PUT request.

The various WSGI implementations already have their own solutions, either as middleware (like those available for gunicorn and the gevent WSGI server) or embedded directly in the server and exposed as an API to the app (as in uWSGI). Both approaches are server-specific, which is bad for standardization.

I am against implementing a WSGI standard for each new protocol, so my proposal is simply to expose the low-level socket, as PSGI (the WSGI of the Perl world) does with the psgix.io extension (https://github.com/plack/psgi-specs/blob/master/PSGI/Extensions.pod). This way middleware or apps can use their own implementation independently of the server. (Obviously servers will continue to expose high-performance APIs, but higher-level implementations can be developed in server-independent ways.)

Eventually, extensions to the standard can be proposed for each protocol, but the core should not take them into account.
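A minimal sketch of what the socket-exposure idea could look like from the application side. The environ key `wsgix.io` and the `echo` upgrade token are assumptions made up for illustration (modelled on `psgix.io`); nothing in WSGI defines them.

```python
import socket

# Hypothetical environ key, modelled on PSGI's psgix.io. Not part of any WSGI spec.
RAW_SOCKET_KEY = "wsgix.io"


def app(environ, start_response):
    sock = environ.get(RAW_SOCKET_KEY)
    wants_upgrade = environ.get("HTTP_UPGRADE", "").lower() == "echo"

    if sock is None or not wants_upgrade:
        # Server does not expose the socket, or the client did not ask to
        # upgrade: answer as an ordinary WSGI application.
        body = b"plain HTTP response\n"
        start_response("200 OK", [("Content-Type", "text/plain"),
                                  ("Content-Length", str(len(body)))])
        return [body]

    # From here on the app owns the connection and speaks its own protocol
    # (a toy "echo" protocol invented for this example).
    sock.sendall(b"HTTP/1.1 101 Switching Protocols\r\n"
                 b"Upgrade: echo\r\nConnection: Upgrade\r\n\r\n")
    sock.sendall(sock.recv(1024))
    return []
```

A server that cannot hand out the socket simply omits the key, and the app falls back to plain HTTP.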

defnull commented 9 years ago

I am against exposing the low level socket for HTTP/2 connections. If we keep the abstraction level of WSGI, an application is called once for each request. At the time a WSGI/2 application is called, we already have an established HTTP/2 stream. Upgrading protocols is no longer an option (see 8.1.1).

unbit commented 9 years ago

We can 'suggest' to not expose the socket for HTTP/2 streams, or that if you decide to use it you are on your own (effectively you are on your own whenever you start messing at a lower level). But IMHO we need a way for HTTP/1.x to upgrade in a "standard" way.

rbtcollins commented 9 years ago

There are a few things I think Upgrade could possibly mean in the context of WSGI.

It might mean

a) "servers have to be able to serve multiple protocols on one port without violating the spec."

or b) "middleware and/or applications need to be able to implement protocols themselves"

or c) "middleware/apps need to be able to be called for any 'request'[*] regardless of protocol (but they need to make their response meaningful in that protocol)"

or d) apps should be able to determine the response protocol used themselves

These things are orthogonal, I think: a & b would be legitimate, as would a & b & c, and a & b & c & d.

But - which of these things do we think are both necessary and not an unreasonable burden on containers such as mod_wsgi?

Implementing arbitrary protocols by granting a handle to the socket itself seems like a risky move in terms of the impact on containers.

Here's my take:

A) seems like a necessary thing - it's already in the discussion on the list.

B) seems like something that would be great for future proofing, but potentially very risky in impact on containers [subject to feedback from a container maintainer]; I wouldn't want to make it mandatory, nor have any of the core HTTP/1.0/1.x/2.0 or websockets protocols implemented via it.

C) seems like a necessary thing - otherwise it wouldn't be possible to build e.g. 'routes' for websockets.

D) doesn't seem necessary to me - and in fact it doesn't fit terribly well on the HTTP server side, since the client is what requests an upgrade; protocol negotiation has taken place long before current codepaths hit WSGI. D would imply B, I think.

So I propose that we make A and C design points: servers can serve multiple protocols, apps need to be able to tell what the protocol is (perhaps in a better way than SERVER_PROTOCOL - or perhaps we say that SERVER_PROTOCOL is sufficient), and the behaviour of the response from the app needs to be able to depend on the protocol type. E.g. what we get back in response to a websockets app call vs a HTTP/1.{0,1} call vs a HTTP/2 call might differ.
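As a rough illustration of design points A and C, an app could branch on the protocol it is called for. This sketch assumes SERVER_PROTOCOL is the signal and that a websocket call would carry the value `"websocket"`; both that value and the shape of the websocket return value are hypothetical, since the thread has not settled them.

```python
def handle_http(environ, start_response):
    body = b"hello over " + environ["SERVER_PROTOCOL"].encode("ascii")
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [body]


def handle_websocket(environ, start_response):
    # Hypothetical: here the "response" is an iterable of messages rather
    # than an HTTP body, and start_response is not used at all.
    return [b"hello as a websocket message"]


HANDLERS = {
    "HTTP/1.0": handle_http,
    "HTTP/1.1": handle_http,
    "HTTP/2": handle_http,
    "websocket": handle_websocket,  # assumed value, not standardised
}


def app(environ, start_response):
    handler = HANDLERS.get(environ.get("SERVER_PROTOCOL"))
    if handler is None:
        start_response("505 HTTP Version Not Supported",
                       [("Content-Type", "text/plain")])
        return [b"unsupported protocol\n"]
    return handler(environ, start_response)
```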

What do you think?

rbtcollins commented 9 years ago

Working through my suggestion with respect to the specific protocols given:

websockets

icecast2

HTTP/2

unbit commented 9 years ago

So, basically WSGI-ng should not require any particular low-level access for the HTTP/1.1 and HTTP/2 protocols. That seems a reasonable choice, but we still need it for websockets (unless we define a standard for them) and all the funny protocols that will pop up in the future.

rbtcollins commented 9 years ago

I'm proposing we define a standard for websockets.

unbit commented 9 years ago

Ok, but take into account that websockets usage is often highly optimized (zero-copy, buffer re-use and so on). uWSGI, as an example, offers a pretty big API: http://uwsgi-docs.readthedocs.org/en/latest/WebSockets.html

Ending up with just handshake + send + recv could be limiting in some scenarios, like gaming.

rbtcollins commented 9 years ago

I think you mean 'the API we offer can't be just handshake/send/recv, because high performance'. More specifically the page you linked has:

uwsgi.websocket_handshake([key, origin, proto])
uwsgi.websocket_recv()
uwsgi.websocket_recv_nb()
uwsgi.websocket_send(msg)
uwsgi.websocket_send_binary(msg)
uwsgi.websocket_send_from_sharedarea(id, pos)
uwsgi.websocket_send_binary_from_sharedarea(id, pos)

Which can be written a few different ways - e.g.:

uwsgi.websocket_handshake([key, origin, proto])
uwsgi.websocket_recv(timeout=0)
uwsgi.websocket_send(msg, binary=False)
uwsgi.websocket_send_from_sharedarea(id, pos, binary=False)

Either way, AFAICT it's missing a close() primitive. And send_from_sharedarea is the zero-copy implementation?
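To make the condensed shape concrete, here is a toy in-memory stand-in with `recv(timeout)`, `send(msg, binary=False)` and an explicit `close()`. The class and its behaviour are invented for illustration and are not uWSGI's API; a real server would block in `recv` for up to `timeout` rather than return immediately.

```python
from collections import deque


class InMemoryWebSocket:
    """Hypothetical connection object, backed by lists instead of a socket."""

    def __init__(self, incoming=()):
        self._incoming = deque(incoming)
        self.sent = []
        self.closed = False

    def recv(self, timeout=None):
        # Toy behaviour: never blocks, returns None when nothing is queued.
        if self.closed:
            raise ConnectionError("recv on closed websocket")
        return self._incoming.popleft() if self._incoming else None

    def send(self, msg, binary=False):
        if self.closed:
            raise ConnectionError("send on closed websocket")
        self.sent.append((msg, binary))

    def close(self):
        self.closed = True


def echo_app(ws):
    """Echo every queued message back as a binary frame, then close."""
    try:
        while True:
            msg = ws.recv(timeout=0)
            if msg is None:
                break
            ws.send(msg, binary=True)
    finally:
        ws.close()
```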

Did you consider using the buffer protocol (https://docs.python.org/3/c-api/buffer.html#bufferobjects) - supported back to 2.6 / 2.7 ? With that if msg supports the buffer protocol zerocopy can take place transparently, and we could consider supporting the readinto method to provide zerocopy reads too.
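A small sketch of the buffer-protocol idea using only the standard `socket` module: a `memoryview` slice is sent without first copying it into a new bytes object, and `recv_into` fills a caller-supplied buffer. The helper names are made up for this example, and whether a given server can avoid copies further down the stack is a separate question.

```python
import socket


def send_slice(sock, buffer, start, stop):
    """Send buffer[start:stop] without building an intermediate bytes copy."""
    view = memoryview(buffer)[start:stop]  # slicing a memoryview does not copy
    sock.sendall(view)


def recv_exactly_into(sock, buffer):
    """Fill a writable buffer in place; return the number of bytes received."""
    view = memoryview(buffer)
    received = 0
    while received < len(view):
        count = sock.recv_into(view[received:])
        if count == 0:  # peer closed the connection early
            break
        received += count
    return received
```

For example, with `left, right = socket.socketpair()`, calling `send_slice(left, frame, 4, len(frame))` skips a 4-byte header already sitting in `frame`, and `recv_exactly_into(right, target)` writes straight into a preallocated `bytearray`.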

unbit commented 9 years ago

It would be a dream if this new WSGI thing could support the buffer protocol. uWSGI already supports it when a specific flag is enabled. The feature caused a lot of issues for some third-party middleware but, on the other side, simplified a lot of scenarios (which is why users asked for it). The close() primitive is missing because the connection is implicitly closed when the application callable returns.

rbtcollins commented 9 years ago

According to a recent python-dev thread (http://code.activestate.com/lists/python-dev/132839/) there's some groundwork needed to enable this. In the interests of not boiling the ocean, I suggest we exclude that work from this effort - but let's point interested folk at that work, and if/when it's done we can iterate to include it (perhaps as a standardised extension).

rbtcollins commented 9 years ago

re: the close primitive: the application callable returning isn't sufficient to free resources in WSGI in general - see PEP-342. Or perhaps I'm not understanding something about the specific way this code is thunked into.
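A short sketch of why the callable returning is not the same as resources being released. When the app is a generator, calling it runs none of its body; cleanup only happens once the server calls `close()` on the returned iterable, which PEP 3333 requires servers to do when the method is present. The `make_app` and `serve_first_chunk` names are made up for the example.

```python
def make_app(events):
    def app(environ, start_response):
        events.append("acquire")  # stands in for opening a resource
        try:
            start_response("200 OK", [("Content-Type", "text/plain")])
            yield b"chunk 1"
            yield b"chunk 2"
        finally:
            events.append("release")  # runs on exhaustion or on close()
    return app


def serve_first_chunk(app):
    """Toy server loop that stops early, e.g. because the client went away."""
    result = app({}, lambda status, headers: None)
    try:
        return next(iter(result))
    finally:
        close = getattr(result, "close", None)
        if close is not None:
            close()  # triggers the generator's finally block
```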

Coming back to the issue at hand, we need draft language in the repository to cover websockets. BOSH too is packetised but differently (it encodes in html, and its unit of transport is a 'message'). Unlike websockets, we can expect multiple front end HTTP requests which need to be routed to the same WSGI app. So that implies a bunch of separate issues. To keep things from becoming buried in long history, I'm going to close this issue, and open:

Icecast as already discussed is a no-brainer.

rbtcollins commented 9 years ago

Closing in favour of #10, #14, #15

GrahamDumpleton commented 9 years ago

In uWSGI, rather than:

uwsgi.websocket_send_binary(msg)
uwsgi.websocket_send_from_sharedarea(id, pos)
uwsgi.websocket_send_binary_from_sharedarea(id, pos)

you would have been better off with a single generic send() function:

uwsgi.websocket_send(msg)

If msg isn't an instance of bytes, then it would need to provide a __bytes__ or __str__ method to return the object as bytes for sending.

You could then allow different types such as a buffer to be passed in, or even implementation-specific types, similar to how wsgi.file_wrapper works. These would be supplied by the specific implementation, which would detect the different types and work out what to do when one is passed to send().

This is better than separate methods because you minimise the interface to one that can be common across implementations, with the implementation deciding whether to handle special types or fall back to the basic mechanism of sending as bytes.
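A minimal sketch of that single `send()` entry point. `SharedAreaRef` and `Connection` are hypothetical names invented here to play the role of an implementation-supplied type and a server connection; encoding `str` as UTF-8 is likewise an assumption. The toy records frames in a list instead of writing to a socket.

```python
class SharedAreaRef:
    """Hypothetical implementation-specific type, in the spirit of wsgi.file_wrapper."""

    def __init__(self, area_id, pos):
        self.area_id = area_id
        self.pos = pos


class Connection:
    def __init__(self, shared_areas):
        self.shared_areas = shared_areas  # maps area id to a bytes-like object
        self.frames = []

    def send(self, msg):
        if isinstance(msg, SharedAreaRef):
            # Special type the implementation recognises; a real server could
            # send straight from shared memory here.
            frame = memoryview(self.shared_areas[msg.area_id])[msg.pos:]
        elif isinstance(msg, (bytes, bytearray, memoryview)):
            frame = msg  # already bytes-like, pass through
        elif isinstance(msg, str):
            frame = msg.encode("utf-8")  # assumed text encoding
        elif hasattr(type(msg), "__bytes__"):
            frame = bytes(msg)  # generic fallback: ask the object for bytes
        else:
            raise TypeError("cannot send %s" % type(msg).__name__)
        # The toy copies into bytes only so the recorded frames are comparable.
        self.frames.append(bytes(frame))
```

An application written against `send()` alone keeps working on servers that know nothing about `SharedAreaRef`, as long as it only passes types that server accepts.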