swimos / swim

Full stack application platform for building stateful microservices, streaming APIs, and real-time UIs
https://www.swimos.org
Apache License 2.0

HTTP Downlinks #8

Closed c9r closed 1 year ago

c9r commented 5 years ago

HTTP downlinks simplify the common pattern of polling HTTP resources and treating the sequence of responses as a stream of events. When used to update lanes, HTTP downlinks have the effect of turning stateless REST APIs into multiplexed streaming WARP APIs.

Background

This conversion matters because the load induced by polling is proportional to the polling rate, whereas the load on a streaming API is proportional to how often the data actually changes. In the limit, this makes streaming APIs unboundedly more efficient than polled APIs for real-time data, because you would have to poll infinitely fast in order to detect changes with arbitrarily low latency. Note that message brokers don't solve the general case of the problem either, because they absorb backpressure signals, which inevitably leads to bufferbloat.

By transforming polled APIs into multiplexed streaming APIs, the cost of polling can be paid just once by the HTTP downlink. The data can then be cheaply and ubiquitously streamed using much more efficient WARP APIs. Of course, the latency of the WARP APIs will still be limited by the polling rate of the HTTP downlink. Nonetheless, HTTP downlinks provide a critical stepping stone for making more data available as multiplexed streaming APIs. The availability of these bridged APIs will hopefully help drive more data sources to expose native streaming APIs so that the cost and latency hit of polling can be entirely avoided.
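To make the load argument concrete, here is a minimal sketch of one way the downstream side could see traffic proportional to changes rather than to the polling rate. It assumes consecutive identical responses are suppressed before being streamed onward, which this issue does not specify. `ChangeFilter` is a hypothetical name, not part of any Swim API.

```java
import java.util.Objects;
import java.util.function.Consumer;

// Hypothetical helper: forwards a polled response downstream only when it
// differs from the previous one, so subscribers see one event per change
// instead of one event per poll.
final class ChangeFilter<T> implements Consumer<T> {
  private final Consumer<T> downstream;
  private T last;
  private boolean hasLast;

  ChangeFilter(Consumer<T> downstream) {
    this.downstream = downstream;
  }

  @Override
  public synchronized void accept(T response) {
    if (hasLast && Objects.equals(last, response)) {
      return; // unchanged since the last poll; nothing to stream
    }
    last = response;
    hasLast = true;
    downstream.accept(response);
  }
}
```

With this filter in place, polling ten times a second against a resource that changes once a minute would produce roughly one downstream event per minute, though the upstream polling cost is still paid in full.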

Design

As the name implies, HTTP downlinks are intended to behave like WARP downlinks. Just as you can open a WARP downlink to a WARP lane and start receiving events, so too should you be able to open an HTTP downlink to an HTTP resource and start receiving responses. Unlike a WARP downlink, an HTTP downlink will have to be configured with the rate at which to poll. HTTP downlinks should poll automatically, and their polling rate should be modifiable at any time to support adaptive polling.
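The polling behavior described above might look roughly like the following sketch, written in plain Java rather than against the eventual `swim.api.http.HttpDownlink` interface. `PollingDownlink`, `pollInterval`, and the `fetch` supplier (which stands in for a single HTTP request and response) are all hypothetical names chosen for illustration.

```java
import java.time.Duration;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical sketch of an automatically polling downlink whose rate can be
// changed while it is running. Not the actual Swim API.
final class PollingDownlink<T> implements AutoCloseable {
  private final Supplier<T> fetch;       // stands in for one HTTP request/response
  private final Consumer<T> onResponse;  // invoked with every response
  private final ScheduledExecutorService timer =
      Executors.newSingleThreadScheduledExecutor();
  private volatile long intervalMillis;
  private volatile boolean open;

  PollingDownlink(Supplier<T> fetch, Consumer<T> onResponse, Duration interval) {
    this.fetch = fetch;
    this.onResponse = onResponse;
    this.intervalMillis = interval.toMillis();
  }

  // Starts polling immediately; each poll schedules the next one.
  PollingDownlink<T> open() {
    open = true;
    timer.execute(this::poll);
    return this;
  }

  // Adaptive polling: the new interval applies when the next poll is scheduled.
  void pollInterval(Duration interval) {
    intervalMillis = interval.toMillis();
  }

  Duration pollInterval() {
    return Duration.ofMillis(intervalMillis);
  }

  private void poll() {
    if (!open) {
      return;
    }
    try {
      onResponse.accept(fetch.get());
    } catch (RuntimeException error) {
      // A fuller implementation would report this through a failure callback.
    }
    scheduleNext();
  }

  private void scheduleNext() {
    if (!open) {
      return;
    }
    try {
      timer.schedule(this::poll, intervalMillis, TimeUnit.MILLISECONDS);
    } catch (RejectedExecutionException closed) {
      // close() ran concurrently; stop polling.
    }
  }

  @Override
  public void close() {
    open = false;
    timer.shutdownNow();
  }
}
```

Rescheduling after every poll, instead of using a fixed-rate timer, is what lets a rate change take effect without cancelling and recreating the timer.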

HTTP downlinks will build on the swim.http/swim.io.http libraries, which fully support HTTP pipelining and chunking. Chunked HTTP responses can be incrementally consumed, if desired, by passing a custom swim.codec.Decoder to the HTTP downlink. We could add a decodeChunks flag to decode each response chunk as a distinct event, which would facilitate compatibility with existing ad hoc HTTP response streams, like Twitter's streaming APIs. But chunked responses are more accurately modeled with custom HTTP entity decoders, since they are, in fact, HTTP entities; and swim.http already models HTTP entities as backpressure-regulated streams.
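To illustrate why an entity decoder models chunked streams more accurately than a per-chunk event flag, here is a minimal sketch of an incremental decoder that frames events by newline, independent of where chunk boundaries fall. It is plain Java and does not implement `swim.codec.Decoder`; `LineFramedDecoder` and `feed` are hypothetical names, and newline-delimited framing is an assumption about the upstream format.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;

// Hypothetical incremental decoder: accepts response chunks as they arrive
// and emits one event per complete, non-empty line.
final class LineFramedDecoder {
  private final ByteArrayOutputStream partial = new ByteArrayOutputStream();
  private final Consumer<String> onEvent;

  LineFramedDecoder(Consumer<String> onEvent) {
    this.onEvent = onEvent;
  }

  // Feeds one chunk; a record split across chunks is buffered until its
  // terminating newline arrives.
  void feed(byte[] chunk) {
    for (byte b : chunk) {
      if (b == '\n') {
        if (partial.size() > 0) {
          // Decode only at record boundaries, so multi-byte UTF-8 sequences
          // split across chunks are still decoded correctly.
          onEvent.accept(new String(partial.toByteArray(), StandardCharsets.UTF_8));
          partial.reset();
        }
      } else if (b != '\r') {
        partial.write(b);
      }
    }
  }
}
```

A `decodeChunks`-style flag would instead emit the two halves of a split record as two separate events, which is only correct when the server happens to align chunks with records.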

The HttpDownlink API will live in the swim.api.http package, alongside the HttpLane interface. Both HttpDownlink and HttpLane will share many of the same callback function interfaces.

The HTTP downlink runtime will reuse the HttpBinding and HttpContext runtime interfaces used by HTTP lanes. HttpBinding extends LinkBinding, and HttpContext extends LinkContext, enabling us to route HTTP links through the Swim Kernel just like WARP links.

Implementation

The biggest holdup for completing HTTP downlinks has been answering the question: "who should be responsible for doing the actual polling?" HTTP downlinks could just open their own HTTP connections, but this would be inefficient since it wouldn't make use of any connection pooling. We could use per-agent connection pools, but HTTP downlinks should be usable outside of Web Agents.

We've solved this problem before for WARP links. We make a RemoteHost responsible for transporting WARP links over multiplexed network connections. Why not do the same for HTTP? It's a bit counter-intuitive at first to treat "remote" HTTP traffic specially, since unlike WARP, you'd expect HTTP traffic to always be remote. But doing so cleanly and flexibly solves a whole bunch of interrelated problems in one fell swoop.

Adding a RemoteHttpHost does a number of things. First, it gives us a distinct Swim Kernel execution cell to take responsibility for initiating HTTP requests. Second, it integrates HTTP connections into the Swim Kernel's host model, giving the Swim Kernel a more holistic picture of network connections. Third, it lets us generalize our host configuration and policy mechanisms (initially intended for WARP hosts) to HTTP hosts, and eventually other host types. And fourth, it incidentally solves the seemingly unrelated problem of natively routing HTTP requests through fabrics.

Although we're going to treat HTTP hosts like WARP hosts, the actual RemoteHttpHostClient implementation will be rather different from RemoteWarpHost and RemoteWarpHostClient. For one, a WARP host manages a single connection, and doesn't particularly care whether that connection was opened by a client or accepted by a server, whereas an HTTP client host needs to manage a connection pool to the remote endpoint. A hypothetical RemoteHttpHost, by contrast, would manage a single HTTP server connection. Combined with the fact that a RemoteHttpHostClient sends requests and receives responses, while a RemoteHttpHost would receive requests and send responses, there's unlikely to be much commonality between RemoteHttpHostClient and a potential future RemoteHttpHost. We may eventually decide to implement a RemoteHttpHost to uniformly model server-side HTTP connections opened by swim.service.web. But we leave that task for another day.

Where possible, we will want to multicast HTTP downlink responses so that if multiple agents downlink to the same REST API, it will only be polled once. This can only happen if multiple HTTP downlinks wish to make compatible requests. If multiple compatible HTTP downlinks specify different polling intervals, the fastest interval should be chosen. The HTTP downlink implementation should follow the same model-view pattern used by WARP downlinks.
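The multicast rule can be sketched as a small model-view arrangement: one shared model per distinct request, with each downlink attached as a view. This is a sketch under stated assumptions, not the WARP downlink implementation. `DownlinkRegistry`, `DownlinkModel`, and `DownlinkView` are hypothetical names, and "compatible" is assumed here to mean an identical request key such as method plus URI. The sketch is not thread-safe.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical view: one downlink's requested interval and response callback.
final class DownlinkView {
  final Duration interval;
  final Consumer<String> onResponse;

  DownlinkView(Duration interval, Consumer<String> onResponse) {
    this.interval = interval;
    this.onResponse = onResponse;
  }
}

// Hypothetical model: the single shared poll behind all compatible views.
final class DownlinkModel {
  private final List<DownlinkView> views = new ArrayList<>();

  void addView(DownlinkView view) {
    views.add(view);
  }

  // The fastest (shortest) interval requested by any view wins.
  Duration effectiveInterval() {
    return views.stream()
        .map(view -> view.interval)
        .min(Comparator.naturalOrder())
        .orElseThrow(() -> new IllegalStateException("no views"));
  }

  // Called once per poll; the response is multicast to every view.
  void didRespond(String response) {
    for (DownlinkView view : views) {
      view.onResponse.accept(response);
    }
  }
}

// Hypothetical registry: maps each request key to its shared model.
final class DownlinkRegistry {
  private final Map<String, DownlinkModel> models = new HashMap<>();

  DownlinkModel open(String requestKey, DownlinkView view) {
    DownlinkModel model = models.computeIfAbsent(requestKey, key -> new DownlinkModel());
    model.addView(view);
    return model;
  }
}
```

One consequence of choosing the fastest interval is that a view asking for slower updates receives responses more often than it requested; whether views should be throttled back to their own rate is left open here.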

Tasks

Interfaces

Remoting

Plumbing

ajay-gov commented 1 year ago

Closing for now, will be addressed in swim5