memorysafety / river

This repository is the home of the River reverse proxy application, based on the pingora library from Cloudflare.
https://www.memorysafety.org/initiative/reverse-proxy/
Apache License 2.0

Design Notes: Multiplexing one Downstream with multiple Upstreams #52

Open jamesmunns opened 1 month ago

jamesmunns commented 1 month ago

A common use case for reverse proxies is to accept multiple "groups" of incoming connections.

This could be:

This arises as we may want to have something that feels like multiple services, but all served through a single set of :443 or :80 ports.

Neither the current BasicProxy nor the static file service is flexible enough to handle these cases: we have a single set of "downstreams" and a single set of "upstreams", with no way to multiplex between the two.

For example, NGINX allows multiple server blocks to listen on the same port. From this article:

server {
    listen 443 ssl;
    server_name wiki.example.com;
    ssl on;

    location / {
        proxy_pass http://server02.example.com:8090/;
    }
}

server {
    listen 443 ssl;
    server_name sickbeard.example.com;
    ssl on;

    location / {
        proxy_pass http://server01.example.com:8081/;
    }
}

Note that BOTH servers are listening on port 443, but each has its own upstream (server02 and server01, on different ports).
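
To make the NGINX behavior concrete, here is a minimal sketch of the same idea in Rust: one listener, with the upstream chosen by the request's Host header. `HostRouter`, `add_route`, `upstream_for`, and `example_router` are hypothetical names for illustration only; none of this is River or pingora API, and the port stripping is deliberately naive.

```rust
use std::collections::HashMap;

/// Hypothetical routing table: maps a server name (as sent in the
/// `Host` header) to the address of the upstream that should serve it.
pub struct HostRouter {
    routes: HashMap<String, String>,
}

impl HostRouter {
    pub fn new() -> Self {
        Self { routes: HashMap::new() }
    }

    /// Register an upstream for a server name. Names are stored
    /// lowercased so that lookups are case-insensitive.
    pub fn add_route(&mut self, server_name: &str, upstream: &str) {
        self.routes
            .insert(server_name.to_ascii_lowercase(), upstream.to_string());
    }

    /// Find the upstream for a raw `Host` header value.
    /// A trailing `:port` is stripped before the lookup.
    /// Assumption: IPv6 literals such as `[::1]:443` are not handled.
    pub fn upstream_for(&self, host_header: &str) -> Option<&str> {
        let name = host_header
            .rsplit_once(':')
            .map_or(host_header, |(name, _port)| name);
        self.routes
            .get(&name.to_ascii_lowercase())
            .map(String::as_str)
    }
}

/// Builds a table equivalent to the two NGINX `server` blocks above.
pub fn example_router() -> HostRouter {
    let mut router = HostRouter::new();
    router.add_route("wiki.example.com", "server02.example.com:8090");
    router.add_route("sickbeard.example.com", "server01.example.com:8081");
    router
}
```

Both names share the single listening port; only the lookup result differs per request, and an unknown host yields `None` so the caller can decide how to reject it.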

jamesmunns commented 1 month ago

Expanding on these thoughts and trying to clarify what we might want:

Right now, a service has these properties:

From a first glance, it seems like we would like to have a "multiservice" instance with the following qualities:

There are two main phases where we could make this decision:

I'm inclined to focus on the request_filter stage:

  1. This is already where the file-serving hooks operate: pandora-web-server only acts in this phase, and serves the entire connection
  2. As this is the first phase, we could make the "selection" logic here, figuring out which "set" of resources to use for the remainder of the lifecycle

Implementation-wise, I believe this would require defining a new kind of service that contains a unified set of rules (maybe just domain + URI?) to select which "subservice" matches, and then continues using that subservice.

This probably requires some kind of dyn trait that describes subservices, and delegates to that. This will likely then require storing the subservice handle in the CTX field, and using that to dispatch any later requests appropriately.
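
A rough sketch of what that could look like, using only the standard library. Everything here is an assumption for discussion: `SubService`, `HostPrefixService`, `MultiService`, `Ctx`, and `Request` are hypothetical names, `Request` is a stand-in for the real session type, and `select_subservice` only mimics the kind of decision that would run in the request_filter phase. It is not the pingora trait itself.

```rust
/// Hypothetical stand-in for the parts of a request we route on.
pub struct Request {
    pub host: String,
    pub path: String,
}

/// Hypothetical trait describing one subservice.
pub trait SubService {
    /// Does this subservice want to handle the request?
    fn matches(&self, req: &Request) -> bool;
    /// Address of the upstream this subservice proxies to.
    fn upstream(&self) -> &str;
}

/// One possible rule shape: match on domain + URI prefix.
pub struct HostPrefixService {
    pub host: String,
    pub path_prefix: String,
    pub upstream: String,
}

impl SubService for HostPrefixService {
    fn matches(&self, req: &Request) -> bool {
        req.host.eq_ignore_ascii_case(&self.host)
            && req.path.starts_with(self.path_prefix.as_str())
    }

    fn upstream(&self) -> &str {
        &self.upstream
    }
}

/// Per-request context: remembers which subservice was selected.
#[derive(Default)]
pub struct Ctx {
    selected: Option<usize>,
}

/// The "multiservice": a single listener delegating to subservices.
pub struct MultiService {
    subservices: Vec<Box<dyn SubService>>,
}

impl MultiService {
    pub fn new() -> Self {
        Self { subservices: Vec::new() }
    }

    pub fn add<S: SubService + 'static>(&mut self, service: S) {
        self.subservices.push(Box::new(service));
    }

    /// First phase: pick the first matching subservice and store its
    /// handle in the context. Returns false if nothing matched.
    pub fn select_subservice(&self, req: &Request, ctx: &mut Ctx) -> bool {
        ctx.selected = self.subservices.iter().position(|s| s.matches(req));
        ctx.selected.is_some()
    }

    /// Later phases: dispatch through the handle stored in the context.
    pub fn upstream_for(&self, ctx: &Ctx) -> Option<&str> {
        ctx.selected.map(|i| self.subservices[i].upstream())
    }
}
```

Storing an index rather than a reference keeps the context free of lifetimes; a real implementation might prefer an `Arc<dyn SubService>` handle if subservices can be reloaded while requests are in flight.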