envoyproxy / envoy

Cloud-native high-performance edge/middle/service proxy
https://www.envoyproxy.io
Apache License 2.0

[UDP][Feature] UDP Content Token Routing #9514

Open markmandel opened 4 years ago

markmandel commented 4 years ago

(Please consider this document a sacrificial draft. Feedback, corrections, and comments are very much appreciated, and likely warranted, as I am only newly experienced with Envoy.)

Objective

Be able to preemptively route a UDP session to a specific upstream entry in the cluster, based on content available (i.e. a token) within the UDP packet.

This is specifically useful for stateful endpoints in a cluster, such as Dedicated Game Servers for multiplayer games (which is my primary expertise) or, I believe, VOIP/SIP backends.

For this reason, random or round-robin load balancing is not effective: we need to be able to send each session to one specific upstream endpoint in the cluster.

Background

Articles

Presentations

Requirements and scale

Requirements:

Use Cases

The specific use case that I want to cover is around Dedicated Game Servers for multiplayer games, but could potentially be applied to any sort of stateful system that uses a UDP stream as a communication protocol.

Concerns / Questions

Design ideas

This is a sacrificial draft for a potential configuration for the content token routing:

admin:
  access_log_path: /tmp/admin_access.log
  address:
    socket_address:
      protocol: TCP
      address: 127.0.0.1
      port_value: 9901
static_resources:
  listeners:
    - name: listener_0
      address:
        socket_address:
          protocol: UDP
          address: 127.0.0.1
          port_value: 7650
      listener_filters:
        # our new type of udp router
        - name: envoy.filters.udp_listener.udp_router
          typed_config:
            '@type': type.googleapis.com/envoy.config.filter.udp.udp_router.v2alpha.UdpRouterConfig
            stat_prefix: service
            cluster: gameservers_cluster
  clusters:
    - name: gameservers_cluster
      connect_timeout: 0.25s
      type: STATIC
      # since our listener filter provides the routing
      lb_policy: CLUSTER_PROVIDED
      load_assignment:
        cluster_name: gameservers_cluster
        endpoints:
        # three potential game servers to connect to on localhost
        # but different ports.
        - lb_endpoints:
            - endpoint:
                metadata:
                  # client tokens are stored in the metadata, as struct key values
                  # When `true`, the token has access, when false or non existent, access is denied.
                  "envoy.config.filter.udp.udp_router.v2alpha.UdpRouterConfig/tokens":
                    x7zs9: true
                    18z9y: true
                    j9zwk: true
                address:
                  socket_address:
                    address: 127.0.0.1
                    port_value: 26000
        - lb_endpoints:
            - endpoint:
                metadata:
                  "envoy.config.filter.udp.udp_router.v2alpha.UdpRouterConfig/tokens":
                    97zx9: true
                    18zyy: false # this client-token no longer has access
                address:
                  socket_address:
                    address: 127.0.0.1
                    port_value: 26001
        - lb_endpoints:
            - endpoint:
                metadata:
                  "envoy.config.filter.udp.udp_router.v2alpha.UdpRouterConfig/tokens":
                    97ix0: true
                    16zyy: true
                    p6z9y: true
                    f6z3y: true
                address:
                  socket_address:
                    address: 127.0.0.1
                    port_value: 26002
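
The lookup implied by the draft config above can be sketched roughly as follows. This is a minimal Python illustration, not Envoy code: `ENDPOINT_TOKENS`, `build_token_index`, and `route` are hypothetical names, and the sketch assumes the token has already been extracted from the datagram. Tokens set to `false`, or absent from every endpoint, are treated as denied, matching the comments in the draft.

```python
from typing import Dict, Optional, Tuple

Endpoint = Tuple[str, int]

# Hypothetical in-memory mirror of the per-endpoint token metadata in the draft config.
ENDPOINT_TOKENS: Dict[Endpoint, Dict[str, bool]] = {
    ("127.0.0.1", 26000): {"x7zs9": True, "18z9y": True, "j9zwk": True},
    ("127.0.0.1", 26001): {"97zx9": True, "18zyy": False},
    ("127.0.0.1", 26002): {"97ix0": True, "16zyy": True, "p6z9y": True, "f6z3y": True},
}


def build_token_index(endpoint_tokens: Dict[Endpoint, Dict[str, bool]]) -> Dict[str, Endpoint]:
    """Invert the metadata into token -> endpoint, keeping only tokens set to True."""
    index: Dict[str, Endpoint] = {}
    for endpoint, tokens in endpoint_tokens.items():
        for token, allowed in tokens.items():
            if allowed:
                index[token] = endpoint
    return index


def route(token: str, index: Dict[str, Endpoint]) -> Optional[Endpoint]:
    """Return the endpoint for a token, or None when access is denied."""
    return index.get(token)


TOKEN_INDEX = build_token_index(ENDPOINT_TOKENS)
```

Building the inverted index once per config update keeps the per-packet cost to a single dictionary lookup, which matters if the number of tokens per endpoint grows.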

Concerns / Questions

Alternatives considered

luna-duclos commented 4 years ago

One thing I'd like to see here is to be able to have a sort of fallback routing for when no specific token is configured.

luna-duclos commented 4 years ago

I'd also like to add the explicit consideration that tokens could be any length and envoy shouldn't be opinionated on that.

mattklein123 commented 4 years ago

Thanks for raising this @markmandel. This is actually a more general case of what needs to be done for https://github.com/envoyproxy/envoy/issues/1193 in which we need to route UDP packets based on the QUIC connection ID. I have some thoughts on how we can approach this and will reply back when I have some more time. cc @danzh2010

markmandel commented 4 years ago

@mattklein123 glad to hear it has a more general application than the use cases I am thinking of as well.

I didn't think to look at how QUIC implements this! :man_facepalming: There is so much good prior art there for a variety of use cases (sessions, crypto, etc).

https://quicwg.org/base-drafts/draft-ietf-quic-transport.html#name-connections (for those also subscribed who want to read up)

beriberikix commented 4 years ago

As another data point, IoT protocols also rely on tokens for routing - and other things, like request/response matching, caching and congestion control. CoAP (rfc7252) is based on UDP and one I'm particularly interested in seeing work with Envoy. The way CoAP uses tokens is slightly different than the way Mark/QUIC is describing (it's more a request ID) but hopefully helpful in thinking about a generalized solution.

chadr123 commented 3 years ago

I would like to add support for a hash policy in the UDP proxy.

The UDP proxy does not properly support hash-based LB algorithms, because it does not provide a LoadBalancerContext when choosing a host. As a result, the UDP proxy with a hash-based LB algorithm selects a host effectively at random.

I have investigated the TCP case and found that it has a hash policy option, so I think we can support it in the UDP case as well without much difficulty.

This does not depend on the incoming packet's content.

Here is my draft implementation: chadr123@d95c3f5

Please give your opinions for my idea. Thanks!!
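
To illustrate the idea of a content-independent hash policy, here is a minimal Python sketch. It assumes the hash key is the client's source IP, and it uses plain modulo selection rather than Envoy's actual ring hash or Maglev balancers; `pick_host_by_source_ip` is a hypothetical name used only for this example.

```python
import hashlib
from typing import List, Tuple

Host = Tuple[str, int]


def pick_host_by_source_ip(source_ip: str, hosts: List[Host]) -> Host:
    """Pick a host deterministically from the source IP (assumed hash key)."""
    digest = hashlib.sha256(source_ip.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer hash key.
    key = int.from_bytes(digest[:8], "big")
    # Simple modulo selection; a real consistent-hash balancer would
    # remap fewer sessions when the host list changes.
    return hosts[key % len(hosts)]
```

With a key supplied, every datagram from the same source lands on the same host; without one, a hash-based balancer has nothing to hash and the choice is effectively random.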

mattklein123 commented 3 years ago

@chadr123 can you open a PR where we can discuss? I want to make sure we build the API in a way that will allow for byte range hashing. I think this can just be a wrapper message with a oneof inside of it that initially just has the general hash policy, and then later we can add byte range hashing on the datagram. Thank you!
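
As a rough illustration of what byte range hashing on the datagram might mean, the Python sketch below hashes a fixed slice of the payload and uses it to choose a host. The function names, the `offset`/`length` parameters, and the use of CRC32 with modulo selection are all assumptions made for this example, not a proposed Envoy API.

```python
import zlib
from typing import List, Optional, Tuple

Host = Tuple[str, int]


def hash_byte_range(datagram: bytes, offset: int, length: int) -> Optional[int]:
    """Hash datagram[offset:offset+length]; None if the datagram is too short."""
    if offset < 0 or length <= 0 or offset + length > len(datagram):
        return None
    return zlib.crc32(datagram[offset:offset + length])


def pick_host_by_byte_range(
    datagram: bytes, hosts: List[Host], offset: int, length: int
) -> Optional[Host]:
    """Choose a host from the hashed byte range, or None if no key is available."""
    key = hash_byte_range(datagram, offset, length)
    if key is None:
        return None
    return hosts[key % len(hosts)]
```

Two datagrams that carry the same bytes in the configured range map to the same host regardless of the rest of the payload, which is the property a QUIC connection ID or a session token needs.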

chadr123 commented 3 years ago

Ok. I will open a PR soon. :)

ggreenway commented 3 years ago

In addition to the work that @chadr123 is planning to do, if that is combined with a filter similar to HTTP's header-to-metadata and with #12594, token-based routing will work.

ronaldfenner commented 2 years ago

I'd like to suggest an additional way for at least the games use case.

Where @markmandel has the client token as part of the UDP packet, one could include a server token as well. I know he said in his videos covering this that he didn't like the idea, but I think that applied only to using just a server token.

I would suggest the client auth token and the server token. The server token is what's used to route to the upstream server while the client auth token is used to allow or drop the packet.

The client token could be manually added to the config, or another way would be to offload the authorization to an external auth server. If the external service approves the client token, it's mapped to its network tuple so that further packets with that same tuple and client token are passed through without making the auth call. There could be a TTL on the token so that after x seconds the auth service is called again to see if it's still valid. Also, if the network tuple doesn't match the stored tuple for the client token, then the client token would be reauthorized.
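
The caching behaviour described above could look roughly like this in Python. It is a sketch under stated assumptions: `AuthCache`, the `authorize` callback standing in for the external auth service, and the injectable `clock` are all hypothetical, and on a failed authorization this version simply drops the cached mapping, which is only one of several possible policies.

```python
import time
from typing import Callable, Dict, Tuple

NetTuple = Tuple[str, int]  # (source address, source port)


class AuthCache:
    """Cache approved client tokens per network tuple, with a TTL."""

    def __init__(
        self,
        authorize: Callable[[str, NetTuple], bool],  # stand-in for the external auth call
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._authorize = authorize
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[NetTuple, float]] = {}

    def allow(self, client_token: str, source: NetTuple) -> bool:
        now = self._clock()
        entry = self._entries.get(client_token)
        if entry is not None:
            cached_source, expires_at = entry
            # Fast path: same tuple and the TTL has not expired.
            if cached_source == source and now < expires_at:
                return True
        # Missing, expired, or tuple changed: ask the auth service again.
        if self._authorize(client_token, source):
            self._entries[client_token] = (source, now + self._ttl)
            return True
        # Assumed policy: drop the mapping when authorization fails.
        self._entries.pop(client_token, None)
        return False
```

Only the first packet of a session, and the first packet after each TTL expiry or tuple change, pays for the external call; everything else is a dictionary lookup.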

In the event that the reauthorized client token fails on the network, allow an option to keep the current mapped token, drop the mapped token, or reauthorize the mapped token with the attached network tuple.

This implementation alleviates some of his concerns about the number of tokens/sessions per endpoint, in that the server token wouldn't change that often and there would be fewer of them than one per client session. With sessions having a TTL for the token, it would cut down on the rate of tokens being added and removed, as a typical game session usually lasts for a while, and not having to authorize every packet would cut down on the lag introduced by calling the auth service.

Also, it would be nice if there was a way to call an API on the listener to remove or update the token info. For example, when a user logs out of their game, the service handling the logout could call the listener to remove the client token.

I also don't see in his proposed config how to specify where the token would be found in the UDP packet. It should probably be part of the typed config. With my proposed addition you could have client_token: 0:3 and server_token: 4:6. Given a byte range for where to extract the data, the filter can just use those values to extract the tokens. It could even support negative ranges that start from the end of the packet instead.
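
A minimal Python sketch of that extraction follows. It assumes the `start:end` range is inclusive at both ends (so `0:3` is four bytes and `4:6` is the next three) and that negative indices count back from the end of the packet; `parse_range` and `extract` are hypothetical names, and the comment above does not fix these semantics.

```python
from typing import Optional, Tuple


def parse_range(spec: str) -> Tuple[int, int]:
    """Parse a 'start:end' spec such as '0:3' or '-4:-1' into two integers."""
    start_text, end_text = spec.split(":")
    return int(start_text), int(end_text)


def extract(datagram: bytes, spec: str) -> Optional[bytes]:
    """Return the bytes in the inclusive range, or None if it does not fit."""
    start, end = parse_range(spec)
    size = len(datagram)
    # Assumed semantics: negative indices count from the end of the packet.
    if start < 0:
        start += size
    if end < 0:
        end += size
    if not (0 <= start <= end < size):
        return None
    return datagram[start:end + 1]


packet = b"abcdXYZpayload-tail"
client_token = extract(packet, "0:3")    # b"abcd"
server_token = extract(packet, "4:6")    # b"XYZ"
trailer_token = extract(packet, "-4:-1")  # b"tail"
```

Returning `None` for a packet that is too short gives the filter a clear signal to drop the datagram or fall back to a default route rather than route on a partial token.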