hashicorp / consul

Consul is a distributed, highly available, and data center aware solution to connect and configure applications across dynamic, distributed infrastructure.
https://www.consul.io

Service router's "consistent protocols" requirement #11009

Open whiskeysierra opened 3 years ago

whiskeysierra commented 3 years ago

Overview of the Issue

Our goal is to set up a service router that is not backed by a real service (i.e. the default catch-all is not used). The UI calls these constructs "Routing Configuration", with a tooltip that describes our intent exactly:

This is not a registered Consul service. It’s a routing configuration that routes traffic to real services in Consul.

This works great if all route destinations (services) have the same protocol (set via their respective service defaults). Unfortunately, it's not possible to mix protocols:

Error writing config entry service-router/x: Unexpected response code: 500 (rpc error making call: discovery chain "x" uses inconsistent protocols; service "default/b" has "grpc" which is not "http")

The documentation mentions some requirements, among them:

Service router config entries are restricted to only services that define their protocol as HTTP-based

https://www.consul.io/docs/connect/config-entries/service-router#interaction-with-other-config-entries

The way we understood this part is that protocols like tcp are not supported. That leaves http, http2 and grpc. I'd consider them all to be HTTP-based, and they all seem to be supported by service routers, as long as all route destinations have the very same protocol.

Either this restriction should be mentioned in the docs, because it's an important one, or (obviously our favorite) the implementation could be relaxed to allow mixed HTTP-based protocols, assuming there is no technical limitation preventing it. At least grpc and http2 should work together, shouldn't they?

https://github.com/hashicorp/consul/blob/bc0e4f2f46df874400fb49ea61776ea1ed254b8d/agent/consul/discoverychain/compile.go#L248-L255
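The rule behind the error can be illustrated with a minimal Python sketch. It is a simplified model of the check, not the code in compile.go: the function name `check_consistent_protocols` and the dict-based input are assumptions for illustration, and the message format is copied from the error quoted above.

```python
from typing import Dict, Optional


def check_consistent_protocols(chain: str, router_protocol: str,
                               destinations: Dict[str, str]) -> Optional[str]:
    """Return an error string for the first destination whose protocol
    differs from the router's protocol, or None if all of them match.

    Hypothetical helper: it models the rule described in this issue and is
    not Consul's implementation.
    """
    for service, protocol in destinations.items():
        if protocol != router_protocol:
            return (f'discovery chain "{chain}" uses inconsistent protocols; '
                    f'service "{service}" has "{protocol}" '
                    f'which is not "{router_protocol}"')
    return None


# Router "x" is http; destination "b" is grpc, so the check fails on "b".
error = check_consistent_protocols(
    "x", "http", {"default/a": "http", "default/b": "grpc"})
print(error)
```

Under this model, a chain whose destinations are all grpc (with a grpc router) passes, which matches the behavior described above: any single HTTP-based protocol works, but a mix does not.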

Reproduction Steps

Write the following config entries:

{
  "Kind": "service-defaults",
  "Name": "x",
  "Protocol": "http"
}
{
  "Kind": "service-defaults",
  "Name": "a",
  "Protocol": "http"
}
{
  "Kind": "service-defaults",
  "Name": "b",
  "Protocol": "grpc"
}
{
  "Kind": "service-router",
  "Name": "x",
  "Routes": [
    {
      "Match": {
        "HTTP": {
          "PathPrefix": "/a"
        }
      },
      "Destination": {
        "Service": "a"
      }
    },
    {
      "Match": {
        "HTTP": {
          "PathPrefix": "/b"
        }
      },
      "Destination": {
        "Service": "b"
      }
    }
  ]
}
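`consul config write` takes one config entry per file, so the objects above have to be written separately. As a convenience, the following Python sketch splits a stream of back-to-back JSON objects into individual entries. The helper name `split_config_entries` and the file naming scheme are assumptions for illustration, not part of Consul.

```python
import json
from typing import Any, Dict, List


def split_config_entries(text: str) -> List[Dict[str, Any]]:
    """Split back-to-back JSON objects into a list of parsed entries."""
    decoder = json.JSONDecoder()
    entries = []
    pos = 0
    while pos < len(text):
        # Skip whitespace between objects; raw_decode does not accept it.
        if text[pos].isspace():
            pos += 1
            continue
        entry, pos = decoder.raw_decode(text, pos)
        entries.append(entry)
    return entries


concatenated = """
{"Kind": "service-defaults", "Name": "a", "Protocol": "http"}
{"Kind": "service-defaults", "Name": "b", "Protocol": "grpc"}
"""

entries = split_config_entries(concatenated)
for entry in entries:
    # Hypothetical naming scheme: one file per entry.
    filename = f'{entry["Kind"]}-{entry["Name"]}.json'
    print(filename, json.dumps(entry))
```

Each resulting file can then be passed to `consul config write`. In this reproduction, the error is returned when the service-router entry is written.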

Consul info for both Client and Server

Client info

```
consul info
agent:
	check_monitors = 0
	check_ttls = 0
	checks = 0
	services = 0
build:
	prerelease =
	revision = e68f6c98
	version = 1.10.2+ent
consul:
	acl = enabled
	bootstrap = false
	known_datacenters = 1
	leader = true
	leader_addr = 10.64.128.9:8300
	server = true
license:
	customer = 430bf013-9905-5574-9b5c-8432cd5f9498/11eb5fbb-964d-78f7-aa4c-0242ac11000e
	expiration_time = 2021-10-10 09:12:04.018628804 +0000 UTC
	features = Automated Backups, Automated Upgrades, Enhanced Read Scalability, Network Segments, Redundancy Zone, Advanced Network Federation, Namespaces, SSO, Audit Logging, Admin Partitions
	id = 5f138510-4bf7-4155-a3ae-c757f5600a8b
	install_id = *
	issue_time = 2021-09-02 09:12:04.018628804 +0000 UTC
	modules = Global Visibility, Routing and Scale, Governance and Policy
	product = consul
	start_time = 2021-09-02 09:12:04.018628804 +0000 UTC
raft:
	applied_index = 4304577
	commit_index = 4304577
	fsm_pending = 0
	last_contact = 0
	last_log_index = 4304577
	last_log_term = 44
	last_snapshot_index = 4303218
	last_snapshot_term = 44
	latest_configuration = [{Suffrage:Voter ID:6a520230-7ffa-cdf8-9d17-2cead75a7144 Address:10.64.128.9:8300} {Suffrage:Voter ID:c21e12a4-7b6a-2f2c-0167-ed2a9299f72d Address:10.64.128.7:8300} {Suffrage:Voter ID:143a9e52-3bb5-cc60-06e4-925afbf923b1 Address:10.64.128.6:8300}]
	latest_configuration_index = 0
	num_peers = 2
	protocol_version = 3
	protocol_version_max = 3
	protocol_version_min = 0
	snapshot_version_max = 1
	snapshot_version_min = 0
	state = Leader
	term = 44
runtime:
	arch = amd64
	cpu_count = 2
	goroutines = 2507
	max_procs = 2
	os = linux
	version = go1.16.7
serf_lan:
	coordinate_resets = 0
	encrypted = true
	event_queue = 0
	event_time = 85
	failed = 0
	health_score = 0
	intent_queue = 0
	left = 0
	member_time = 194940
	members = 41
	query_queue = 0
	query_time = 1
serf_wan:
	coordinate_resets = 0
	encrypted = true
	event_queue = 0
	event_time = 1
	failed = 0
	health_score = 0
	intent_queue = 0
	left = 0
	member_time = 113487
	members = 3
	query_queue = 0
	query_time = 1
```
Server info

We're running HCS, so I can't execute anything on the server. Currently on Consul version 1.10.2.

jkirschner-hashicorp commented 3 years ago

Hi @whiskeysierra,

Thanks for the detailed and clear description! I agree it makes sense to document the current restrictions on the docs page you linked.

I have a few follow-up questions to make sure we understand the full context. (@harti2006 and @samowski: I welcome your thoughts on these questions as well!)

Q1: What was your understanding of the problem from the error message?

rpc error making call: discovery chain "x" uses inconsistent protocols; service "default/b" has "grpc" which is not "http"

Perhaps there's an opportunity to make this error message clearer.

Q2: Can you help me better understand your use case for using one service-router that maps to destination services with different HTTP-based protocols? Since that isn't possible right now, what are you doing instead?

whiskeysierra commented 3 years ago

What was your understanding of the problem from the error message?

The error message was fine; it uses clear wording like "inconsistent protocols" and names both the offending service and the mismatching protocols. What I'm missing is the reason why inconsistent protocols are not supported. I wouldn't expect the error message to describe this in detail, but a short link to the docs (pointing to a section about limitations, maybe) would be nice.

Can you help me better understand your use case for using one service-router that maps to destination services with different HTTP-based protocols?

We want to treat services as an implementation detail. Services in our setup are logically grouped into applications, and those applications should be first-class citizens in our service mesh. Downstream services should use applications as upstream dependencies. Service routers would allow us to have those synthetic services, and they would give us the power to quickly change routing from one service to another without downstream services noticing that they are talking to a different one.
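To make the idea concrete, here is a minimal Python sketch that builds a service-router config entry for such an application from a mapping of path prefixes to services. The function `build_router` and the service name `b-v2` are hypothetical; the field names follow the reproduction config in the original report.

```python
import json
from typing import Any, Dict


def build_router(application: str, routes: Dict[str, str]) -> Dict[str, Any]:
    """Build a service-router entry that maps path prefixes to services."""
    return {
        "Kind": "service-router",
        "Name": application,
        "Routes": [
            {
                "Match": {"HTTP": {"PathPrefix": prefix}},
                "Destination": {"Service": service},
            }
            for prefix, service in routes.items()
        ],
    }


# Downstreams depend on "x"; the services behind it can change freely.
before = build_router("x", {"/a": "a", "/b": "b"})
after = build_router("x", {"/a": "a", "/b": "b-v2"})  # "b-v2" is a made-up name
print(json.dumps(after, indent=2))
```

The router name stays the same in both entries, so downstream services that list `x` as an upstream would not need to change when a destination is swapped.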

Unfortunately, we don't have homogeneous protocols across services of a single application. Most of them use grpc, but not exclusively.

What's the technical limitation for this? We're using ingress-nginx as our ingress controller at the edge with a very similar setup (backed by Ingress resources instead of ServiceRouter, but similar nonetheless).