hashicorp / nomad

Nomad is an easy-to-use, flexible, and performant workload orchestrator that can deploy a mix of microservice, batch, containerized, and non-containerized applications. Nomad is easy to operate and scale and has native Consul and Vault integrations.
https://www.nomadproject.io/

connect.sidecar_service.proxy.upstreams needs more filters – e.g. Tags #9914

Open hynek opened 3 years ago

hynek commented 3 years ago

(sorry for not using the template but it doesn't seem relevant to a feature request; I'm on Nomad 1.0.2)

In the past few days I've played around with Nomad's service mesh, and while I've run into a bunch of limitations (I guess that's fair given how new the feature is), I've run into one game-over limitation that I'd like to share.

Use-Case

I have a job that consists of a frontend and "many" (~80) backends. Each backend is a separate group and exposes a service that is currently differentiated by its name and tags.

Concretely they're backends that connect to domain registries. Each backend connects to a different one, and most of them exist twice: one connecting to the production environment of the registry and the other to the OT&E version (basically test env). (it's technically two jobs, but they use the same docker image)

In front of that all is a frontend service that takes a payload, does some things with it and then relays that payload to the correct backend. Basically it resolves reg-proxy-backend.registry-{registry-name}.env-{env}.service.consul and proxies to the result.
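
To make that concrete, one of those backend groups might currently register roughly like this (the group name, service name, tags, and port label are placeholders matching the description above, not the actual job):

group "backend-denic-prod" {
  service {
    name = "reg-proxy-backend"
    tags = ["registry-denic", "env-prod"]
    port = "http"
  }

  # tasks omitted
}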

The Problem

This looks impossible to achieve with Nomad's Mesh currently, because according to https://www.nomadproject.io/docs/job-specification/upstreams#upstreams-parameters, I can only "filter" by name and dc.

What I would love is to be able to stop doing this by hand and define something like this:

upstreams {
      destination_name = "registry-backend"
      tags = ["registry-denic", "env-prod"]
      local_bind_port = 2001
}

or at least

upstreams {
      destination_name = "registry-backend-denic"
      tags = ["env-prod"]
      local_bind_port = 2001
}

and so on. The current affordances would force me to have service names like registry-backend-denic-prod, which is not only gross, but also a huge PITA when setting intentions (basically I'd have to set 80*2 intentions).

Is there any chance a solution to this use-case could be added near-term, or should I look at other mesh solutions?

P.S. the services are all in Python if that matters for the discussion.

tgross commented 3 years ago

This seems pretty sensible to me. @shoenig before I mark this as accepted, do you know if this is a limitation on the Consul API side that we should push upstream? Or is this on Nomad's side of the problem?

apollo13 commented 3 years ago

The same would be nice for the ingress gateway & terminating gateway. But I think all of those are currently limited by Consul itself (would be great if I were wrong, though).

blake commented 3 years ago

@hynek You should be able to accomplish this using Consul's service-resolver config entries. For example:

# registry-backend-service-resolver.hcl
#
# Service resolver for registry-backend. Defines multiple subsets which filter instances based on tags.
Kind = "service-resolver"
Name = "registry-backend"
Subsets = {
  "registry-denic" = {
    Filter = "\"registry-denic\" in Service.Tags"
  }
  "registry-denic-prod" = {
    Filter = "\"registry-denic\" in Service.Tags and \"env-prod\" in Service.Tags"
  }
}
# denic-service-resolver.hcl
#
# Virtual service which redirects traffic to the registry-denic subset of the registry-backend service.
Kind = "service-resolver"
Name = "registry-backend-denic"
Redirect {
  Service = "registry-backend"
  ServiceSubset = "registry-denic"
}
# denic-prod-service-resolver.hcl
#
# Virtual service which redirects traffic to the registry-denic-prod subset of the registry-backend service.
Kind = "service-resolver"
Name = "registry-backend-denic-prod"
Redirect {
  Service = "registry-backend"
  ServiceSubset = "registry-denic-prod"
}

Write these configurations to Consul using consul config write. Afterward, you should be able to reference these virtual service upstreams in your job definition.

upstreams {
  destination_name = "registry-backend-denic"
  local_bind_port = 2001
}
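
In case it's useful, the consul config write step might look something like this, assuming the three config entries are saved under the file names shown in the comments above:

consul config write registry-backend-service-resolver.hcl
consul config write denic-service-resolver.hcl
consul config write denic-prod-service-resolver.hcl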

Note that I haven't tested this exact config, so there might be a few syntax issues. Overall, this config should work for your use case.

hynek commented 3 years ago

Thank you for taking the time, Blake! Can we agree, though, that that's rather cumbersome for an IMHO practical problem? 😇

It also doesn't seem to ameliorate the problem of defining many services and many intentions? It actually makes it worse, since I'd have to write an equal amount of Consul configs.

danlsgiga commented 3 years ago

Would love to see this too... it would allow us to simplify services in Consul by having a single service name with different tags for each instance, and being able to target upstreams based on tags in Nomad would be awesome!

Thanks for working on this issue!

ostkrok commented 2 years ago

We'd love to see this feature implemented, as it could help us solve some issues we're currently having. Do you have any plans on adding it?

apollo13 commented 2 years ago

I think this would at least be partially solved via #13143. Still not as flexible as requested in the initial post, but at least less work than the service-resolver configuration suggested by Blake?

thnee commented 11 months ago

I believe another use case for this feature would be the ability to address specific nodes in a database cluster. As Blake noted in this discuss post, service-resolver can be used for that, but this means that we now have the configuration for the service spread across multiple places. It would be a lot better if this could be done together with the Nomad job.

I'm not sure what people typically do to deploy Consul config entries? I haven't really found any story on that in the docs.

Currently we use Terraform for that, but it's not great, since it's separated from the jobs, and the provider doesn't understand all the config entry schemas, so it produces unwanted diffs.
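
For context, the Terraform approach is roughly the following, assuming the consul_config_entry resource from the Consul provider (the resolver content here is just an example):

resource "consul_config_entry" "registry_backend_resolver" {
  kind = "service-resolver"
  name = "registry-backend"

  config_json = jsonencode({
    Subsets = {
      "registry-denic-prod" = {
        Filter = "\"registry-denic\" in Service.Tags and \"env-prod\" in Service.Tags"
      }
    }
  })
}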

Nomad jobs can have lifecycle hooks, but I don't think they really meet the needs for this; if I understand correctly, they only run remotely and on every task start and stop. If lifecycle hooks could run locally, and only on job deployments, then it would be more suitable.
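
For reference, such a prestart hook inside a group would look roughly like this (the task name, driver, and written config entry are placeholders, and as said it runs on every allocation start rather than once per deployment):

task "write-config-entries" {
  lifecycle {
    hook    = "prestart"
    sidecar = false
  }

  driver = "exec"

  config {
    command = "consul"
    args    = ["config", "write", "local/service-resolver.hcl"]
  }

  template {
    destination = "local/service-resolver.hcl"
    data        = <<-EOF
      Kind = "service-resolver"
      Name = "registry-backend"
      # Subsets omitted
    EOF
  }
}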

I guess the only real solution now is to write a custom script around job deployments that also writes config entries to consul? Or am I missing something?

It would be really neat if there was first class support for config entries, since they can be used for a lot more stuff. For example, when deploying consul-api-gateway as a Nomad job, it would be good to be able to also deploy the related config entries together with it (kinds api-gateway and http-route). Perhaps this could be a candidate for a feature in Nomad or Nomad-pack?

seanamos commented 7 months ago

I guess the only real solution now is to write a custom script around job deployments that also writes config entries to consul? Or am I missing something?

@thnee You aren't missing something. That's basically what we are doing at the moment. We write config entries as part of the service deployment. It would be nice if these things could be done as part of the job.

kochen commented 5 months ago

@hynek You should be able to accomplish this using Consul's service-resolver config entries. [...] Overall this config should work for your use case.

This solution works well for "self" (or manually) registered services. A challenge with such an approach arises when the service is defined externally (e.g. by Traefik, or when the service registers itself in Consul), because writing the service configuration might (and most likely will) overwrite other configurations (e.g. routing) that were "preconfigured".

On the other hand, this is simple(r):

      connect {
        sidecar_service {
          proxy {
            upstreams {
              destination_name = "my-service"
              filters {
                tag = ["my-tag"]
              }
              local_bind_port  = 1234
            }
          }
        }
      }