Does this mean a single node exists in the dev and qa clusters at the same time? I am rather confused as to the use case.
+1. I posted yesterday to the consul mailing list about this.
The drawback of running multiple clusters of consul servers is that if an agent wants access to multiple clusters, you'd also have to run multiple agents. That could get messy. For example, Bamboo is our build server. It's "production". A QA host might need to access production Bamboo but QA everything else. There are other services that are shared among multiple environments, so a "hard scope" doesn't work for us.
This domain idea is solid. What I'd like to see is another layer between service name and datacenter name.
Instead of:
foobar-mysql.service.chicago.consul
I would love:
foobar-mysql.service.qa.chicago.consul
The environment could be optional, returning results from service providers in the same environment by default (just like datacenter works).
I don't want to use tags either because it breaks down when tags mean different things. For example, QA and Production might both have MySQL masters and slaves. I can't do:
master.qa.foobar-mysql.service.consul
Because multiple tags aren't supported in DNS queries (maybe that's one way to solve this problem, though). But I think we would like things a little more error-proof on the client side. Currently, if QA and Production both publish the service foobar-mysql and a client uses foobar-mysql.service.consul, they would get results across both environments. That's what we want to avoid.
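To see the collision directly, you can query the agent's DNS endpoint; a minimal check, assuming Consul's default DNS port of 8600:

$ dig @127.0.0.1 -p 8600 foobar-mysql.service.consul SRV +short
# returns SRV records for both the QA and Production instances,
# with nothing indicating which environment each record belongs to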
Just realized I should correct one thing above. We probably wouldn't have the "multiple agents need to run" problem for agents that need to access multiple environments. They would use their own datacenter+environment by default and could be explicit when they need something else.
So datacenter value could work if 1) people are willing to run multiple server clusters (not ideal) or 2) if Consul supported multiple datacenter values in a cluster. These options apply if the layer between service name and datacenter described above isn't desired.
It seems in general people want the namespacing that data centers provide, but without the need to run multiple clusters per environment (prod, qa, stage). I think domains are redundant to data centers to solve this, since they are both namespacing mechanisms.
Instead, I think multi-tenancy of environments is wanted. I'll think on this, but it is a challenging problem. For now, the simplest approach is to run multiple instances of Consul (1 per environment) on the same physical hardware.
the simplest approach is to run multiple instances of Consul (1 per environment) on the same physical hardware.
Agreed. We have 13 environments. We add and remove them fairly frequently, usually due to a team branching and needing their own dev or qa environment. So you can see that this may be simple, but doesn't scale all that well.
Does this mean a single node exists in the dev and qa clusters at the same time?
No. It means that there is a node in a dev cluster & another node in a qa cluster. We want the app code to just look for "foo.service.internal" and not have to worry about which cluster it is connected to.
It seems in general people want the namespacing that data centers provide, but without the need to run multiple clusters per environment
Yes
the simplest approach is to run multiple instances of Consul (1 per environment) on the same physical hardware.
This is actually quite difficult to do with Chef in a way that provides a good HA solution. Also, if an instance in AWS goes down we lose 1/3 of servers across all environments instead of just the one. :(
Got it. A better solution to multi-tenancy seems to be the consensus. I agree wholeheartedly. I'll start thinking about how to support this nicely.
Unfortunately, even with multi-tenancy, losing a server would affect all the environments. No way to solve that if the hardware is shared.
I'll start thinking about how to support this nicely.
You rock!
Unfortunately, even with multi-tenancy, losing a server would affect all the environments.
Yup. But at least it would only be for cross-dc queries to the dc that lost a node. :(
multi-tenancy would be great! I just planned to run multiple clusters, but if consul supports that out of the box, it would save a lot of headaches (for us :smile: ).
This would help us a lot as well. We also have a number of distinct environments / domains within the same datacenter that are firewalled off from each other; setting up many consul server clusters is our current approach but it's not ideal for many reasons.
I'm looking at the use case for larger enterprises, where multiple teams contribute to an ecosystem of services. Individual teams within the company are considered 'tenants' and should be able to contribute and manage the services they own without impacting others. One thought is to use hierarchies of Consul, where tenants use a private Consul cluster for project-internal orchestration. This approach would still require a common, multi-tenant top-level Consul cluster to discover 'published' services from other teams. This all starts with the question: is Consul a good fit for these types of use cases?
@tdeckers Typically, you want to treat Consul as a shared platform within a large organization. Teams individually manage and expose their services, but they do it on a shared cluster. Having tons of independent Consul clusters quickly becomes a burden to operators, as there are so many more Consul servers to reason about. The goal is to get the ACLs to the point where they are sufficiently advanced to enable even the most complex multi-tenant use cases.
+1
We have a use case here where we would really need some sort of enforceable namespaces for services. A mechanism to restrict a particular agent to managing services only within a defined namespace would be greatly appreciated.
+100 times, much appreciated!
Hi, is this still open? Or can this be done with today's ACL support in Consul? If so, I would like to see some example configuration.
Hi @nati this is still in work. ACLs are getting much richer for the upcoming 0.6 release. You can see the upcoming documentation here - https://github.com/hashicorp/consul/blob/master/website/source/docs/internals/acl.html.markdown.
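As a rough sketch of how the richer rules can express per-environment restrictions, assuming services carry an environment prefix like qa- in their names (a naming convention, not a built-in):

# Token policy for a QA agent: write access only to services whose
# names start with "qa-", read-only access to everything else
service "qa-" {
  policy = "write"
}
service "" {
  policy = "read"
}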
@slackpad that sounds great! I was going to try and hack it by creating a namespace/directory for each tenant and then restricting accordingly. But the benefits of preventing leaking via service registration and DNS have convinced me I should hold off.
If there's some way to help with the development let me know. I'd like to try and do more than add another +1 :smile:
The ACL updates are welcome, but we lose the possibility of, for example, discovering many environments on demand.
@scalp42 I'm not sure I follow. The ACLs should be rich enough for you to provide read-only access to services if so desired. Do you have a specific use case in mind that we wouldn't hit?
I'm not sure how ACLs will help with DNS-based discovery that doesn't use SRV records (because, say, service ports are known) but just vanilla name resolution. There is a distinction between ACL management of service registration for different tenants and lookup. What if I want some kind of domain name convention such that I can support search in /etc/resolv.conf? This is where multiple subdomains within the same Consul cluster come in handy. For example, I have this /etc/resolv.conf:
nameserver 127.0.0.1
search mine.services.consul shared.services.consul
If I'm looking to connect to service foo, which happens to not be running under my domain, it'd just fall back onto the shared domain. This lookup part has nothing to do with tenancy. Rather, it's about logical grouping. I don't really need ACLs on the lookup.
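To spell out the lookup behavior under that resolv.conf (assuming Consul answered queries for these subdomains, which is exactly what's being asked for here):

$ getent hosts foo
# the resolver tries foo.mine.services.consul first; if that
# returns no answer, it falls back to foo.shared.services.consul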
So considering that this case has been open for over a year, am I to assume that, from a service discovery perspective, native support for multiple environments will not be happening? I'm with @dvusboy in that I'm not clear how ACLs help with DNS-based discovery. I guess I could use tags, but I'd have to wrap my head around how we would deal with that given our current environment. Plus, using tags would likely be more prone to errors, as not all of our applications are completely 'environment' aware.
I will likely have to go the multiple consul clusters route to prevent that from happening, which is something I was hoping to avoid.
+1 for adding a segregation layer between the DC and the servicename, so we can run multiple environments on a single big consul cluster.
We are currently using a setup with a similar breakdown, and we were wondering how to address this with Consul.
We have high level logical groupings of servers we call domains (infrastructure, development, opsapp, testapp, sandboxapp) and we can create 'subdomains' when we stand up our own stack (developer initials, branch name via ci, etc)
We currently manage a hosts file with Puppet (a terrible solution), and we can access instances like 'servicename.daw.sandbox.xyz', where .xyz is added to our no-proxy rules.
We have the ability on an instance to access nexus.development.xyz to access the common development nexus server, but we can access nexus.daw.development.xyz if we are testing changes to our development nodes, or nfs.daw.infrastructure.xyz if we are testing changes to our nfs server, etc.
We may end up constructing our service names like NAS-daw-inf.service.consul if we can't find a reasonable alternative.
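If we go that route, the registration is just an ordinary service definition with the grouping baked into the name (a sketch; the service would then be queryable as NAS-daw-inf.service.consul):

{
  "service": {
    "name": "NAS-daw-inf"
  }
}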
TL;DR, +1 for this feature, or a way to use this feature without running multiple consul clusters
Linking https://groups.google.com/d/msgid/consul-tool/c7a5ff91-ff4c-4ea7-a4d1-af5d7c5e8a72%40googlegroups.com?utm_medium=email&utm_source=footer here which has some implementation ideas around tags.
What is the latest status of this feature? We have the same requirement, which comes down to using one central cluster and creating namespaces, allowing us to ensure that a service can only be changed within its own namespace while reading others is okay.
Hi @rkno82 this is still a ways out though it is on our roadmap. There are some architecture implications to think through, but we know that a lot of folks are interested in this capability.
Still on this topic... ACLs are now pretty elaborate. However, in a multi-tenant environment I want to make sure tenants' service names aren't overlapping. I might have two project teams (two tenants for my infra) that are creating a 'web' component as part of their service. Both might want to create web.service.consul. I'd be looking for a way to make them register web.tenant1.service.consul and web.tenant2.service.consul. EDIT: I'd put ACLs on *.tenantx.service.consul so that only the respective tenants can update.
On top of these private, intra-application (or intra-tenant) services, I'll have these tenants publish public services (within our enterprise): useful1.service.consul and useful2.service.consul. Anyone running into this situation? Any ideas how to solve this problem with current Consul? New features needed?
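The closest I can get today seems to be a naming convention plus prefix ACLs: register each tenant's component with a tenant prefix (a sketch, not a true subdomain):

{
  "service": {
    "name": "tenant1-web",
    "port": 8080
  }
}

That's queryable as tenant1-web.service.consul, paired with a prefix rule like service "tenant1-" { policy = "write" } so only tenant1's token can register under that prefix. But that convention is exactly what I'd like Consul to enforce for me.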
Is this still on the roadmap? If so, is multi-tenant support still a ways out or is this something coming soon at this point?
Just ran into this issue today. Any updates? Does anyone have a workaround? I don't want to have to provision a bunch of Consul servers to separate each environment.
Also ran into this recently. Any updates on this issue? It's been over 2 years since this issue was first opened.
Hi Consul experts! I am new to Consul and I have run into this very same issue! I don't want to use tags as they are implemented, because of the evil it could cause by accidentally mistaking one unimportant environment for another very important environment.
However, I was thinking this could be solved using tags if one could specify something on a service definition such as

discovery: "tags-only"

where, in this service-discovery mode, matching is restricted to available services having a matching tag; and that's it.
e.g. given a redis service ...
redis.service.consul -> would never provide a match
but
dev.redis.service.consul -> would provide a match
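To make the proposal concrete, a registration might look like this (the discovery field is purely hypothetical; it does not exist in Consul today):

{
  "service": {
    "name": "redis",
    "tags": ["dev"],
    "discovery": "tags-only"
  }
}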
It's quite possible I am trivializing it... but if this feature were there, it would work for me. I will offer to create a patch and contribute if the Consul folks think it's a doable thing...
TIA, George
It looks like, if we have three domains in a datacenter (dev, qa, prod), Consul servers can join across domains, which is not correct. Within a datacenter, a team may form multiple domains, and the domains should be restricted from one another.
IMO I don't know whether this should remain on the roadmap at all, because of the reservations already expressed. Isn't the complexity due to a mismatch between a hierarchical namespace (domains) and a flat one (single service)? Whatever solution tries to "address" the issue will just create more complexity and edge cases.
The current scheme - of dropping the datacenter part from a query - is already an optimisation around a hierarchical namespace being flattened. Any implementation that includes dev, qa, prod, etc. in the same vein will inevitably be more complex, as I understand it.
A workaround which I have found is to follow a convention of only using the fully qualified form when accessing shared services, rather than expecting the shared-service namespace to be flattened. So my dev, test, qa nodes and services follow this:
[dev] myapplication.node.mydomain
[qa] myapplication.node.mydomain
[prod] myapplication.node.mydomain
[dev, shared repository] repo.node.shared.mydomain
[qa, shared repository] repo.node.shared.mydomain
@damobrisbane I'm not opposed to closing this. I think things have progressed enough from 2014 to make this unnecessary.
@mtougeron are you able to expand a little or provide a link on how the product has evolved to meet the scenario that prompted this? Does it depend on the Enterprise version, specific design patterns, or anything else when supporting "multiple domains per datacenter"? Cheers
@mtougeron and @armon, I'm with @damobrisbane.
@mtougeron doesn't provide any additional information on how this problem can be solved with the application as it stands today.
In our use case, we maintain approximately 150 environments, each a near replica of the rest. Maintaining 150 Consul clusters is not feasible. Namespacing by environment provides the resource isolation we'd need.
Our resolv.conf on each node in an environment (e.g. environment = env100):
search env100.dc1.consul dc1.consul consul company.com
nameserver 192.168.0.1
This also greatly simplifies end-user access as there is consistency in accessing services and servers (all servers and services for an environment are easily determinable by a human and match from environment to environment).
I am happy to let you know this issue is solved, now that namespaces arrived in Consul 1.7.0 Enterprise: https://github.com/hashicorp/consul/blob/master/CHANGELOG.md#170-february-11-2020.
It would be extremely helpful for us if consul supported multiple domains per datacenter. This would help us be able to segment the clients connected to the cluster while still considering them part of a single datacenter.
For example, on the server(s) it would support something like
{ "datacenter": "us-west-1", "domain": ["dev.internal", "qa.internal"] }
If ClientA had
{ "datacenter": "us-west-1", "domain": "dev.internal", "service": { "name": "foo" } }
and ClientB had{ "datacenter": "us-west-1", "domain": "qa.internal", "service": { "name": "foo" } }
foo.service.us-west-1.dev.internal would resolve to ClientA foo.service.us-west-1.qa.internal would resolve to ClientB
Or perhaps support something similar via tags?
Basically, we want to avoid running multiple clusters of consul servers for each "environment" in a datacenter. We also want to avoid having to add the environment name to the service names. If we used dev.foo.service.us-west-1.internal (via a 'dev' tag), there's a higher chance of error where the app code sets the wrong environment.
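For reference, the tag-based variant we're trying to avoid would be registered roughly like this, using Consul's existing tag.name.service DNS form:

{
  "datacenter": "us-west-1",
  "service": {
    "name": "foo",
    "tags": ["dev"]
  }
}

which resolves via dev.foo.service.us-west-1.consul, but then every client has to get the tag right.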
p.s., this may be similar to https://github.com/hashicorp/consul/issues/208 but it seemed different enough that I opened another ticket.