hashicorp / consul

Consul is a distributed, highly available, and data center aware solution to connect and configure applications across dynamic, distributed infrastructure.
https://www.consul.io

Consul ACLs best practices with Vault Consul Secret engine #5535

Open danlsgiga opened 5 years ago

danlsgiga commented 5 years ago

Related to https://github.com/hashicorp/consul/issues/3957

Feature Description

The way Consul ACLs are currently structured makes it hard to follow the production ACL best practices recommended in the HashiCorp docs.

Use Case(s)

What we are trying to achieve is to automate Consul ACL token generation and lifecycle using the Vault Consul secrets engine along with Vault Agent caching. HashiCorp recommends granting write on node and agent rules that match the exact hostname, so the agent can perform catalog and internal operations. session rules would benefit from the same treatment.

node "exact_match_hostname" { policy = "write" }
agent "exact_match_hostname" { policy = "write" }
session "exact_match_hostname" { policy = "write" }

Creating one policy per hostname (manually or via other automation tools) goes against the idea of policies, which are meant to be shared among tokens. It also adds clutter and load: thousands of policies, each with a 1:1 relationship to a token.
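To make the overhead concrete, here is a minimal sketch of what per-hostname automation ends up doing. It only builds the request bodies for Consul's `PUT /v1/acl/policy` endpoint and sends nothing. The host names and the `agent-` policy name prefix are assumptions made for illustration.

```python
import json


def node_rules(hostname: str) -> str:
    """Return the exact-match rules for one agent, as in the snippet above."""
    return "\n".join(
        f'{resource} "{hostname}" {{ policy = "write" }}'
        for resource in ("node", "agent", "session")
    )


def policy_payload(hostname: str) -> dict:
    """Build a body for PUT /v1/acl/policy (the 'agent-' prefix is an assumption)."""
    return {"Name": f"agent-{hostname}", "Rules": node_rules(hostname)}


# Hypothetical fleet: every host needs its own policy, used by exactly one token.
hostnames = ["web-01", "web-02", "db-01"]
payloads = [policy_payload(h) for h in hostnames]

print(json.dumps(payloads[0], indent=2))
```

The number of policies grows linearly with the number of agents, which is the 1:1 relationship described above.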

Suggestion

  1. This could be enabled by flags in the config where the agent would allow agent and node writes to its own hostname without requiring explicit policies to do so. Ex:
    {
      "acl": {
        "default_internal_rule": "write"
      }
    }

or

  2. This could be done via templating variables or placeholders in the policy. Ex:
    node "{{ self }}" { policy = "write" }
    agent "{{ self }}" { policy = "write" }
    session "{{ self }}" { policy = "write" }
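As a rough illustration of the placeholder idea, the sketch below shows how a single shared policy could be resolved per agent. This is not an existing Consul feature: the `{{ self }}` syntax comes from the suggestion above, and `resolve_self` and the node name are hypothetical.

```python
import re

# One shared policy template, as proposed above.
TEMPLATE = '''node "{{ self }}" { policy = "write" }
agent "{{ self }}" { policy = "write" }
session "{{ self }}" { policy = "write" }'''

# Matches "{{ self }}" with optional whitespace inside the braces.
_SELF = re.compile(r"\{\{\s*self\s*\}\}")


def resolve_self(template: str, node_name: str) -> str:
    """Replace every {{ self }} placeholder with the agent's own node name."""
    # A function replacement avoids treating backslashes in node_name specially.
    return _SELF.sub(lambda _match: node_name, template)


# Hypothetical node name, e.g. an EC2 instance ID only known at boot time.
print(resolve_self(TEMPLATE, "i-0abc123"))
```

With this approach, one policy could be attached to many tokens, and each agent would only ever gain write access to its own name.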

Either approach would make it much easier to follow the production ACL hardening guidance and would let the Vault Consul secrets engine follow Consul's ACL best practices.

pearkes commented 5 years ago

It is worth looking through the series of PRs referenced here: https://github.com/hashicorp/consul/pull/5514. There is some effort going into deriving tokens for workloads that need Consul tokens in a way that reduces the overhead you mentioned.

I actually think it is worth revisiting this issue following our upcoming releases that will include those features. It may provide some answers to the problems you're having (but could also result in a few more feature requests, which is fine).

danlsgiga commented 5 years ago

Cool, I’ll keep an eye on the mentioned PRs and see if those solve my challenges in the next releases. Thanks!

pbusko commented 5 years ago

@pearkes any updates regarding this feature? In my use case the agent's hostname can't be determined in advance (it is the EC2 instance ID), so the ability to use self would be really helpful.

blockmar commented 5 years ago

I second the suggestion of allowing configuration of internal rules using a config file.

{
  "acl": {
    "default_internal_rule": "write"
  }
}

And also the ability to allow "local/internal" requests without tokens while requiring a token for outside clients (to remove the need for a per-server policy and a unique token).

We (like pbusko above) also run on EC2 instances, and I do not like the idea of having to create tokens for the local server in some script every time we spin up a new Consul server instance.

To allow this we need to provide the script with an already known token... a bit of a catch-22.

Starting a new server instance should not require any human interaction. And having to attach a script to an autoscaling group that fetches a pre-determined token from AWS Secrets Manager and then uses that to create a server token... becomes very complicated very fast.

blockmar commented 5 years ago

We also have the same token issue when spinning up new Vault instances since the Vault instance needs a client token to access the Consul storage. Sadly the easiest fix for that is to use another storage engine for Vault than Consul.

issacg commented 4 years ago

I have another issue that falls under the same umbrella of best practices for the Vault Consul secrets engine: token TTLs with regard to services and checks.

I have a service which gets registered and re-registered by a third party. This third-party application is authorized by Vault to access the needed policies in Consul. Vault's dynamic secret management works like a charm at first glance.

But according to the Consul docs, both checks and even normal anti-entropy require that I either supply a long-lived token or set a default agent token in order to maintain the service entries in the catalog. The problem is that with Vault I can never guarantee a dynamically issued Consul token's lifespan, as it's tied to the process (or vault-agent proxy) that issued it. And anyway, I don't want a long-lived token or an agent default token: that's the whole point of having Vault issue them dynamically instead of hardcoding a token in some external secret storage.

danlsgiga commented 4 years ago

Folks... any traction on this?

blake commented 4 years ago

@danlsgiga Consul 1.8.1 introduced Node Identities, which help address the problem you outlined in this issue.

You can create an agent token with consul acl token create -node-identity=<nodename>:<DC>. The pre-configured ACL policy template listed in the docs will be assigned to the token.
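For automation that talks to the HTTP API instead of the CLI, the same request can be sketched as a body for `PUT /v1/acl/token` with a `NodeIdentities` entry. The sketch below only builds the payload and the equivalent CLI string. The node name, datacenter, and description text are assumptions.

```python
import json


def node_identity_token(node_name: str, datacenter: str) -> dict:
    """Build a body for PUT /v1/acl/token that carries one node identity."""
    return {
        "Description": f"agent token for {node_name}",  # free-form, assumed wording
        "NodeIdentities": [
            {"NodeName": node_name, "Datacenter": datacenter},
        ],
    }


def cli_equivalent(node_name: str, datacenter: str) -> str:
    """Return the CLI form mentioned above, for comparison."""
    return f"consul acl token create -node-identity={node_name}:{datacenter}"


# Hypothetical node and datacenter names.
print(json.dumps(node_identity_token("web-01", "dc1"), indent=2))
print(cli_equivalent("web-01", "dc1"))
```

No per-host policy object needs to be created here. The node name travels with the token and Consul applies its built-in template for that name.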

danlsgiga commented 4 years ago

Thanks @blake, I didn't know about this feature... so, basically, at the token creation you specify the -node-identity to the hostname of the node and that default "templated?" policy is applied with the exact nodename specified via -node-identity?

blake commented 4 years ago

@danlsgiga Yes, that is correct.

ghost commented 3 years ago

Hello @blake, we use the Consul secrets engine in Vault to generate API tokens. How can we create a Consul API token in Vault that uses the ACL policy template? Thanks.