Open fitbeard opened 1 year ago
Another use case for Vault related to the secrets would be as a Barbican backend: https://docs.openstack.org/barbican/latest/install/barbican-backend.html#vault-plugin
So right now what makes Vault a weird case is that it's under the Business Source License 1.1 (https://github.com/hashicorp/vault/blob/main/LICENSE#L20), so it might not suit production use and is a bit strange to add to Atmosphere.
I spoke with Hashicorp and they are OK with us using Terraform in Atmosphere, but I will need to bring up Vault.
Hi. I made huge progress on this issue: https://github.com/fitbeard/atmosphere/compare/vault_poc_pre_timestamp...vault_poc. Before pushing I need to figure out how to make changes in helm-toolkit and OS charts to add custom annotations to secrets and how to skip values like region, username, password for service and MQ/DB connection strings.
Here is an example that can be used for init/db-sync-like containers:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tadas-test
  namespace: openstack
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: vault
  template:
    metadata:
      labels:
        app.kubernetes.io/name: vault
      annotations:
        vault.security.banzaicloud.io/vault-addr: "https://vault.vault:8200"
        vault.security.banzaicloud.io/vault-role: "vault"
        vault.security.banzaicloud.io/vault-tls-secret: vault-tls
        vault.security.banzaicloud.io/vault-path: "kubernetes"
    spec:
      serviceAccountName: default
      containers:
        - name: alpine
          image: alpine
          command: ["sh", "-c", "echo $AWS_SECRET_ACCESS_KEY - $CRED_SECRET && echo going to sleep... && sleep 180"]
          env:
            - name: AWS_SECRET_ACCESS_KEY
              value: vault:secret/data/demosecret/aws#AWS_SECRET_ACCESS_KEY
            - name: CRED_SECRET
              value: vault:openstack/creds/vault#application_credential_secret
For this, custom annotations are needed.
To utilize the oslo.config environment driver, we need something like https://bank-vaults.dev/docs/mutating-webhook/vault-agent-templating/ or https://bank-vaults.dev/docs/mutating-webhook/consul-template/, i.e. a mechanism that can HUP the process and re-read db-creds/os-app-creds from the environment, in order to use this: https://github.com/fitbeard/atmosphere/compare/vault_poc_pre_timestamp...vault_poc#diff-4c7dc20d92f312a2206ef00799d4cbbfa672faa124af97d882a80501b6f312f9R113
I am asking you to help me or at least advise me :)
Now I'm waiting for this change https://review.opendev.org/c/openstack/openstack-helm/+/916641 to be merged. We already have contributions from other contributors related to annotations for pods and jobs. This will unblock Vault integration tasks for a while.
> So right now what makes Vault a weird case is that it's under the Business Source License 1.1 (https://github.com/hashicorp/vault/blob/main/LICENSE#L20), so it might not suit production use and is a bit strange to add to Atmosphere.
IANAL, but I think that, to violate the license requirements, Atmosphere would have to ship with Vault, and whoever is deploying Atmosphere would then have to use it to provide Vault services to customers and compete with things like HashiCorp Cloud's Vault. In the worst case, it would be up to operators to decide what they are going to do with it.
We may want to consider if integrating OpenBao instead would be okay.
@fitbeard Thanks for your progress here. I think at this point we have a few components that can be managed directly by Vault secrets:
Now, I think deploying Vault and enabling those secret engines is the easy part. The difficult part is actually relying on those values inside Atmosphere. OpenStack services largely don't support values being reloaded at runtime (or I don't think they'll do it very gracefully).
This leaves us with two choices:
With #2, we'd have to add a lot more resiliency, but the win we would get out of the box is that, if we are confident in 'killing' pods, we can start using autoscaling more effectively for stateless services (API services, etc.).
I am worried about stateful ones like L3 agents: those can take a long time to spin back up if there are a lot of routers, and that can cause an actual interruption...
Curious about thoughts here?
@mnaser I will start testing granularly, with only one OS service. Let it be Glance. First I'm planning to configure the bootstrap/dbsync/init stuff (which uses secrets only at the moment of execution) and fill dynamic data like RABBITMQ_CONNECTION or DB_CONNECTION with values from Vault using overrides: ${vault:rabbitmq-glance/creds/role#username}:${vault:rabbitmq-glance/creds/role#password} using https://bank-vaults.dev/docs/mutating-webhook/configuration/
About service reloading, I still don't have an answer, but I'm hoping that this can somehow be used: https://bank-vaults.dev/docs/mutating-webhook/vault-agent-templating/#use-vault-ttls It goes without saying that it needs improvement on the chart side.
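To make the inline `${vault:PATH#KEY}` override syntax above concrete, here is a minimal Python sketch of how such references could be expanded inside a connection string. It is only an illustration, not the bank-vaults webhook's implementation: `VAULT_REF`, `render`, and the `read_secret` callable are hypothetical names, and the RabbitMQ host and port in the usage example are made up. The sketch reads each path once, because a dynamic `creds` endpoint typically issues a new username/password pair per read, so both fields should come from the same response. Whether the webhook behaves this way is worth verifying.

```python
import re

# Matches inline references of the form ${vault:PATH#KEY}.
VAULT_REF = re.compile(r"\$\{vault:(?P<path>[^#}]+)#(?P<key>[^}#]+)\}")


def render(value, read_secret):
    """Expand ${vault:PATH#KEY} references in `value`.

    `read_secret` is a hypothetical callable: path -> dict of secret fields.
    Each path is read once so that fields from one dynamic credential
    (e.g. username and password) stay consistent with each other.
    """
    cache = {}

    def replace(match):
        path = match["path"]
        if path not in cache:
            cache[path] = read_secret(path)
        return str(cache[path][match["key"]])

    return VAULT_REF.sub(replace, value)


if __name__ == "__main__":
    # Stand-in for a Vault read; a real deployment would call Vault here.
    def fake_read(path):
        return {"username": "glance-abc", "password": "s3cret"}

    template = (
        "rabbit://${vault:rabbitmq-glance/creds/role#username}:"
        "${vault:rabbitmq-glance/creds/role#password}@rabbitmq:5672/glance"
    )
    print(render(template, fake_read))
```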
And here is a Vault agent template (with Ansible Jinja compat), compatible with the oslo.config environment driver for "sourcing":
{% raw %}{{- with secret "rabbitmq/creds/admin" -}}
{% endraw %}
OS_DEFAULT__TRANSPORT_URL="rabbit://{% raw %}{{ .Data.username }}:{{ .Data.password }}{% endraw %}@{{
rabbit_hosts | join(":5671," + "{{ .Data.username }}:{{ .Data.password }}" + "@") }}:5671"
{% raw %}{{ end }}
{% endraw %}
{% raw %}{{- with secret "openstack-test/creds/admin" -}}
{% endraw %}
OS_KEYSTONE_AUTHTOKEN__APPLICATION_CREDENTIAL_ID={% raw %}"{{ .Data.application_credential_id }}"
{% endraw %}
OS_KEYSTONE_AUTHTOKEN__APPLICATION_CREDENTIAL_SECRET={% raw %}"{{ .Data.application_credential_secret }}"
{% endraw %}
{% raw %}{{ end }}
{% endraw %}
{% raw %}{{- with secret "db/creds/admin" -}}
{% endraw %}
OS_DATABASE__CONNECTION="mysql+pymysql://{% raw %}{{ .Data.username }}:{{ .Data.password }}{% endraw %}@{{
db_host }}/glance?charset=utf8&ssl_ca=/etc/ssl/certs/{{ xxx }}/ca.pem"
{% raw %}{{- end -}}
{% endraw %}
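The `join` expression in the transport URL line is easy to misread, so here is a small Python sketch of the string it produces once both Jinja and the Vault agent have rendered. The function name, host names, and credentials are placeholders for illustration; only the joining logic mirrors the template.

```python
def transport_url(hosts, username, password, port=5671):
    """Build a multi-host RabbitMQ URL the way the template's join does."""
    creds = f"{username}:{password}"
    # Jinja equivalent: rabbit_hosts | join(":5671," + creds + "@")
    separator = f":{port},{creds}@"
    return f"rabbit://{creds}@{separator.join(hosts)}:{port}"


if __name__ == "__main__":
    print(transport_url(["rabbitmq-0", "rabbitmq-1"], "user", "pass"))
    # prints rabbit://user:pass@rabbitmq-0:5671,user:pass@rabbitmq-1:5671
```

The separator carries the credentials and port between hosts, which is why every host in the result ends up with its own `user:pass@` prefix and `:5671` suffix.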
Right now oslo.config has a driver (enabled by default) that allows reading configuration from environment variables: https://opendev.org/openstack/oslo.config/commit/ea8a0f6a8b260474151fb27c2adc9dcc88774850 https://specs.openstack.org/openstack/oslo-specs/specs/rocky/config-from-environment.html https://docs.openstack.org/oslo.config/latest/reference/drivers.html The idea is to use a HashiCorp Vault agent (sidecar) for all containers, plus https://github.com/vexxhost/vault-plugin-secrets-openstack and other Vault secret engines, to rotate all secrets via Vault.
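As a rough illustration of the naming convention the environment driver relies on (`OS_<GROUP>__<OPTION>`, upper-cased, as in `OS_DEFAULT__TRANSPORT_URL`), here is a standalone Python sketch. It does not import oslo.config; `env_name` and `lookup` are hypothetical helpers that only mimic the variable naming, not the driver's actual code.

```python
import os


def env_name(group, option):
    """Return the environment variable name for a config group/option pair."""
    return f"OS_{group.upper()}__{option.upper()}"


def lookup(group, option, environ=os.environ):
    """Return the value from the environment, or None if it is not set."""
    return environ.get(env_name(group, option))


if __name__ == "__main__":
    print(env_name("DEFAULT", "transport_url"))
    print(env_name("keystone_authtoken", "application_credential_secret"))
    print(env_name("database", "connection"))
```

These are the same variable names the Vault agent template writes out, which is why its output can be sourced into the service's environment before the process starts.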