Open — rdeb22 opened this issue 4 years ago
Hi @rdeb22,
We don't include a Consul agent with the Vault deployment. Instead, consul-helm deploys the Consul agent as a DaemonSet so it runs on every worker node. Vault can access the Consul agent running on the node via localhost (127.0.0.1).
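A minimal consul-helm values sketch for that client-only mode (assumptions: an existing Consul server reachable at the address under client.join; the hostname shown is hypothetical):
global:
  enabled: false
client:
  enabled: true
  # Deployed as a DaemonSet: one Consul agent per worker node
  join:
    - 'consul-server.example.internal'  # hypothetical; point at your Consul servers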
@jasonodonnell Thanks for the information.
I did what you asked and installed consul-helm in client mode, but I'm still seeing these connection errors.
ui = true
listener "tcp" {
  tls_disable = 1
  address = "127.0.0.1:8200"
}
storage "consul" {
  path = "vault/"
  address = "127.0.0.1"
}
Getting Connection Refused
[root@qa4-ops2-k8s-master-20191021071333-1-1b 10.11.92.39:~] k logs vault-helm-0
WARNING! Unable to read storage migration status.
2020-04-29T19:42:25.923Z [INFO] proxy environment: http_proxy= https_proxy= no_proxy=
2020-04-29T19:42:25.923Z [WARN] storage migration check error: error="Get http://127.0.0.1/v1/kv/vault/core/migration: dial tcp 127.0.0.1:80: connect: connection refused"
Hi @rdeb22,
Your config needs a little tweaking (I misled you in my last comment; it needs to specify Consul's port):
storage "consul" {
path = "vault/"
address = "127.0.0.1:8500"
}
Actually, this might need to be the host's IP address. Vault Helm will do this for you automatically if you configure it like this:
storage "consul" {
path = "vault/"
address = "HOST_IP:8500"
}
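For reference, a sketch of where this stanza sits in the chart values (standard vault-helm layout assumed); the chart's startup script rewrites the literal HOST_IP token with the node's IP from the downward-API env var before starting Vault, as the ps output further down in this thread shows:
server:
  ha:
    enabled: true
    config: |
      storage "consul" {
        path = "vault/"
        # The literal HOST_IP token is substituted at container start with the node IP
        address = "HOST_IP:8500"
      }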
@jasonodonnell Thank you again, and no problem, but sadly this also did not work; see below.
I made the address = "HOST_IP:8500" and deployed.
Now:
k get pods | grep vaul
vault-helm-0 0/1 Running 0 87s
vault-helm-1 0/1 Running 0 87s
vault-helm-2 0/1 Running 0 87s
The logs now show "Unexpected response code: 500".
k logs vault-helm-0
2020-04-30T11:02:27.887Z [INFO] proxy environment: http_proxy= https_proxy= no_proxy=
2020-04-30T11:02:27.888Z [WARN] storage migration check error: error="Unexpected response code: 500"
WARNING! Unable to read storage migration status.
2020-04-30T11:02:29.889Z [WARN] storage migration check error: error="Unexpected response code: 500"
Describing the pod, I see:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 2m29s default-scheduler Successfully assigned spr-ops/vault-helm-0 to k8s-openebs-node-202003110542-2-1b
Normal Pulled 2m28s kubelet, k8s-openebs-node-202003110542-2-1b Container image "vault:1.3.2" already present on machine
Normal Created 2m28s kubelet, k8s-openebs-node-202003110542-2-1b Created container
Normal Started 2m28s kubelet, k8s-openebs-node-202003110542-2-1b Started container
Warning Unhealthy 78s (x22 over 2m21s) kubelet, k8s-openebs-node-202003110542-2-1b Readiness probe failed: Error checking seal status: Get http://127.0.0.1:8200/v1/sys/seal-status: dial tcp 127.0.0.1:8200: connect: connection refused
But,
k exec -it vault-helm-0 sh
/ $ export VAULT_ADDR=http://127.0.0.1:8200
/ $ vault status
Error checking seal status: Get http://127.0.0.1:8200/v1/sys/seal-status: dial tcp 127.0.0.1:8200: connect: connection refused
/ $ ps -ef | grep consul
5790 vault 0:00 grep consul
/ $ ps -ef | grep vault
1 vault 0:00 /bin/sh -ec sed -E "s/HOST_IP/${HOST_IP?}/g" /vault/config/extraconfig-from-values.hcl > /tmp/storageconfig.hcl; sed -Ei "s/POD_IP/${POD_IP?}/g" /tmp/storageconfig.hcl; /usr/local/bin/docker-entrypoint.sh vault server -config=/tmp/storageconfig.hcl
9 vault 0:00 {docker-entrypoi} /usr/bin/dumb-init /bin/sh /usr/local/bin/docker-entrypoint.sh vault server -config=/tmp/storageconfig.hcl
10 vault 0:00 vault server -config=/tmp/storageconfig.hcl
4158 vault 0:00 sh
5818 vault 0:00 ps -ef
5819 vault 0:00 sh
/ $ cat /tmp/storageconfig.hcl
disable_mlock = true
ui = true
listener "tcp" {
  tls_disable = 1
  address = "127.0.0.1:8200"
}
storage "consul" {
  path = "vault/"
  address = "10.11.04.18:8500"
}
#service_registration "kubernetes" {}
# Example configuration for using auto-unseal, using Google Cloud KMS. The
# GKMS keys must already exist, and the cluster must have a service account
# that is authorized to access GCP KMS.
#seal "gcpckms" {
# project = "vault-helm-dev-246514"
# region = "global"
# key_ring = "vault-helm-unseal-kr"
# crypto_key = "vault-helm-unseal-key"
#}
/ $
Please help. Thanks
Hi @rdeb22, can you make the following changes to your config?
listener "tcp" {
tls_disable = 1
address = "[::]:8200"
cluster_address = "[::]:8201"
}
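For context: in HA mode the other Vault pods must reach this node's API and cluster ports over the pod network, which a 127.0.0.1 bind prevents. A sketch of where this lands in the chart values (standard vault-helm layout assumed):
server:
  ha:
    enabled: true
    config: |
      listener "tcp" {
        tls_disable = 1
        # [::] binds all interfaces, so peers and probes aren't limited to localhost
        address = "[::]:8200"
        cluster_address = "[::]:8201"
      }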
@jasonodonnell Sure, I did, but it's the same thing.
/ $ export VAULT_ADDR=http://127.0.0.1:8200
/ $ vault status
Error checking seal status: Get http://127.0.0.1:8200/v1/sys/seal-status: dial tcp 127.0.0.1:8200: connect: connection refused
/ $ cat /tmp/storageconfig.hcl
disable_mlock = true
ui = true
listener "tcp" {
  tls_disable = 1
  address = "[::]:8200"
  cluster_address = "[::]:8201"
}
storage "consul" {
  path = "vault/"
  address = "10.11.12.42:8500"
}
#service_registration "kubernetes" {}
# Example configuration for using auto-unseal, using Google Cloud KMS. The
# GKMS keys must already exist, and the cluster must have a service account
# that is authorized to access GCP KMS.
#seal "gcpckms" {
# project = "vault-helm-dev-246514"
# region = "global"
# key_ring = "vault-helm-unseal-kr"
# crypto_key = "vault-helm-unseal-key"
#}
Logs
k logs vault-helm-0
WARNING! Unable to read storage migration status.
2020-04-30T14:10:24.578Z [INFO] proxy environment: http_proxy= https_proxy= no_proxy=
2020-04-30T14:10:24.580Z [WARN] storage migration check error: error="Unexpected response code: 500"
Pods
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 5m31s default-scheduler Successfully assigned spr-ops/vault-helm-0 to k8s-openebs-node-202003110536-1-1b
Normal Pulled 5m30s kubelet, k8s-openebs-node-202003110536-1-1b Container image "vault:1.3.2" already present on machine
Normal Created 5m30s kubelet, k8s-openebs-node-202003110536-1-1b Created container
Normal Started 5m30s kubelet, k8s-openebs-node-202003110536-1-1b Started container
Warning Unhealthy 28s (x100 over 5m25s) kubelet, k8s-openebs-node-202003110536-1-1b Readiness probe failed: Error checking seal status: Get http://127.0.0.1:8200/v1/sys/seal-status: dial tcp 127.0.0.1:8200: connect: connection refused
I'm getting the same connection refused error as soon as I add the Consul backend to my Vault Helm deployment. I'll keep watch for a resolution. In my case this was working, along with the seal stanza using a transit key, but both of these configs have broken in the last week. I changed the CNI provider to Calico, but I'm not sure this is related...
I'm getting the same connection refused error as well :/
My fix was network related: I switched back to flannel and used host-gw as the network type. I tested the issue by switching to a standalone config to rule out any other config issues. This was all in a PoC env, so retaining data was not a concern.
I used the same chart version and just changed the Vault Docker image from 1.4.0 to 1.3.1 (I only tested 1.3.1, not sure about other 1.3.x versions); it looks like that solved the issue.
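If you want to try pinning the image the same way, the tag can be set in the chart values (a sketch using the standard server.image keys of vault-helm):
server:
  image:
    repository: "vault"
    tag: "1.3.1"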
Hi, I encountered this connection error as well when trying to install Vault in HA mode with Consul as the backend. I managed to get the Consul server running, but it looks like Vault is having trouble connecting to it. These are the versions I used for Consul and Vault: chart consul-0.21.0 (app version 1.7.3) and chart vault-0.6.0 (app version 1.4.2).
The guide I used for installing: https://www.vaultproject.io/docs/platform/k8s/helm/run
vault-0 logs
WARNING! Unable to read storage migration status.
2020-06-14T21:50:27.745Z [INFO] proxy environment: http_proxy= https_proxy= no_proxy=
2020-06-14T21:50:27.746Z [WARN] storage migration check error: error="Get http://192.168.1.210:8500/v1/kv/vault/core/migration: dial tcp 192.168.1.210:8500: connect: connection refused"
vault-helm-values.yml
server:
  affinity: ""
  ha:
    enabled: true
    config: |
      ui = true
      listener "tcp" {
        tls_disable = 1
        address = "[::]:8200"
        cluster_address = "[::]:8201"
      }
      storage "consul" {
        path = "vault/"
        address = "HOST_IP:8500"
      }
Any advice for getting through this problem? Thanks in advance.
Hi @jasonodonnell
Now Vault seems to be running and unsealed, but when I try to log in I see this error:
local node not active but active cluster node not found
I got the same issue.
$ oc exec -it vault-0 vault status
Error checking seal status: Get "http://127.0.0.1:8200/v1/sys/seal-status": dial tcp 127.0.0.1:8200: connect: connection refused
command terminated with exit code 1
When I look into the pod logs I can see:
2020-11-04T12:23:17.945Z [WARN] storage migration check error: error="Get "http://10.160.225.18:8500/v1/kv/vault/core/migration": dial tcp 10.160.225.18:8500: connect: connection refused"
What I understood is that 10.160.225.18 (HOST_IP) is my worker node where the Consul server pod is running; Vault is not connecting to the Consul server at HOST_IP on port 8500. Below is my values.yaml:
storage "consul" {
  path = "vault/"
  address = "HOST_IP:8500"
}
The workaround I did was to change HOST_IP:8500 to my Consul service. Since it is a headless service, no service IP is generated, so I used the Consul service name instead; in my case the service name is "consul-server":
balaji@DESKTOP-O8C6N39:~/vault$ oc get svc
NAME            TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)                                                          AGE
consul-dns      ClusterIP   172.21.33.6   <none>        53/TCP,53/UDP                                                    2d
consul-server   ClusterIP   None          <none>        8500/TCP,8301/TCP,8301/UDP,8302/TCP,8302/UDP,8300/TCP,8600/TCP
values.yaml
storage "consul" {
  path = "vault"
  address = "consul-server:8500"
}
Then Vault was deployed and working fine. So finally, my Vault HA with the Consul storage backend is working perfectly.
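Putting that workaround together with the listener settings suggested earlier in the thread, a working HA values sketch (not necessarily the commenter's exact file) would look roughly like:
server:
  ha:
    enabled: true
    config: |
      ui = true
      listener "tcp" {
        tls_disable = 1
        address = "[::]:8200"
        cluster_address = "[::]:8201"
      }
      storage "consul" {
        path = "vault"
        # Headless service: use the Consul server's DNS name rather than HOST_IP
        address = "consul-server:8500"
      }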
Hi @jasonodonnell, I am using the Consul Azure managed app server and I have installed the Consul agent on AKS.
kubectl get svc -n consul
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
consul-connect-injector-svc ClusterIP 10.0.252.97 <none> 443/TCP 3d13h
consul-controller-webhook ClusterIP 10.0.169.80 <none> 443/TCP 3d13h
As I am using the Consul agent, I do not see a consul-server running. Helm chart config for Consul:
global:
  enabled: false
  name: consul
  datacenter: dc1
  acls:
    manageSystemACLs: true
    bootstrapToken:
      secretName: XXX-sandbox-managed-app-bootstrap-token
      secretKey: token
  gossipEncryption:
    secretName: XXX-sandbox-managed-app-hcs
    secretKey: gossipEncryptionKey
  tls:
    enabled: true
    enableAutoEncrypt: true
    caCert:
      secretName: XXX-sandbox-managed-app-hcs
      secretKey: caCert
externalServers:
  enabled: true
  hosts: ['XXX.az.hashicorp.cloud']
  httpsPort: 443
  useSystemRoots: true
  k8sAuthMethodHost: https://XXX.uksouth.azmk8s.io:443
client:
  enabled: true
  # If you are using Kubenet in your AKS cluster (the default network),
  # uncomment the line below.
  # exposeGossipPorts: true
  join: ['XXX.az.hashicorp.cloud']
connectInject:
  enabled: true
controller:
  enabled: true
Helm chart config for Vault:
ui:
  enabled: true
  serviceType: LoadBalancer
server:
  ingress:
    enabled: true
    extraPaths:
      - path: /
        backend:
          serviceName: vault-ui
          servicePort: 8200
    hosts:
      - host: something.com
  ha:
    enabled: true
    config: |
      ui = true
      listener "tcp" {
        tls_disable = 1
        address = "[::]:8200"
        cluster_address = "[::]:8201"
      }
      storage "consul" {
        path = "vault/"
        scheme = "https"
        address = "HOST_IP:8500"
      }
Error in the Vault pod, which is unable to connect to the Consul agent:
kubectl logs vault-0 -n vault
WARNING! Unable to read storage migration status.
2021-06-28T08:13:13.041Z [INFO] proxy environment: http_proxy="" https_proxy="" no_proxy=""
2021-06-28T08:13:13.042Z [WARN] storage migration check error: error="Get "https://127.0.0.1:8500/v1/kv/vault/core/migration": dial tcp 127.0.0.1:8500: connect: connection refused"
I am not sure if some configuration is missing in the Consul Helm chart, as I do not see any service running on port 8500 in the consul namespace.
Any suggestion would be much appreciated.
Thanks, Pooja
I was able to get this working by deploying the Consul server as a deployment of its own. I didn't enable consul-client because I don't want a DaemonSet deployed. This uses agent tokens and Consul policies (which might not be required depending on how you deploy).
Instead, in the Vault chart, I set these in the appropriate place:
extraInitContainers:
  - name: consul-config-writer
    image: "alpine"
    command: [sh, -c]
    resources:
      requests:
        memory: 256Mi
        cpu: 250m
      limits:
        memory: 256Mi
        cpu: 250m
    args:
      - 'cd /consul-config && echo "{ \"primary_datacenter\": \"dc1\", \"acl\" : { \"enabled\": true, \"default_policy\": \"allow\", \"down_policy\": \"extend-cache\", \"enable_token_persistence\": true, \"tokens\": { \"default\": \"<agent_token>\" } } }" | tee agent.json && ls -l /consul-config'
    volumeMounts:
      - name: consul-config
        mountPath: /consul-config
# extraContainers is a list of sidecar containers. Specified as a YAML list.
extraContainers:
  - name: consul
    image: hashicorp/consul:1.10.0
    volumeMounts:
      - name: consul-config
        mountPath: /consul-config
    args:
      - /bin/consul
      - agent
      - -join
      - consul-consul-server
      - -data-dir=/tmp/consul
      - -encrypt
      - <gossip_key>
      - -config-file=/consul-config/agent.json
storage "consul" {
path = "vault"
address = "127.0.0.1:8500"
}
My vault and consul pods are in the same namespace. If you're using different namespaces, then you'll probably need to provide the full service FQDN for your consul agent to join.
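One note on the snippets above: they mount a volume named consul-config that the excerpt never defines. Assuming a chart version that supports raw server.volumes, it could be declared like this (a sketch):
volumes:
  - name: consul-config
    # Shared scratch space between the init container and the consul sidecar
    emptyDir: {}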
Vault should have the Consul client running, but Consul is not running.
I have deployed Vault on k8s in HA mode and want to use Consul as storage. The pods are not running.
Logs
Describing the Pod
Also, I am receiving an error even while setting the environment variable. Why?