Closed: jchauncey closed this 7 years ago
@rimusz is a potential reviewer of this pull request based on my analysis of git blame
information. Thanks @jchauncey!
With respect to **Test InfluxDB deployment resource is deleted after upgrade/install**, all appears well (expectations met) with the following caveat: the `helm upgrade` command technically errors out and exits non-zero:

```
$ helm upgrade deis-workflow workflow-pr/workflow --version v2.12.1-20170330220532-sha.77e675e --set global.influxdb_location=off-cluster
Release "deis-workflow" has been upgraded. Happy Helming!
Error: deployments.extensions "deis-workflow-influxdb" not found
$ echo $?
1
```
Possible to avoid erroring out in this scenario?
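For what it's worth, if the cleanup step shells out to `kubectl`, the missing-deployment case can be made a no-op with `--ignore-not-found`; a sketch only, with the resource name taken from the error above and the `deis` namespace assumed:

```
# --ignore-not-found treats "resource not found" as a successful
# delete (exit 0), so a repeated upgrade would not error out.
kubectl delete deployment deis-workflow-influxdb \
  --namespace deis --ignore-not-found
```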
W/r/t **Test InfluxDB persistence Migration**:

I tested at the level of a full Workflow install. Although I am meeting the expectations listed in the description (new volume locations for storing influxdb data), after the Workflow upgrade my grafana instance cannot locate the migrated data (dashboards empty; grafana pod logs show `2017/03/30 22:38:07 http: proxy error: dial tcp 10.131.246.29:80: i/o timeout`).

Here were my steps: https://gist.github.com/vdice/d0325647ad4136feb76fa8e9e0a0725e
W/r/t **Test existing installation with no persistence and upgrade to influxdb with persistence**:

When attempting to upgrade, the expectations are not met; namely, the `migrate-data` job/pod does not finish (see below) and the upgrade fails:

```
$ helm upgrade --wait deis-workflow workflow-pr/workflow --version v2.12.1-20170330220532-sha.77e675e -f values-only-influxdb-persistent.yaml
Error: UPGRADE FAILED: timed out waiting for the condition
$ kd describe po migrate-data-5zzmh-msld4
...
Events:
  FirstSeen  LastSeen  Count  From                 SubObjectPath  Type     Reason            Message
  ---------  --------  -----  ----                 -------------  ----     ------            -------
  2m         1s        11     {default-scheduler }                Warning  FailedScheduling  [SchedulerPredicates failed due to persistentvolumeclaims "deis-monitor-influxdb" not found, which is unexpected., SchedulerPredicates failed due to persistentvolumeclaims "deis-monitor-influxdb" not found, which is unexpected., SchedulerPredicates failed due to persistentvolumeclaims "deis-monitor-influxdb" not found, which is unexpected.]
```
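For reference, the `FailedScheduling` message means the `migrate-data` pod references a PersistentVolumeClaim that was never created in this upgrade path. The claim it expects would look roughly like the following sketch (the name is taken from the scheduler error; the size and access mode are assumptions):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: deis-monitor-influxdb   # name taken from the scheduler error above
  namespace: deis
spec:
  accessModes:
    - ReadWriteOnce             # assumed
  resources:
    requests:
      storage: 10Gi             # assumed size; not specified in the PR
```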
This PR is waiting on an upstream change before it can be tested.
With the k8s bug around binding persistent volumes on pods that have moved nodes, I think I will just close this PR for now.
closes #180
Manual Testing Steps:

**Notes**

If the `influxdb` pod gets stuck in a `ContainerCreating` state, it's normally because the pod has moved from the node that originally hosted the `pvc`. If you kill the pod it will be reassigned and should start normally. I am going to work on a hook that will automatically do this once I get the majority of this PR done.

**Prereqs**

- `logger-redis-cache` secret (you can get this from the workflow chart)
- `2.3.0` of tiller (`helm init --upgrade`)

**Test InfluxDB persistence Migration**
- Install master of `deis/monitor`: `helm upgrade deis-monitor . --install --namespace deis --set influxdb.persistence.enabled=true`
- `kubectl exec` into the running influxdb pod. You will notice that all the data is stored at `/data/*`.
- Install this PR of `deis/monitor`: run `helm dependency build` in `charts/monitor`, then `helm upgrade deis-monitor . --namespace deis --set influxdb.persistence.enabled=true`
- While the upgrade is taking place you can do a `kubectl get pods --watch --namespace=deis` and watch the `kill-pods` and `migrate-data` pods do their thing.
- `kubectl exec` into the new running influxdb pod. You should now see data in `/var/lib/influxdb/data/*`.
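If the new influxdb pod instead sticks in `ContainerCreating` (the moved-node/`pvc` issue described in the Notes), deleting it forces a reschedule; a sketch, assuming the chart labels the pod `app=deis-monitor-influxdb`:

```
# The deployment's replica set recreates the pod, ideally on the
# node where the persistent volume can be attached.
kubectl delete pod --namespace deis -l app=deis-monitor-influxdb
```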
**Test InfluxDB deployment resource is deleted after upgrade/install**

We can no longer just turn off parts of the chart. Instead we must delete the deployment resource after install/upgrade. The biggest problem here is that the influxdb chart programmatically builds its name, so we rely on the fact that most deployments should be `{{ .Release.Name }}-influxdb`.
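One plausible shape for that post-install/upgrade deletion is a Helm hook Job; this is a sketch, not the PR's actual template (the kubectl image and the absence of RBAC wiring are assumptions):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: {{ .Release.Name }}-delete-influxdb
  annotations:
    # Run after every install and upgrade of the release.
    "helm.sh/hook": post-install,post-upgrade
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: delete-influxdb
          image: lachlanevenson/k8s-kubectl   # assumed image providing kubectl
          args:
            - delete
            - deployment
            - "{{ .Release.Name }}-influxdb"
            - --ignore-not-found               # no-op if already deleted
```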
**Install**
This is to validate that a clean install of this chart will do the right thing with off-cluster influx.

`helm upgrade deis-monitor . --install --namespace deis --set global.influxdb_location=off-cluster,influxdb.url=http://some.other.url`
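The `config.toml` check in the next step can also be scripted; a sketch, where the label selector and config path are assumptions about the chart:

```
# Grab the first telegraf pod and confirm its rendered config
# points at the off-cluster influxdb URL.
POD=$(kubectl get pods --namespace deis -l app=deis-monitor-telegraf \
  -o jsonpath='{.items[0].metadata.name}')
kubectl exec --namespace deis "$POD" -- grep influxdb /etc/telegraf/config.toml
```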
`kubectl exec` into a running telegraf pod. Check that `config.toml` has your influxdb url set.

**Upgrade**
This is to validate that when upgrading from an existing install (no matter its configuration) it will do the correct thing. It should be noted that we will kill off the telegraf pods and let them restart so they can pick up the new configuration of off-cluster influx support, in case the user was not using that previously.

`helm upgrade deis-monitor . --namespace deis --set global.influxdb_location=off-cluster,influxdb.url=http://some.other.url`

`kubectl exec` into a running telegraf pod. Check that `config.toml` has your influxdb url set.

**Test No Persistence -> No Persistence upgrade**
- Install master of `deis/monitor`: `helm upgrade deis-monitor . --install --namespace deis`
- Install this PR of `deis/monitor`: run `helm dependency build` in `charts/monitor`, then `helm upgrade deis-monitor . --namespace deis`
**Test older tiller vs New tiller**

- Install a version of tiller older than 2.3.0
- `make build`
- `./bin/tiller`
- `helm upgrade deis-monitor . --namespace deis --host localhost:44134 --set influxdb.persistence.enabled=true`
- `kubectl get jobs`