jetstack / navigator

Managed Database-as-a-Service (DBaaS) on Kubernetes
Apache License 2.0

Intermittent E2E test failure: Elasticsearch pilot did not update the document count #314

Closed: wallrj closed this issue 6 years ago

wallrj commented 6 years ago

In #277 I keep getting Elasticsearch E2E test failures:

ERROR: [1] bootstrap checks failed
[1]: max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
[2018-04-05T09:53:11,796][INFO ][o.e.n.Node               ] [es-test-mixed-0] stopping ...

Maybe the daemonset introduced in #287 hasn't yet run?
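One way to confirm that theory would be to check the node setting before the Elasticsearch pods start. A minimal sketch, assuming a Linux node where sysctl is available; the helper name max_map_count_ok is hypothetical and is not part of the navigator repo:

```shell
#!/usr/bin/env bash
# Hypothetical helper, not from the navigator repo.
# 262144 is the minimum quoted in the Elasticsearch bootstrap check error above.
REQUIRED_MAX_MAP_COUNT=262144

# Returns 0 if the given value meets the required minimum, 1 otherwise.
max_map_count_ok() {
    local current="${1}"
    [[ "${current}" -ge "${REQUIRED_MAX_MAP_COUNT}" ]]
}

# Example use on a node (assumes Linux with sysctl on the PATH):
#   max_map_count_ok "$(sysctl -n vm.max_map_count)" \
#       || echo "vm.max_map_count is still too low" >&2
```

If the check fails on a node where the pod was scheduled, the daemonset has probably not applied the setting there yet.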

/kind bug

wallrj commented 6 years ago

Ah, looks like things are getting stuck during prepare-e2e.sh:

I0405 09:35:52.306] Waiting for tiller to be ready...
W0405 09:35:52.407] + echo 'Waiting for tiller to be ready...'
W0405 09:35:52.407] + retry TIMEOUT=60 helm version
W0405 09:35:52.407] + local TIMEOUT=60
W0405 09:35:52.407] + local SLEEP=10
W0405 09:35:52.407] + :
W0405 09:35:52.407] + case "${1}" in
W0405 09:35:52.408] + local TIMEOUT=60
W0405 09:35:52.408] + shift
W0405 09:35:52.408] + :
W0405 09:35:52.408] + case "${1}" in
W0405 09:35:52.408] + break
W0405 09:35:52.408] + local start_time
W0405 09:35:52.408] ++ date +%s
W0405 09:35:52.408] + start_time=1522920952
W0405 09:35:52.408] + local end_time
W0405 09:35:52.408] + end_time=1522921012
W0405 09:35:52.408] + helm version
I0405 09:35:52.509] Client: &version.Version{SemVer:"v2.8.2", GitCommit:"a80231648a1473929271764b920a8e346f6de844", GitTreeState:"clean"}
W0405 09:40:52.483] Error: cannot connect to Tiller
W0405 09:40:52.486] + local exit_code=1
W0405 09:40:52.486] ++ date +%s
W0405 09:40:52.487] + local current_time=1522921252
W0405 09:40:52.487] + local remaining_time=-240
W0405 09:40:52.487] + [[ -240 -le 0 ]]
W0405 09:40:52.487] + return 1
W0405 09:40:52.490] + exec

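helm version takes about 5 min to return (09:35:52 to 09:40:52 in the log above), but the timeout for this step is only 60s. The retry helper only checks the deadline after the command returns, so by then remaining_time is already -240.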

We can increase that timeout and/or add a time limit to the helm commands.
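A rough sketch of the second option, assuming GNU coreutils timeout is available in the test image. The function name retry_with_deadline and its positional arguments are hypothetical; this is not the retry helper from prepare-e2e.sh, only a variant that caps each attempt at the time left before the overall deadline:

```shell
#!/usr/bin/env bash
# Hypothetical variant of the retry helper seen in the trace above.
# Usage: retry_with_deadline <deadline_seconds> <sleep_seconds> <command> [args...]
# Each attempt is wrapped in coreutils `timeout`, so a hung command cannot
# outlive the overall deadline. Works for external commands, not shell functions.
retry_with_deadline() {
    local deadline_seconds="${1}"
    local sleep_seconds="${2}"
    shift 2
    local end_time=$(( $(date +%s) + deadline_seconds ))
    local remaining_time
    while true; do
        remaining_time=$(( end_time - $(date +%s) ))
        if [[ "${remaining_time}" -le 0 ]]; then
            return 1
        fi
        # Kill the attempt once the remaining time has elapsed.
        if timeout "${remaining_time}" "$@"; then
            return 0
        fi
        # Give up if there is no time left for a sleep plus another attempt.
        remaining_time=$(( end_time - $(date +%s) ))
        if [[ "${remaining_time}" -le "${sleep_seconds}" ]]; then
            return 1
        fi
        sleep "${sleep_seconds}"
    done
}
```

With this shape, something like retry_with_deadline 60 10 helm version would give up after roughly 60s even if a single helm call hangs.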

We also need to fix make e2e-test so that it exits early if the prepare-e2e.sh script fails.
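How make e2e-test invokes the two steps isn't shown here, so the following is only a sketch of the intended behaviour. The wrapper name run_e2e and its two arguments are hypothetical; the point is that the test command never runs when the prepare command returns non-zero:

```shell
#!/usr/bin/env bash
# Hypothetical wrapper, not from the navigator repo.
# Usage: run_e2e <prepare_command> <test_command>
# Runs the prepare step first and skips the tests if it fails.
run_e2e() {
    local prepare_cmd="${1}"
    local test_cmd="${2}"
    if ! "${prepare_cmd}"; then
        echo "prepare step failed; not running tests" >&2
        return 1
    fi
    "${test_cmd}"
}
```

The same effect could probably be had in the Makefile itself by chaining the two commands with && in a single recipe line.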