pires / docker-elasticsearch

Dockerfile for a base Elasticsearch image to be extended by others (allow to install plug-ins, change configuration, etc.)
Apache License 2.0

run.sh: retry downloading plugins upon failure #55

Quentin-M closed 6 years ago

Quentin-M commented 6 years ago

In certain conditions, the network may not be entirely ready when the container starts. This prevents the plugins from being downloaded, which is undesirable because liveness probes won't detect that ES is running without all the features that were expected to be present (e.g. Prometheus metrics exporting).

Instead, this commit proposes retrying upon installation failures. If the retries keep failing in Kubernetes, the pod will eventually be killed and restarted by the liveness probes once the initialDelaySeconds delay has expired.

The following happens pretty frequently on k8s 1.9+ w/ Weave CNI & CoreDNS. This makes Prometheus and the Alertmanager unhappy, as they consider the Elasticsearch cluster to be down.

-> Downloading https://distfiles.compuscene.net/elasticsearch/elasticsearch-prometheus-exporter-6.1.3.0.zip
Exception in thread "main" java.net.UnknownHostException: distfiles.compuscene.net
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at sun.net.www.protocol.http.HttpURLConnection$10.run(HttpURLConnection.java:1944)
        at sun.net.www.protocol.http.HttpURLConnection$10.run(HttpURLConnection.java:1939)
        at java.security.AccessController.doPrivileged(Native Method)
        at sun.net.www.protocol.http.HttpURLConnection.getChainedException(HttpURLConnection.java:1938)
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1508)
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1492)
        at sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(HttpsURLConnectionImpl.java:263)
        at org.elasticsearch.plugins.InstallPluginCommand.downloadZip(InstallPluginCommand.java:326)
        at org.elasticsearch.plugins.InstallPluginCommand.download(InstallPluginCommand.java:245)
        at org.elasticsearch.plugins.InstallPluginCommand.execute(InstallPluginCommand.java:213)
        at org.elasticsearch.plugins.InstallPluginCommand.execute(InstallPluginCommand.java:204)
        at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:86)
        at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:124)
        at org.elasticsearch.cli.MultiCommand.execute(MultiCommand.java:75)
        at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:124)
        at org.elasticsearch.cli.Command.main(Command.java:90)
        at org.elasticsearch.plugins.PluginCli.main(PluginCli.java:48)
Caused by: java.net.UnknownHostException: distfiles.compuscene.net
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:184)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:589)
        at sun.security.ssl.SSLSocketImpl.connect(SSLSocketImpl.java:673)
        at sun.security.ssl.BaseSSLSocketImpl.connect(BaseSSLSocketImpl.java:173)
        at sun.net.NetworkClient.doConnect(NetworkClient.java:180)
        at sun.net.www.http.HttpClient.openServer(HttpClient.java:463)
        at sun.net.www.http.HttpClient.openServer(HttpClient.java:558)
        at sun.net.www.protocol.https.HttpsClient.<init>(HttpsClient.java:264)
        at sun.net.www.protocol.https.HttpsClient.New(HttpsClient.java:367)
        at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.getNewHttpClient(AbstractDelegateHttpsURLConnection.java:191)
        at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1156)
        at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1050)
        at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:177)
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1564)
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1492)
        at sun.net.www.protocol.http.HttpURLConnection.getHeaderField(HttpURLConnection.java:3000)
        at java.net.URLConnection.getHeaderFieldLong(URLConnection.java:629)
        at java.net.URLConnection.getContentLengthLong(URLConnection.java:501)
        at java.net.URLConnection.getContentLength(URLConnection.java:485)
        at sun.net.www.protocol.https.HttpsURLConnectionImpl.getContentLength(HttpsURLConnectionImpl.java:407)
        at org.elasticsearch.plugins.InstallPluginCommand.downloadZip(InstallPluginCommand.java:325)
        ... 9 more
Quentin-M commented 6 years ago

I'll be happy to make that a variable with a default. However, note that until will retry until the command actually succeeds, so at this point it's mostly only about how many times the echo occurs.
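
For readers unfamiliar with the construct, here is a minimal sketch of the until-based retry pattern being discussed. It is not the actual run.sh change: the helper name retry_until_success, its delay argument, and the message text are hypothetical and only illustrate how the loop keeps re-running a command until it exits with status 0.

```shell
#!/bin/sh
# Hypothetical helper, not the code from this PR.
# Re-runs a command until it exits with status 0.
#   $1   delay in seconds between attempts (assumed parameter)
#   $2.. the command to run, with its arguments
retry_until_success() {
  delay="$1"
  shift
  # "until" repeats the loop body for as long as the command fails,
  # so the echo below is printed once per failed attempt.
  until "$@"; do
    echo "Command failed, retrying in ${delay}s: $*" >&2
    sleep "$delay"
  done
}

# Assumed usage inside an entrypoint script (URL taken from the log above):
#   retry_until_success 5 bin/elasticsearch-plugin install --batch \
#     https://distfiles.compuscene.net/elasticsearch/elasticsearch-prometheus-exporter-6.1.3.0.zip
```

Because the loop has no upper bound on attempts, it relies on something external, such as the liveness probe mentioned earlier, to restart the container if the download never succeeds.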

pires commented 6 years ago

Never mind, I overlooked the until.

pires commented 6 years ago

Thank you so much!