elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch

Add support to configure Proxy to allow geoip databases update #77069

Open fludo opened 3 years ago

fludo commented 3 years ago

On startup, Elasticsearch attempts to update the geoip database (since 7.14).

The update fails when the node is behind a proxy. Apparently, there is no way to configure a proxy for accessing the remote URL.

The complete log is:

[2021-08-27T16:31:01,744][INFO ][o.e.i.g.GeoIpDownloader  ] [***] fetching geoip databases overview from [https://geoip.elastic.co/v1/database?elastic_geoip_service_tos=agree]
[2021-08-27T16:31:11,770][ERROR][o.e.i.g.GeoIpDownloader  ] [***] exception during geoip databases update
java.net.SocketTimeoutException: Connect timed out
    at sun.nio.ch.NioSocketImpl.timedFinishConnect(NioSocketImpl.java:546) ~[?:?]
    at sun.nio.ch.NioSocketImpl.connect(NioSocketImpl.java:597) ~[?:?]
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:333) ~[?:?]
    at java.net.Socket.connect(Socket.java:645) ~[?:?]
    at sun.security.ssl.SSLSocketImpl.connect(SSLSocketImpl.java:290) ~[?:?]
    at sun.net.NetworkClient.doConnect(NetworkClient.java:177) ~[?:?]
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:497) ~[?:?]
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:600) ~[?:?]
    at sun.net.www.protocol.https.HttpsClient.<init>(HttpsClient.java:265) ~[?:?]
    at sun.net.www.protocol.https.HttpsClient.New(HttpsClient.java:379) ~[?:?]
    at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.getNewHttpClient(AbstractDelegateHttpsURLConnection.java:189) ~[?:?]
    at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1232) ~[?:?]
    at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1120) ~[?:?]
    at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:175) ~[?:?]
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1653) ~[?:?]
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1577) ~[?:?]
    at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:527) ~[?:?]
    at sun.net.www.protocol.https.HttpsURLConnectionImpl.getResponseCode(HttpsURLConnectionImpl.java:308) ~[?:?]
    at org.elasticsearch.ingest.geoip.HttpClient.lambda$get$0(HttpClient.java:55) ~[ingest-geoip-7.14.0.jar:7.14.0]
    at java.security.AccessController.doPrivileged(AccessController.java:554) ~[?:?]
    at org.elasticsearch.ingest.geoip.HttpClient.doPrivileged(HttpClient.java:97) ~[ingest-geoip-7.14.0.jar:7.14.0]
    at org.elasticsearch.ingest.geoip.HttpClient.get(HttpClient.java:49) ~[ingest-geoip-7.14.0.jar:7.14.0]
    at org.elasticsearch.ingest.geoip.HttpClient.getBytes(HttpClient.java:40) ~[ingest-geoip-7.14.0.jar:7.14.0]
    at org.elasticsearch.ingest.geoip.GeoIpDownloader.fetchDatabasesOverview(GeoIpDownloader.java:117) ~[ingest-geoip-7.14.0.jar:7.14.0]
    at org.elasticsearch.ingest.geoip.GeoIpDownloader.updateDatabases(GeoIpDownloader.java:105) ~[ingest-geoip-7.14.0.jar:7.14.0]
    at org.elasticsearch.ingest.geoip.GeoIpDownloader.runDownloader(GeoIpDownloader.java:237) [ingest-geoip-7.14.0.jar:7.14.0]
    at org.elasticsearch.ingest.geoip.GeoIpDownloaderTaskExecutor.nodeOperation(GeoIpDownloaderTaskExecutor.java:89) [ingest-geoip-7.14.0.jar:7.14.0]
    at org.elasticsearch.ingest.geoip.GeoIpDownloaderTaskExecutor.nodeOperation(GeoIpDownloaderTaskExecutor.java:38) [ingest-geoip-7.14.0.jar:7.14.0]
    at org.elasticsearch.persistent.NodePersistentTasksExecutor$1.doRun(NodePersistentTasksExecutor.java:40) [elasticsearch-7.14.0.jar:7.14.0]
    at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:732) [elasticsearch-7.14.0.jar:7.14.0]
    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26) [elasticsearch-7.14.0.jar:7.14.0]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) [?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) [?:?]

This request may be related to 75026 if ES_JAVA_OPTS could be used.
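
For context on why ES_JAVA_OPTS could matter here: the stack trace shows the downloader going through the JDK's `HttpURLConnection`, which by default asks the JVM-wide `ProxySelector` where to connect. The following is a minimal sketch of that lookup, assuming the standard `https.proxyHost` / `https.proxyPort` system properties. The proxy address `10.0.0.1:3128` and the class name are placeholders, and whether Elasticsearch 7.14 honors these properties is not confirmed by this issue.

```java
import java.net.Proxy;
import java.net.ProxySelector;
import java.net.URI;
import java.util.List;

class ProxyLookupSketch {

    // Ask the JVM-wide default ProxySelector which proxies it would use for a URL.
    static List<Proxy> proxiesFor(String url) {
        return ProxySelector.getDefault().select(URI.create(url));
    }

    public static void main(String[] args) {
        // Same effect as passing -Dhttps.proxyHost=... -Dhttps.proxyPort=... to the JVM.
        // 10.0.0.1:3128 is a placeholder, not an address taken from this issue.
        System.setProperty("https.proxyHost", "10.0.0.1");
        System.setProperty("https.proxyPort", "3128");

        String url = "https://geoip.elastic.co/v1/database?elastic_geoip_service_tos=agree";
        for (Proxy proxy : proxiesFor(url)) {
            // Print the proxy type and address the JDK would connect through.
            System.out.println(proxy.type() + " " + proxy.address());
        }
    }
}
```

If the selector returns only a direct connection, the properties were not picked up by the JVM, which would match the connect timeout in the log above.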

elasticmachine commented 3 years ago

Pinging @elastic/es-data-management (Team:Data Management)

noliono commented 2 years ago

I successfully used a proxy without authentication to update the geoip database via ES_JAVA_OPTS, using either of these two methods:

or

So ES_JAVA_OPTS seems to work now.

However, I would also like to use a proxy with authentication, and the following ES_JAVA_OPTS did not work: "ES_JAVA_OPTS=-Dhttps.proxyUser=xxxx -Dhttps.proxyPassword=yyyyy -Dhttps.proxyHost=10.x.x.x -Dhttps.proxyPort=3128 -Djdk.http.auth.tunneling.disabledSchemes= -Djdk.https.auth.tunneling.disabledSchemes="

Has anyone succeeded with this, or does anyone have any advice?
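
One possible explanation, offered as an assumption rather than a confirmed diagnosis: as far as I know, the JDK's built-in `HttpURLConnection` does not read `https.proxyUser` or `https.proxyPassword`. Proxy credentials are normally supplied through a `java.net.Authenticator` instead. The sketch below shows that mechanism in plain Java. The class names are hypothetical, and installing such an authenticator inside Elasticsearch would require code running in the node's JVM, which this issue does not show to be possible.

```java
import java.net.Authenticator;
import java.net.PasswordAuthentication;

class ProxyAuthSketch {

    // Supplies credentials only when the challenge comes from a proxy.
    static class ProxyAuthenticator extends Authenticator {
        private final String user;
        private final char[] password;

        ProxyAuthenticator(String user, String password) {
            this.user = user;
            this.password = password.toCharArray();
        }

        @Override
        protected PasswordAuthentication getPasswordAuthentication() {
            if (getRequestorType() == RequestorType.PROXY) {
                return new PasswordAuthentication(user, password);
            }
            return null; // never hand proxy credentials to origin servers
        }
    }

    public static void main(String[] args) {
        // Assumption: reuse the property names from the comment above as the credential source.
        String user = System.getProperty("https.proxyUser", "xxxx");
        String pass = System.getProperty("https.proxyPassword", "yyyyy");
        Authenticator.setDefault(new ProxyAuthenticator(user, pass));
    }
}
```

Clearing `jdk.http.auth.tunneling.disabledSchemes`, as in the options above, would still be needed for Basic authentication over an HTTPS tunnel on recent JDKs.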

jakelandis commented 2 years ago

It is splitting hairs, but I have changed the issue from an enhancement to a bug, since the HTTP clients should be configurable via Elasticsearch settings. Those HTTP clients should have general support for HTTP proxies, custom trust stores, and a small handful of other common configurations.

Bernhard-Fluehmann commented 2 years ago

I am dealing with this problem as well and was surprised that this feature is missing. One thing that puzzled me is that proxy support already exists for Watcher. The strange thing about those parameters is that they are not named after Watcher. With a setting like xpack.http.proxy.host in elasticsearch.yml, one could therefore assume that it applies to Elasticsearch HTTP clients in general, such as the geoip database download, and not only to Watcher.

smalenfant commented 2 years ago

The documentation should say "reverse proxy" instead of "proxy" for the endpoint. We already have xpack.http.proxy.host set, and that should just work as-is.

smalenfant commented 2 years ago

I set up a reverse proxy and it doesn't work either. The problem is that the downloader fetches metadata containing endpoints on storage.googleapis.com. One would need to manipulate the metadata file to get this working.
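
To illustrate what "manipulating the metadata" could involve, here is a minimal sketch of rewriting a download URL so that it points at an internal mirror instead of storage.googleapis.com. The metadata format is not shown in this thread, so the sketch only handles a single URL string. The mirror host name, the example path, and the class name are all hypothetical.

```java
import java.net.URI;

class MirrorUrlSketch {

    // Host named in the comment above as the target of the metadata URLs.
    static final String UPSTREAM_HOST = "storage.googleapis.com";

    // Returns the URL with its host swapped for mirrorHost, or unchanged if it
    // does not point at the upstream host. Raw path and query are preserved.
    static String rewriteHost(String url, String mirrorHost) {
        URI original = URI.create(url);
        if (!UPSTREAM_HOST.equalsIgnoreCase(original.getHost())) {
            return url;
        }
        StringBuilder rewritten = new StringBuilder(original.getScheme())
                .append("://")
                .append(mirrorHost);
        if (original.getPort() != -1) {
            rewritten.append(':').append(original.getPort());
        }
        if (original.getRawPath() != null) {
            rewritten.append(original.getRawPath());
        }
        if (original.getRawQuery() != null) {
            rewritten.append('?').append(original.getRawQuery());
        }
        return rewritten.toString();
    }

    public static void main(String[] args) {
        // Hypothetical path and mirror host, for illustration only.
        String url = "https://storage.googleapis.com/some-bucket/GeoLite2-City.mmdb.gz";
        System.out.println(rewriteHost(url, "geoip-mirror.example.internal"));
    }
}
```

A reverse proxy serving the rewritten metadata would also have to serve the database files themselves under the same paths.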