weizijun opened 5 years ago
Pinging @elastic/es-core-features
I'm hitting some other cases where the client status changes to STOPPED.
What other cases exactly?
The one you've provided (OutOfMemoryError) is not an error that you can be sure to recover from. If you get an OutOfMemoryError then the best course of action is to restart the JVM.
The other cases appear only rarely, and I haven't been able to find the cause or characterize the symptoms.
I am also running into this error, on v6.8 with a 10-node cluster (each node: 32 CPUs, 64 GB RAM) in Azure. The error occurs shortly after application startup.
My application can make an initial connection to my Elasticsearch cluster. On startup, the application loads the contents of two indices into an in-memory cache and refreshes that cache every 10 minutes. The sizes of those two indices are 3.9 MB and 36.4 MB. The application has no issues loading these indices into its in-memory cache for the first two cache refreshes. However, the following out-of-memory error is thrown on the third cache refresh:
```
2019-06-04T16:37:32.681-04:00 [APP/PROC/WEB/2] [ERR] at org.apache.http.impl.nio.reactor.AbstractIODispatch.inputReady(AbstractIODispatch.java:114)
2019-06-04T16:37:32.681-04:00 [APP/PROC/WEB/2] [ERR] at org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:39)
2019-06-04T16:37:32.681-04:00 [APP/PROC/WEB/2] [ERR] at org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:81)
2019-06-04T16:37:32.681-04:00 [APP/PROC/WEB/2] [ERR] at org.apache.http.impl.nio.DefaultNHttpClientConnection.consumeInput(DefaultNHttpClientConnection.java:265)
2019-06-04T16:37:32.681-04:00 [APP/PROC/WEB/2] [ERR] at org.apache.http.impl.nio.client.InternalRequestExecutor.inputReady(InternalRequestExecutor.java:83)
2019-06-04T16:37:32.681-04:00 [APP/PROC/WEB/2] [ERR] at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.inputReady(HttpAsyncRequestExecutor.java:336)
2019-06-04T16:37:32.681-04:00 [APP/PROC/WEB/2] [ERR] at org.apache.http.impl.nio.client.DefaultClientExchangeHandlerImpl.consumeContent(DefaultClientExchangeHandlerImpl.java:157)
2019-06-04T16:37:32.681-04:00 [APP/PROC/WEB/2] [ERR] at org.apache.http.impl.nio.client.MainClientExec.consumeContent(MainClientExec.java:329)
2019-06-04T16:37:32.681-04:00 [APP/PROC/WEB/2] [ERR] at org.apache.http.nio.protocol.AbstractAsyncResponseConsumer.consumeContent(AbstractAsyncResponseConsumer.java:147)
2019-06-04T16:37:32.681-04:00 [APP/PROC/WEB/2] [ERR] at org.elasticsearch.client.HeapBufferedAsyncResponseConsumer.onContentReceived(HeapBufferedAsyncResponseConsumer.java:96)
2019-06-04T16:37:32.681-04:00 [APP/PROC/WEB/2] [ERR] at org.apache.http.nio.util.SimpleInputBuffer.consumeContent(SimpleInputBuffer.java:66)
2019-06-04T16:37:32.681-04:00 [APP/PROC/WEB/2] [ERR] at org.apache.http.impl.nio.codecs.LengthDelimitedDecoder.read(LengthDelimitedDecoder.java:84)
2019-06-04T16:37:32.681-04:00 [APP/PROC/WEB/2] [ERR] at org.apache.http.impl.nio.codecs.AbstractContentDecoder.readFromChannel(AbstractContentDecoder.java:154)
2019-06-04T16:37:32.681-04:00 [APP/PROC/WEB/2] [OUT] ... 1 common frames omitted
2019-06-04T16:37:32.681-04:00 [APP/PROC/WEB/2] [OUT] at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:591)
2019-06-04T16:37:32.681-04:00 [APP/PROC/WEB/2] [OUT] at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)
2019-06-04T16:37:32.681-04:00 [APP/PROC/WEB/2] [OUT] at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:305)
2019-06-04T16:37:32.681-04:00 [APP/PROC/WEB/2] [OUT] at org.apache.http.impl.nio.reactor.AbstractIOReactor.hardShutdown(AbstractIOReactor.java:576)
2019-06-04T16:37:32.681-04:00 [APP/PROC/WEB/2] [OUT] at org.apache.http.impl.nio.reactor.AbstractIOReactor.processClosedSessions(AbstractIOReactor.java:440)
2019-06-04T16:37:32.681-04:00 [APP/PROC/WEB/2] [OUT] at org.apache.http.impl.nio.reactor.BaseIOReactor.sessionClosed(BaseIOReactor.java:279)
2019-06-04T16:37:32.681-04:00 [APP/PROC/WEB/2] [OUT] at org.apache.http.impl.nio.reactor.AbstractIODispatch.disconnected(AbstractIODispatch.java:100)
2019-06-04T16:37:32.681-04:00 [APP/PROC/WEB/2] [OUT] at org.apache.http.impl.nio.client.InternalIODispatch.onClosed(InternalIODispatch.java:39)
2019-06-04T16:37:32.681-04:00 [APP/PROC/WEB/2] [OUT] at org.apache.http.impl.nio.client.InternalIODispatch.onClosed(InternalIODispatch.java:71)
2019-06-04T16:37:32.681-04:00 [APP/PROC/WEB/2] [OUT] at org.apache.http.impl.nio.client.InternalRequestExecutor.closed(InternalRequestExecutor.java:64)
2019-06-04T16:37:32.681-04:00 [APP/PROC/WEB/2] [OUT] at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.closed(HttpAsyncRequestExecutor.java:146)
2019-06-04T16:37:32.681-04:00 [APP/PROC/WEB/2] [OUT] Caused by: org.apache.http.ConnectionClosedException: Connection closed unexpectedly
2019-06-04T16:37:32.681-04:00 [APP/PROC/WEB/2] [OUT] at java.base/java.lang.Thread.run(Unknown Source)
2019-06-04T16:37:32.681-04:00 [APP/PROC/WEB/2] [OUT] at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
2019-06-04T16:37:32.681-04:00 [APP/PROC/WEB/2] [OUT] at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
2019-06-04T16:37:32.681-04:00 [APP/PROC/WEB/2] [OUT] at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
2019-06-04T16:37:32.681-04:00 [APP/PROC/WEB/2] [OUT] at java.base/java.util.concurrent.FutureTask.runAndReset(Unknown Source)
2019-06-04T16:37:32.681-04:00 [APP/PROC/WEB/2] [OUT] at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
2019-06-04T16:37:32.681-04:00 [APP/PROC/WEB/2] [OUT] at com.dsg.foundation.components.RefreshingPopulatedCache.refreshCache(RefreshingPopulatedCache.java:81)
2019-06-04T16:37:32.681-04:00 [APP/PROC/WEB/2] [OUT] at com.dsg.microservices.catalog.product.service.CategoryDetailsCache.populateCache(CategoryDetailsCache.java:48)
2019-06-04T16:37:32.681-04:00 [APP/PROC/WEB/2] [OUT] at com.dsg.foundation.elasticsearch.ElasticTemplate.queryScroll(ElasticTemplate.java:252)
2019-06-04T16:37:32.681-04:00 [APP/PROC/WEB/2] [OUT] at com.dsg.foundation.elasticsearch.ScrollResultsIterator.<init>(ScrollResultsIterator.java:23)
2019-06-04T16:37:32.681-04:00 [APP/PROC/WEB/2] [OUT] at com.dsg.foundation.elasticsearch.ElasticTemplate.search(ElasticTemplate.java:267)
2019-06-04T16:37:32.681-04:00 [APP/PROC/WEB/2] [OUT] at org.elasticsearch.client.RestHighLevelClient.search(RestHighLevelClient.java:730)
2019-06-04T16:37:32.681-04:00 [APP/PROC/WEB/2] [OUT] at org.elasticsearch.client.RestHighLevelClient.performRequestAndParseEntity(RestHighLevelClient.java:1231)
2019-06-04T16:37:32.681-04:00 [APP/PROC/WEB/2] [OUT] at org.elasticsearch.client.RestHighLevelClient.performRequest(RestHighLevelClient.java:1256)
2019-06-04T16:37:32.681-04:00 [APP/PROC/WEB/2] [OUT] at org.elasticsearch.client.RestClient.performRequest(RestClient.java:227)
2019-06-04T16:37:32.681-04:00 [APP/PROC/WEB/2] [OUT] at org.elasticsearch.client.RestClient$SyncResponseListener.get(RestClient.java:933)
2019-06-04T16:37:32.681-04:00 [APP/PROC/WEB/2] [OUT] org.apache.http.ConnectionClosedException: Connection closed unexpectedly
...
2019-06-04T16:37:32.681-04:00 [APP/PROC/WEB/2] [ERR] at org.apache.http.impl.nio.conn.LoggingIOSession$LoggingByteChannel.read(LoggingIOSession.java:204)
2019-06-04T16:37:32.681-04:00 [APP/PROC/WEB/2] [ERR] at java.base/sun.nio.ch.SocketChannelImpl.read(Unknown Source)
2019-06-04T16:37:32.681-04:00 [APP/PROC/WEB/2] [ERR] at java.base/sun.nio.ch.IOUtil.read(Unknown Source)
2019-06-04T16:37:32.681-04:00 [APP/PROC/WEB/2] [ERR] at java.base/sun.nio.ch.Util.getTemporaryDirectBuffer(Unknown Source)
2019-06-04T16:37:32.681-04:00 [APP/PROC/WEB/2] [ERR] at java.base/java.nio.ByteBuffer.allocateDirect(Unknown Source)
2019-06-04T16:37:32.681-04:00 [APP/PROC/WEB/2] [ERR] at java.base/java.nio.DirectByteBuffer.<init>(Unknown Source)
2019-06-04T16:37:32.681-04:00 [APP/PROC/WEB/2] [ERR] at java.base/java.nio.Bits.reserveMemory(Unknown Source)
2019-06-04T16:37:32.682-04:00 [APP/PROC/WEB/2] [ERR] at java.base/java.lang.Thread.run(Unknown Source)
2019-06-04T16:37:32.682-04:00 [APP/PROC/WEB/2] [ERR] at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:591)
2019-06-04T16:37:32.682-04:00 [APP/PROC/WEB/2] [ERR] at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)
2019-06-04T16:37:32.682-04:00 [APP/PROC/WEB/2] [ERR] at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:276)
2019-06-04T16:37:32.682-04:00 [APP/PROC/WEB/2] [ERR] at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvents(AbstractIOReactor.java:315)
2019-06-04T16:37:32.682-04:00 [APP/PROC/WEB/2] [ERR] at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvent(AbstractIOReactor.java:337)
2019-06-04T16:37:32.682-04:00 [APP/PROC/WEB/2] [ERR] at org.apache.http.impl.nio.reactor.BaseIOReactor.readable(BaseIOReactor.java:162)
...
2019-06-04T16:37:33.781-04:00 [APP/PROC/WEB/11] [ERR] Exception in thread "I/O dispatcher 16" java.lang.OutOfMemoryError: Direct buffer memory
2019-06-04T16:37:33.782-04:00 [APP/PROC/WEB/11] [ERR] at java.base/java.lang.Thread.run(Unknown Source)
2019-06-04T16:37:33.782-04:00 [APP/PROC/WEB/11] [ERR] at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:591)
2019-06-04T16:37:33.782-04:00 [APP/PROC/WEB/11] [ERR] at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)
2019-06-04T16:37:33.782-04:00 [APP/PROC/WEB/11] [ERR] at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:276)
2019-06-04T16:37:33.782-04:00 [APP/PROC/WEB/11] [ERR] at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvents(AbstractIOReactor.java:315)
2019-06-04T16:37:33.782-04:00 [APP/PROC/WEB/11] [ERR] at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvent(AbstractIOReactor.java:337)
2019-06-04T16:37:33.782-04:00 [APP/PROC/WEB/11] [ERR] at org.apache.http.impl.nio.reactor.BaseIOReactor.readable(BaseIOReactor.java:162)
2019-06-04T16:37:33.782-04:00 [APP/PROC/WEB/11] [ERR] at org.apache.http.impl.nio.reactor.AbstractIODispatch.inputReady(AbstractIODispatch.java:114)
2019-06-04T16:37:33.782-04:00 [APP/PROC/WEB/11] [ERR] at org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:39)
2019-06-04T16:37:33.782-04:00 [APP/PROC/WEB/11] [ERR] at org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:81)
2019-06-04T16:37:33.782-04:00 [APP/PROC/WEB/11] [ERR] at org.apache.http.impl.nio.DefaultNHttpClientConnection.consumeInput(DefaultNHttpClientConnection.java:265)
2019-06-04T16:37:33.782-04:00 [APP/PROC/WEB/11] [ERR] at org.apache.http.impl.nio.client.InternalRequestExecutor.inputReady(InternalRequestExecutor.java:83)
2019-06-04T16:37:33.782-04:00 [APP/PROC/WEB/11] [ERR] at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.inputReady(HttpAsyncRequestExecutor.java:336)
2019-06-04T16:37:33.782-04:00 [APP/PROC/WEB/11] [ERR] at org.apache.http.impl.nio.client.DefaultClientExchangeHandlerImpl.consumeContent(DefaultClientExchangeHandlerImpl.java:157)
```
What can I do to resolve this? Many thanks in advance for any help.
There is actually a valid issue hidden here, I think (not necessarily in the specific exception above, but in general): we do not handle `RuntimeException`s thrown in the `onFailure` listener of a request callback. So, for example, this code:
```java
restClient.performRequestAsync(request, new ResponseListener() {
    @Override
    public void onSuccess(Response response) {
        // ...
    }

    @Override
    public void onFailure(Exception exception) {
        throw new RuntimeException(exception);
    }
});
```
will cause a broken client in exactly the way we're seeing here, because the exception bubbles up into the IO reactor and the reactor is hard-shutdown as a result. IMO this is something we should fix. It's always possible that, due to some unforeseen state or bug, an exception is thrown in the failure handler. While that shouldn't happen and the callback code should be fixed, it's not right to silently shut down the IO reactor and fail all subsequent requests (there isn't even any logging in the example above). Instead, shouldn't we simply catch all exceptions, at worst log a warning for the unhandled ones, and not let them break the client?
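Until the client guards against this itself, one defensive workaround on the application side is to wrap every callback so that a `RuntimeException` thrown inside a handler is reported rather than allowed to escape onto the reactor thread. A minimal generic sketch (the `SafeCallbacks` name and the use of plain `Consumer`s in place of the client's `ResponseListener` are my own illustrative choices, not part of the Elasticsearch API):

```java
import java.util.function.Consumer;

// Wraps a callback so that a RuntimeException thrown inside the delegate is
// handed to an error reporter (e.g. a logger) instead of propagating into the
// thread that invoked the callback. SafeCallbacks is an illustrative name.
final class SafeCallbacks {
    static <T> Consumer<T> swallow(Consumer<T> delegate,
                                   Consumer<RuntimeException> onError) {
        return value -> {
            try {
                delegate.accept(value);
            } catch (RuntimeException e) {
                onError.accept(e); // report, never rethrow on the caller's thread
            }
        };
    }
}
```

Registering listeners wrapped like this means a buggy `onSuccess`/`onFailure` body degrades to a logged warning instead of a stopped IO reactor.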
@original-brownbear I've opened a separate issue (#45115) based on your above comment. We agree that that issue needs addressing, but this specific issue looks like something else is amiss with allocating byte buffers in the client.
@JoeBerg8 Given the sizes of the indices you are trying to load, you may be bumping up against the maximum amount of direct memory that NIO is allowed to reserve. Please take a look at tuning `-XX:MaxDirectMemorySize=<size>` for your client application and see if that helps (the default is 64 MB, I believe).
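When tuning this flag, it can help to watch actual direct-buffer consumption from inside the client JVM: the JDK exposes the "direct" buffer pool through a standard platform MXBean. A small sketch (the `DirectBufferStats` class name is mine):

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;

// Reports how many bytes the JVM's "direct" buffer pool currently holds,
// which is the pool that -XX:MaxDirectMemorySize caps.
class DirectBufferStats {
    // Returns bytes currently used by the direct pool, or -1 if not found.
    static long directBytesUsed() {
        List<BufferPoolMXBean> pools =
            ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class);
        for (BufferPoolMXBean pool : pools) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Force one direct allocation so the pool is visibly non-empty.
        java.nio.ByteBuffer buf = java.nio.ByteBuffer.allocateDirect(1024 * 1024);
        System.out.println("direct buffer bytes used: " + directBytesUsed());
    }
}
```

Logging this value periodically (or exposing it as a metric) shows whether usage climbs steadily toward the cap, which would point at a buffer leak rather than a limit that is simply too low.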
@weizijun After you changed the MaxDirectMemorySize, have you been able to find any logs from when the IOReactor stops that you can share here?
@jbaiera Yeah, I set MaxDirectMemorySize and still hit the direct buffer OOM.
@weizijun What size are you setting it to? If you set it higher does it still OOME?
In one program we set `-XX:MaxDirectMemorySize=6g`, but we still hit `java.lang.OutOfMemoryError` and the client's status is STOPPED.
Hi, I'm hitting the same problem. My Java program runs in Docker, and the error stack is:

```
I/O reactor terminated abnormally
java.lang.ArithmeticException: / by zero
	at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor.addChannel(AbstractMultiworkerIOReactor.java:473)
	at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processEvent(DefaultConnectingIOReactor.java:178)
	at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processEvents(DefaultConnectingIOReactor.java:145)
	at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor.execute(AbstractMultiworkerIOReactor.java:348)
	at org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager.execute(PoolingNHttpClientConnectionManager.java:192)
	at org.apache.http.impl.nio.client.CloseableHttpAsyncClientBase$1.run(CloseableHttpAsyncClientBase.java:64)
	at java.lang.Thread.run(Thread.java:748)
```
```
java.lang.IllegalStateException: Request cannot be executed; I/O reactor status: STOPPED
	at org.apache.http.util.Asserts.check(Asserts.java:46)
	at org.apache.http.impl.nio.client.CloseableHttpAsyncClientBase.ensureRunning(CloseableHttpAsyncClientBase.java:90)
	at org.apache.http.impl.nio.client.InternalHttpAsyncClient.execute(InternalHttpAsyncClient.java:123)
	at org.elasticsearch.client.RestClient.performRequestAsync(RestClient.java:554)
	at org.elasticsearch.client.RestClient.performRequestAsyncNoCatch(RestClient.java:537)
	at org.elasticsearch.client.RestClient.performRequest(RestClient.java:249)
	at org.elasticsearch.client.RestHighLevelClient.internalPerformRequest(RestHighLevelClient.java:568)
	at org.elasticsearch.client.RestHighLevelClient.performRequest(RestHighLevelClient.java:538)
	at org.elasticsearch.client.RestHighLevelClient.performRequestAndParseEntity(RestHighLevelClient.java:500)
	at org.elasticsearch.client.RestHighLevelClient.search(RestHighLevelClient.java:427)
	at org.elasticsearch.client.RestHighLevelClient.search(RestHighLevelClient.java:416)
```
@weizijun how did you solve the problem? Can you please share?
Could you please share how you resolved the issue? I am facing the same problem: the client is initialized when the app starts and closed when the app stops, and I use synchronous requests in all cases.
I solved the issue by adding a check to the global method I use to retrieve the ES connection. It looks like the low-level REST client sometimes disconnects out of the blue, and in that case the connection has to be reinitialized.
```java
private static RestHighLevelClient INSTANCE;

static final RestHighLevelClient es() throws RuntimeException {
    // Rebuild the client if it was never created or its IO reactor has stopped.
    if (INSTANCE == null || !INSTANCE.getLowLevelClient().isRunning()) {
        INSTANCE = new RestHighLevelClient(
            RestClient.builder(
                new HttpHost(
                    config.getElasticsearch().getHostname(),
                    config.getElasticsearch().getPort(),
                    config.getElasticsearch().getScheme()))
                .setHttpClientConfigCallback(httpClientBuilder -> httpClientBuilder));
    }
    return INSTANCE;
}
```
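One caveat with a check-then-rebuild accessor like this: if several threads call it concurrently, the null/`isRunning` check can race and construct more than one client. The pattern can be made thread-safe generically; a sketch under the assumption that a factory plus a liveness predicate stand in for `RestHighLevelClient` construction and `getLowLevelClient().isRunning()` (the `ReconnectingHolder` name is mine, not a library class):

```java
import java.util.function.Predicate;
import java.util.function.Supplier;

// Thread-safe lazy holder that rebuilds its instance whenever the liveness
// check fails, e.g. after the client's IO reactor has stopped.
final class ReconnectingHolder<T> {
    private final Supplier<T> factory;
    private final Predicate<T> isAlive;
    private T instance;

    ReconnectingHolder(Supplier<T> factory, Predicate<T> isAlive) {
        this.factory = factory;
        this.isAlive = isAlive;
    }

    // Synchronized so only one thread at a time can decide to rebuild.
    synchronized T get() {
        if (instance == null || !isAlive.test(instance)) {
            instance = factory.get();
        }
        return instance;
    }
}
```

With the real client, the factory would build the `RestHighLevelClient` and the predicate would call `getLowLevelClient().isRunning()`; one should also close the dead client before replacing it to release its resources.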
@mariuszpala Thanks a lot. Does that also rebuild the client when its status is INACTIVE rather than STOPPED?
Describe the feature:

Elasticsearch version (`bin/elasticsearch --version`): 6.6.1
Plugins installed: []
JVM version (`java -version`): 1.8
OS version (`uname -a` if on a Unix-like system): 3.10.0-514.16.1.es01.x86_64 #1 SMP Wed Oct 17 23:14:35 CST 2018 x86_64 x86_64 x86_64 GNU/Linux

Description of the problem including expected versus actual behavior: the low-level REST client occasionally causes an exception.
The client status changes to STOPPED in two cases: 1. `CloseableHttpAsyncClientBase.close()` 2. the `CloseableHttpAsyncClientBase` constructor.
I'm sure that I don't call `close()`. One case is this: after removing `-XX:MaxDirectMemorySize`, I still see other situations where the client status changes to STOPPED, as I reported in #42061 and another user reported in #39946. Could RestClient add a method to check whether the client is running, or to reset the client internally?