Closed kenyee closed 1 year ago
FYI, we found a workaround...set the connectionPool in the builder so it uses a new connection pool w/ a size of zero and also turn off HTTP/2 support by setting a new protocolList in the builder with only HTTP/1.1 support.
You’re using 3.6.0?
yep...3.6.0 unfortunately. Thought about rolling back to pre-http/2 support but that would mean 2.2 which is too far back because of all the okhttp3 dependencies :-(
Oh that's terrible. We've had problems with similar failures before but I thought we'd fixed ’em all. If you can make a test case that'd be handy, otherwise I'll try to look soon.
In the interim you can disable HTTP/2 with the protocols list in the OkHttpClient.Builder.
Correct me if I'm wrong, but I think part of this is working as expected. HTTP/2 connections can carry N outstanding requests. If one of those requests times out and the HTTP/2 connection is closed, then the other N - 1 requests are also lost. I think the intent is that for HTTP/2 connections, a timeout does not necessarily mean the connection is bad.
Is it surprising to 'bring the network down' and not receive any sort of socket exception reading or writing?
N-1 requests being lost is fine if the connection is down. The issue is that it doesn't recover when you bring the network back up...i.e., the broken idle connection objects are in the pool stay there and when you try connecting again, you can't connect until the user kills off your app to restart everything...
@swankjesse : I couldn't figure out how to write a test for this because making all the sockets disconnected was happening at at an OS level. Tried to write and Android Test Butler one (to flip the network switch on/off on an Android emulator) but the current version of that has issues and probably wouldn't work in this code base :-)
So our attempts to write to the socket are failing silently? Might need to steal the automatic pings that we added for web sockets.
Essentially...not that they're failing silently, but they're dead sockets and they're stuck in the pool. We traced through a bit of the code and saw some code that was pulling the a dead socket out of the pool each time it tried to use one which should have cleared things up after 5 dead sockets were pulled out but the network layer still appeared stuck unless we purged the pool w/ evictAll() or waited for the 5 min eviction timeout. Wasn't obvious what a proper fix was... HTTP/2 essentially behaves like web sockets so you're probably on the right track...
Pretty sure this issue is another manifestation of this one:
I'm sure it's not. We don't see any SSL Handshake exceptions.
This bug is actually probably two bugs because we had to disable the connection pool and the HTTP/2 support. #3118 might be affected by the connection pool bug (it doesn't clear the broken idle connection objects in the pool).
I've seen what you've described but then also the ssl exceptions. Same steps to reproduce as you outlined.
any updates? I've same issue
The workaround I described works in our QA testing so far :-)
@kenyee setting new pool works, but I wonder when an update will arrive?
Is this issue resolved? I ran into the same issue using 3.5.0. I am using OkHttp to send push to Apple http/2. Yesterday I had this issue resulting in almost 80k push messages not getting delivered.
Caused by: java.net.SocketTimeoutException: timeout
at okhttp3.internal.http2.Http2Stream$StreamTimeout.newTimeoutException(Http2Stream.java:587) ~[okhttp-3.5.0.jar:?]
at okhttp3.internal.http2.Http2Stream$StreamTimeout.exitAndThrowIfTimedOut(Http2Stream.java:595) ~[okhttp-3.5.0.jar:?]
at okhttp3.internal.http2.Http2Stream.getResponseHeaders(Http2Stream.java:140) ~[okhttp-3.5.0.jar:?]
at okhttp3.internal.http2.Http2Codec.readResponseHeaders(Http2Codec.java:115) ~[okhttp-3.5.0.jar:?]
at okhttp3.internal.http.CallServerInterceptor.intercept(CallServerInterceptor.java:54) ~[okhttp-3.5.0.jar:?]
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92) ~[okhttp-3.5.0.jar:?]
at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:45) ~[okhttp-3.5.0.jar:?]
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92) ~[okhttp-3.5.0.jar:?]
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67) ~[okhttp-3.5.0.jar:?]
at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:93) ~[okhttp-3.5.0.jar:?]
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92) ~[okhttp-3.5.0.jar:?]
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67) ~[okhttp-3.5.0.jar:?]
at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93) ~[okhttp-3.5.0.jar:?]
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92) ~[okhttp-3.5.0.jar:?]
at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:120) ~[okhttp-3.5.0.jar:?]
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92) ~[okhttp-3.5.0.jar:?]
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67) ~[okhttp-3.5.0.jar:?]
at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:179) ~[okhttp-3.5.0.jar:?]
at okhttp3.RealCall.execute(RealCall.java:63) ~[okhttp-3.5.0.jar:?]
After I got this error, none of my other requests succeeded.
Code:
KeyStore ks = KeyStore.getInstance("PKCS12");
ks.load(new ByteArrayInputStream("/foo/bar/mycert"), password.toCharArray());
KeyManagerFactory kmf = KeyManagerFactory.getInstance(KeyManagerFactory.getDefaultAlgorithm());
kmf.init(ks, password.toCharArray());
KeyManager[] keyManagers = kmf.getKeyManagers();
SSLContext sslContext = SSLContext.getInstance("TLS");
final TrustManagerFactory tmf = TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
tmf.init((KeyStore) null);
sslContext.init(keyManagers, tmf.getTrustManagers(), null);
TrustManager[] trustManagers = tmf.getTrustManagers();
if (trustManagers != null && (trustManagers.length != 1 || !(trustManagers[0] instanceof X509TrustManager))) {
throw new IllegalStateException("Unexpected default trust managers:"
+ Arrays.toString(trustManagers));
}
final X509TrustManager trustManager = (X509TrustManager) trustManagers[0];
final SSLSocketFactory sslSocketFactory = sslContext.getSocketFactory();
OkHttpClient.Builder builder = new OkHttpClient.Builder();
builder.connectTimeout(5, TimeUnit.SECONDS).writeTimeout(10, TimeUnit.SECONDS).readTimeout(10, TimeUnit.SECONDS);
builder.connectionPool(new ConnectionPool(3, 10, TimeUnit.MINUTES));
builder.sslSocketFactory(sslSocketFactory, trustManager);
Proxy proxy = new Proxy(Proxy.Type.HTTP, new InetSocketAddress(proxyHost, proxyPort));
builder.proxy(proxy);
OkHttpClient client = builder.build();
As socket timeout exception is an instance of IO exception, I am not sure if the following approach will work. Can one of you pls get back to me?
I am calling evictAll() in the catch block of IOException.
try {
response = client.newCall(request).execute();
statusCode = response.code();
responseBody = response.body().string();
} catch (IOException ioe) {
client.connectionPool().evictAll();
} finally {
if (response != null) {
response.body().close();
}
}
Also how do we check if a connection is stale or not?
With Apache HttpClient, there is a way to do it to set a flag for checking stale connections. Wondering how OkHttp3 checks for it internally before it uses the connection.
CloseableHttpClient client = HttpClients.custom().setDefaultRequestConfig(
RequestConfig.custom().setStaleConnectionCheckEnabled(true).build()
).setConnectionManager(connManager).build();
Any updates? I have the same issue too. :(
Same issue here!
We still experiencing the same issue :-(
I think i'm seeing another manifestation of this on 3.5.0, when the server forcibly closes the connection.
We try to establish both a h2 and http1.1 connection. The server responds with 200 to both:
06-26 15:07:55.286 22094 22380 I okhttp3.OkHttpClient: --> GET<url> http/1.1
06-26 15:07:55.524 22094 22380 I okhttp3.OkHttpClient: --> GET<url> h2
06-26 15:07:55.596 22094 22380 I okhttp3.OkHttpClient: <-- 200 <url> (71ms)
06-26 15:07:55.597 22094 22380 I okhttp3.OkHttpClient: <-- 200 <url> (303ms)
Then at some point we try to read from the http2 connection, which fails in checkNotClosed and throws a StreamResetException
06-26 15:06:01.560 22094 22126 I MyProject: Caused by: okhttp3.internal.http2.StreamResetException: stream was reset: PROTOCOL_ERROR
06-26 15:06:01.560 22094 22126 I MyProject: at okhttp3.internal.http2.Http2Stream$FramedDataSource.checkNotClosed(Http2Stream.java:428)
06-26 15:06:01.560 22094 22126 I MyProject: at okhttp3.internal.http2.Http2Stream$FramedDataSource.read(Http2Stream.java:330)
06-26 15:06:01.560 22094 22126 I MyProject: at okio.ForwardingSource.read(ForwardingSource.java:35)
06-26 15:06:01.560 22094 22126 I MyProject: at okio.RealBufferedSource$1.read(RealBufferedSource.java:409)
06-26 15:06:01.560 22094 22126 I MyProject: at com.google.android.exoplayer.upstream.HttpDataSource.read(HttpDataSourceImpl.java:699)
06-26 15:06:01.560 22094 22126 I MyProject: at com.google.android.exoplayer.upstream.HttpDataSource.read(HttpDataSourceImpl.java:424)
Then, since this is media, we do something that causes a seek to 0 in the media, which needs to reopen the request from the beginning. At this point, we see the same exception as is posted above:
06-26 15:08:39.387 22094 22126 I MyProject: Caused by: java.net.SocketTimeoutException: timeout
06-26 15:08:39.387 22094 22126 I MyProject: at okhttp3.internal.http2.Http2Stream$StreamTimeout.newTimeoutException(Http2Stream.java:587)
06-26 15:08:39.387 22094 22126 I MyProject: at okhttp3.internal.http2.Http2Stream$StreamTimeout.exitAndThrowIfTimedOut(Http2Stream.java:595)
06-26 15:08:39.387 22094 22126 I MyProject: at okhttp3.internal.http2.Http2Stream.getResponseHeaders(Http2Stream.java:140)
06-26 15:08:39.387 22094 22126 I MyProject: at okhttp3.internal.http2.Http2Codec.readResponseHeaders(Http2Codec.java:115)
06-26 15:08:39.387 22094 22126 I MyProject: at okhttp3.internal.http.CallServerInterceptor.intercept(CallServerInterceptor.java:54)
06-26 15:08:39.387 22094 22126 I MyProject: at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
06-26 15:08:39.387 22094 22126 I MyProject: at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
06-26 15:08:39.387 22094 22126 I MyProject: at okhttp3.logging.HttpLoggingInterceptor.intercept(HttpLoggingInterceptor.java:212)
06-26 15:08:39.387 22094 22126 I MyProject: at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
06-26 15:08:39.387 22094 22126 I MyProject: at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:45)
06-26 15:08:39.387 22094 22126 I MyProject: at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
06-26 15:08:39.387 22094 22126 I MyProject: at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
06-26 15:08:39.387 22094 22126 I MyProject: at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:93)
06-26 15:08:39.387 22094 22126 I MyProject: at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
06-26 15:08:39.387 22094 22126 I MyProject: at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
06-26 15:08:39.387 22094 22126 I MyProject: at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93)
06-26 15:08:39.387 22094 22126 I MyProject: at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
06-26 15:08:39.387 22094 22126 I MyProject: at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:120)
06-26 15:08:39.387 22094 22126 I MyProject: at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
06-26 15:08:39.387 22094 22126 I MyProject: at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
06-26 15:08:39.387 22094 22126 I MyProject: at okhttp3.logging.HttpLoggingInterceptor.intercept(HttpLoggingInterceptor.java:212)
06-26 15:08:39.387 22094 22126 I MyProject: at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
06-26 15:08:39.387 22094 22126 I MyProject: at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
06-26 15:08:39.387 22094 22126 I MyProject: at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:179)
06-26 15:08:39.387 22094 22126 I MyProject: at okhttp3.RealCall.execute(RealCall.java:63)
this seems to be very similar to the other cases here, which seem to all be related to an ungraceful shutdown of the connection, and it remaining pooled.
I've also confirmed that disabling the ConnectionPool "works around" this issue:
OkHttpClient.Builder clientBuilder = new OkHttpClient.Builder()
.connectTimeout(connectTimeoutMillis, TimeUnit.MILLISECONDS)
.retryOnConnectionFailure(true)
.readTimeout(readTimeoutMillis, TimeUnit.MILLISECONDS).connectionPool(new ConnectionPool(0, 1, TimeUnit.NANOSECONDS));
is thr any update on this issue?
Same issue here
any idea if/when this will be fixed? We're seeing the same issue.
@jpearl
I can confirm that disabling ConnectionPool stops getting StreamResetException
when it uses HTTP/2.
I'm also using ExoPlayer with OkHttp and in my case, it was happening when my app goes to background. Even if I turn off the Battery Optimizations it was being closed after few minutes in background getting SocketTimeoutException
when the player tried to play the next track.
I was thinking to use DefaultHttpDataSource for the requests in the ExoPlayer because it also works without throwing SocketTimeoutException
, but by disabling the ConnectionPool for me would better at this moment.
I'll keep it disabled for now until I find a better solution.
OkHttp: 3.8.1
Model: Huawei P9 Lite
OS: 7.1.2
Thanks for sharing this!
@alessandrojp do you know if the ExoPlayer team is aware of this issue? We've run into it only with exoplayer as well.
I'm also seeing SocketTimeout / dead client issues with OkHttp and Exoplayer.
evict all connection from connection pool,resolve sockettimeout exception if(throwable instanceof SocketTimeoutException){ okHttpClient.connectionPool().evictAll(); }
I am also facing similar issue. When my app goes in background and if internet goes off and comes back and after if i come back to the app, nothing loads and even request doesn't go and i get timeout. After that all requests behave same. Please help me in solving this issue.
For now i have put a hack by reading the above thread is that whenever i get IOException i evict all connections from connection pool. This solves the problem but this happens at least once and user sees retry/reload screen.
@swankjesse Any news on this?
We have implemented the described a workaround, but it still causes us to hit the exceptions, which is not ideal. Would love to hear if this will be addressed soon
Any updates for a fix?
It would be great to get a fix for this. Any release date?
Will implement with this: https://github.com/square/okhttp/issues/3261
Try THIS:
ConnectionSpec spec = new ConnectionSpec.Builder(ConnectionSpec.COMPATIBLE_TLS) .tlsVersions(TlsVersion.TLS_1_1) .build();
@swankjesse could you clarify if this is fixed? The issue you link to references a couple of merged PRs, but it's not obvious if they directly address this issue or not.
It is not fixed as far as I can tell.
When I checked the fix, it seemed to leave the dead connection in the pool until you try using it which causes an exception to propagate into your code. So after the network comes back up, the user would get a bunch of "could not connect" errors in our app (because it didn't handle retries properly for this case) and then it would work again. We decided to just leave the pool size at zero for now until we can rewrite parts of the code to handle this case. But at least the zombies don't stay stuck in the pool....
okhttp: 3.11 SocketTimeoutException is not fixed still its appears
it clears out on the second call. It looks like what happens is the pool gets zombie connections. Next time you grab one of the zombies out of the pool, it throws that exception but is removed. The original bug was that the zombies got stuck in the pool. That said, this isn't great behavior either, so we've just left the pool size at zero...
I can confirm this is still an issue.
@c0dehunter try setting a ping interval on your OkHttpClient?
@swankjesse Thanks for quick response. We tried setting pingInterval(1, TimeUnit.SECOND)
and it seems it is behaving properly now. I don't want to say it's fixed yet as we need to do more testing, but will report back after a few days.
I also can confirm that that issue is reproducing on okhttp v3.11.0 Looks like setting protocols(listOf(Protocol.HTTP_1_1)) should fix it for now
I'm just reporting back and my findings contradict others, so I am really not sure anymore. After updating to the latest version, we have not been able to reproduce this issue
I also reproduce this issue, please found a fix, any suggestion:
tracking this issue...
Tracking this issue as well. Here is my client in Android, still failing even after attempting the suggestions that worked for others:
@Provides
@Singleton
static OkHttpClient provideOkHttpClient() {
OkHttpClient.Builder client = new OkHttpClient.Builder()
.protocols(Arrays.asList(Protocol.HTTP_1_1))
.connectionPool(new ConnectionPool(0, 1,
TimeUnit.NANOSECONDS));
if (!EnvironmentConstants.IS_PRODUCTION) {
CurlLoggingInterceptor curlLoggingInterceptor = new CurlLoggingInterceptor();
curlLoggingInterceptor.setCurlOptions("-i");
client.addNetworkInterceptor(curlLoggingInterceptor)
.addInterceptor(chain -> {
Request request = chain.request().newBuilder()
.addHeader("Accept", "application/json").build();
return chain.proceed(request);
});
}
return client.build();
}
It is simple to reproduce: Successfully complete POST request. Turn on airplane mode. Attempt POST request and fail. Turn off airplane mode. Attempt POST request. That fails with:
08-06 21:42:02.321 25955-25955/tech.duchess.luminawallet.staging.debug E/SendPresenter: Transaction failed java.net.SocketTimeoutException: timeout . . . 08-06 21:42:02.323 25955-25955/tech.duchess.luminawallet.staging.debug E/SendPresenter: Caused by: java.net.SocketException: socket is closed at com.android.org.conscrypt.ConscryptFileDescriptorSocket$SSLInputStream.read(ConscryptFileDescriptorSocket.java:546)
All subsequent transactions continue to fail with the same issue. Restarting my application works fine, presumably due to the new OkHttp client.
Forgot to mention, I'm also on 3.11.0
Which version can be fixed?
You dont need to disable connectionPool, just insert inside your BroadcastReceiver when the network changes the following code
BroadcastReceiver networkStateReceiver = new BroadcastReceiver() {
@Override
public void onReceive(Context context, Intent intent) {
final ConnectivityManager connectivityManager = (ConnectivityManager) context.getSystemService(Context.CONNECTIVITY_SERVICE);
final NetworkInfo activeNetInfo = connectivityManager.getActiveNetworkInfo();
if (activeNetInfo != null) {
//clear here the offline pool when got online
getInstance().okClient.connectionPool().evictAll();
}
}
};
IntentFilter filter = new IntentFilter(ConnectivityManager.CONNECTIVITY_ACTION);
ApplicationLoader.appContext.registerReceiver(networkStateReceiver, filter);
Tried writing a unit test w/ TestButler on Android w/ no luck, so I'll write up the steps to reproduce this and include some sample code. This happens if you connect to an HTTP/2 server and your network goes down while the okhttp client is connected to it: 1) create an okhttp client 2) tell it to read from the HTTP/2 server 3) bring the network down 4) tell it to read from the HTTP/2 server (it'll get a SocketTimeoutException) 5) bring the network back up 6) tell it to read from the HTTP/2 server again (it'll be stuck w/ SocketTimeoutExceptions) 7) if you create new http clients at this point, it'll work, but the dead http client will eventually come back in the pool and fail.
okhttp client should attempt to reopen the HTTP/2 connection instead of being stuck in this state
Code sample for Android (create a trivial view w/ a button and a textview):