Open swankjesse opened 4 years ago
Apologies for reporting this here
Nah, don't worry. Unfortunately AOSP issues sometimes bounce around in triage for a while before they find their way to us.
Android 9's BoringSSL's X509_verify()
is rejecting one of your intermediate certificates but for some reason failing to set the error queue before returning. I skimmed the code path and I don't see any obvious ways it can return without doing that but I wasn't exhaustive. @davidben has been fixing issues around the error queue during verification recently so he might remember if there were issues here in the Android P timeframe.
The higher level code is trying to built a trust chain through the intermediate certs to a trust anchor, using either the certificates provided by the peer or in this case a cache of previously seen intermediates. Any cache hits get their verify()
method called and that's where the exception is getting thrown.
There are arguments for and against the intermediate cache (we recently discussed removing it), but the idea is to make things go better when not all servers for a site are not sending a full certificate chain.
So the sequence looks something like:
Intermediate
) which (a) fails verification (b) without generating a useful error message. It's possible that both (a) and (b) are fixed in Android 10, I haven't done all the archaeology. Certainly there have been changes in that area of TrustManagerImpl
.Leaf
) with issuer Intermediate
but the trust manager isn't able to build a chain of trust using the certificates provided by the server, which may be because Intermediate
wasn't included or because some other link in the chain isn't trusted.Intermediate
, finds it and explodes.If you're able to reproduce this yourself, the list of certificates passed into RootTrustManager.checkServerTrusted()
would be helpful, or failing that just point me at the server and I ought to be able to reproduce it from the certificate chain.
Note that X509Certificate.verify()
is just doing structural checks on the certificate itself, so somewhere in here there's an intermediate certificate that either BoringSSL thinks is invalid or that triggers a bug on Android 9 BoringSSL, or that Conscrypt is somehow corrupting. If it's one of the latter two then a timely fix for Android 9 is going to be difficult.
I'm not sure what the Android P timeframe would have been, but it might be:
https://github.com/google/conscrypt/issues/928 https://boringssl-review.googlesource.com/c/boringssl/+/32044/
Thanks, that looks promising! It didn't make it into Android 9: https://cs.android.com/android/platform/superproject/+/android-9.0.0_r34:external/boringssl/src/crypto/x509/x_all.c;bpv=1
I did noticed #537 which it refers to earlier and mistakenly thought that did make it into 9, but it doesn't look like it did: https://cs.android.com/android/platform/superproject/+/android-9.0.0_r34:external/conscrypt/common/src/jni/main/cpp/conscrypt/native_crypto.cc;l=4092;bpv=0 (or maybe it made it into a QPR but somehow isn't in the AOSP branch, otherwise Jesse would be seeing an assertion error).
@swankjesse - it it possible you are serving an intermediate that really is malformed?
@prbprbprb @davidben thanks for looking into this!
Well darn, I sliced off too much of the exception and it does have an assertion error. I hate that I stripped this on you! My bad. Anyway the full stacktrace will all our indirection and uninvolved libraries is here:
io.reactivex.exceptions.OnErrorNotImplementedException: The exception was not handled due to missing onError handler in the subscribe() method call. Further reading: https://github.com/ReactiveX/RxJava/wiki/Error-Handling | java.lang.AssertionError: throwExceptionFromBoringSSLError called with no error
at com.squareup.util.rx2.SubscribingKt$errorHandlingSubscribe$6.accept(SubscribingKt:92)
at com.squareup.cash.attribution.InstallAttributer$initializeWork$$inlined$errorHandlingSubscribe$1.accept(subscribing.kt:92)
at com.squareup.cash.attribution.InstallAttributer$initializeWork$$inlined$errorHandlingSubscribe$1.accept(subscribing.kt:1)
at io.reactivex.internal.observers.ConsumerSingleObserver.onError(ConsumerSingleObserver.java:46)
at io.reactivex.internal.operators.observable.ObservableSingleSingle$SingleElementObserver.onError(ObservableSingleSingle.java:93)
at io.reactivex.internal.operators.observable.ObservableFlatMapSingle$FlatMapSingleObserver.drainLoop(ObservableFlatMapSingle.java:239)
at io.reactivex.internal.operators.observable.ObservableFlatMapSingle$FlatMapSingleObserver.drain(ObservableFlatMapSingle.java:210)
at io.reactivex.internal.operators.observable.ObservableFlatMapSingle$FlatMapSingleObserver.innerError(ObservableFlatMapSingle:202)
at io.reactivex.internal.operators.observable.ObservableFlatMapSingle$FlatMapSingleObserver$InnerObserver.onError(ObservableFlatMapSingle.java:289)
at io.reactivex.internal.observers.ResumeSingleObserver.onError(ResumeSingleObserver.java:51)
at io.reactivex.internal.disposables.EmptyDisposable.error(EmptyDisposable:78)
at io.reactivex.internal.operators.single.SingleError.subscribeActual(SingleError.java:42)
at io.reactivex.Single.subscribe(Single.java:3666)
at io.reactivex.internal.operators.single.SingleResumeNext$ResumeMainSingleObserver.onError(SingleResumeNext.java:80)
at io.reactivex.internal.operators.single.SingleMap$MapSingleObserver.onError(SingleMap.java:69)
at io.reactivex.internal.operators.observable.ObservableSingleSingle$SingleElementObserver.onError(ObservableSingleSingle.java:93)
at io.reactivex.internal.operators.observable.ObservableSubscribeOn$SubscribeOnObserver.onError(ObservableSubscribeOn.java:63)
at retrofit2.adapter.rxjava2.CallExecuteObservable.subscribeActual(CallExecuteObservable.java:60)
at io.reactivex.Observable.subscribe(Observable.java:12284)
at io.reactivex.internal.operators.observable.ObservableSubscribeOn$SubscribeTask.run(ObservableSubscribeOn.java:96)
at io.reactivex.Scheduler$DisposeTask.run(Scheduler.java:578)
at io.reactivex.internal.schedulers.ScheduledRunnable.run(ScheduledRunnable.java:66)
at io.reactivex.internal.schedulers.ScheduledRunnable.call(ScheduledRunnable.java:57)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:301)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1167)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641)
at java.lang.Thread.run(Thread.java:764)
Caused by: java.lang.AssertionError: throwExceptionFromBoringSSLError called with no error
at com.android.org.conscrypt.NativeCrypto.X509_verify(NativeCrypto.java:-2)
at com.android.org.conscrypt.OpenSSLX509Certificate.verifyOpenSSL(OpenSSLX509Certificate.java:379)
at com.android.org.conscrypt.OpenSSLX509Certificate.verify(OpenSSLX509Certificate.java:409)
at com.android.org.conscrypt.TrustedCertificateIndex.findAllByIssuerAndSignature(TrustedCertificateIndex.java:197)
at com.android.org.conscrypt.TrustManagerImpl.checkTrustedRecursive(TrustManagerImpl.java:618)
at com.android.org.conscrypt.TrustManagerImpl.checkTrusted(TrustManagerImpl.java:495)
at com.android.org.conscrypt.TrustManagerImpl.checkTrusted(TrustManagerImpl.java:418)
at com.android.org.conscrypt.TrustManagerImpl.getTrustedChainForServer(TrustManagerImpl.java:339)
at android.security.net.config.NetworkSecurityTrustManager.checkServerTrusted(NetworkSecurityTrustManager.java:94)
at android.security.net.config.RootTrustManager.checkServerTrusted(RootTrustManager.java:88)
at com.android.org.conscrypt.Platform.checkServerTrusted(Platform.java:208)
at com.android.org.conscrypt.ConscryptFileDescriptorSocket.verifyCertificateChain(ConscryptFileDescriptorSocket.java:404)
at com.android.org.conscrypt.NativeCrypto.SSL_do_handshake(NativeCrypto.java:-2)
at com.android.org.conscrypt.NativeSsl.doHandshake(NativeSsl.java:375)
at com.android.org.conscrypt.ConscryptFileDescriptorSocket.startHandshake(ConscryptFileDescriptorSocket.java:224)
at okhttp3.internal.connection.RealConnection.connectTls(RealConnection.kt:379)
at okhttp3.internal.connection.RealConnection.establishProtocol(RealConnection.kt:337)
at okhttp3.internal.connection.RealConnection.connect(RealConnection.kt:209)
at okhttp3.internal.connection.ExchangeFinder.findConnection(ExchangeFinder.kt:226)
at okhttp3.internal.connection.ExchangeFinder.findHealthyConnection(ExchangeFinder.kt:106)
at okhttp3.internal.connection.ExchangeFinder.find(ExchangeFinder:74)
at okhttp3.internal.connection.RealCall.initExchange$okhttp(RealCall:255)
at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.kt:32)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.kt:95)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.kt:83)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.kt:76)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
at com.squareup.cash.integration.api.CashApiInterceptor.intercept(CashApiInterceptor.kt:137)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
at com.squareup.cash.api.GzipRequestBodyInterceptor.intercept(GzipRequestBodyInterceptor.kt:26)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
at com.squareup.cash.api.TimeoutOverrideInterceptor.intercept(TimeoutOverrideInterceptor.kt:22)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
at okhttp3.internal.connection.RealCall.getResponseWithInterceptorChain$okhttp(RealCall.kt:201)
at okhttp3.internal.connection.RealCall.execute(RealCall.kt:154)
at retrofit2.OkHttpCall.execute(OkHttpCall.java:204)
at com.nightlynexus.retrofit.logging.LoggingCallAdapterFactory$LoggingCall.execute(LoggingCallAdapterFactory.java:156)
at com.nightlynexus.retrofit.logging.LoggingCallAdapterFactory$LoggingCall.execute(LoggingCallAdapterFactory.java:156)
at retrofit2.adapter.rxjava2.CallExecuteObservable.subscribeActual(CallExecuteObservable.java:46)
at io.reactivex.Observable.subscribe(Observable.java:12284)
at io.reactivex.internal.operators.observable.ObservableSubscribeOn$SubscribeTask.run(ObservableSubscribeOn.java:96)
at io.reactivex.Scheduler$DisposeTask.run(Scheduler.java:578)
at io.reactivex.internal.schedulers.ScheduledRunnable.run(ScheduledRunnable.java:66)
at io.reactivex.internal.schedulers.ScheduledRunnable.call(ScheduledRunnable.java:57)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:301)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1167)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641)
at java.lang.Thread.run(Thread.java:764)
OK, so neither of those changes made it into Android 9.
I'd still like to see the certificate chain so we can figure out if there's a bug around signature algorithm matching or intermediate caching in Android 9.
Apologies for reporting this here instead of the AOSP tracker. In my experience that tracker is a black hole!
We saw this crash across many different app releases and device manufacturers. The only thing in common was it was always Android 9.
Looks like the offending line of code is here in Conscrypt:
The exception is:
The crashes coincide with us updating our TLS certificates. The new certificates share the same root and intermediary as the previous certificate. The leaf certificate has a different expiration date.
Is it possible that Android 9’s Conscrypt crashes as-above when certificates are rotated? Perhaps it was remembering the certificate from a previous HTTPS connection? Is Conscrypt configured differently in any meaningful way in Android 9 vs. 8 or 10?