smithy-lang / smithy-kotlin

Smithy code generator for Kotlin (in development)
Apache License 2.0

`NetworkOnMainThreadException` in AWS SDK #1177

Open drayan85 opened 2 weeks ago

drayan85 commented 2 weeks ago

Describe the bug

  suspend fun getSuggestions(query: String): SearchPlaceIndexForSuggestionsResponse? {
    return withContext(Dispatchers.IO) {
      runCatching {
        val request = SearchPlaceIndexForSuggestionsRequest {
          text = query
          indexName = "placeIndex"
          maxResults = 5
        }
        getLocationClient().searchPlaceIndexForSuggestions(request)
      }.onFailure { e ->
        Timber.e(e, e.message)
      }.getOrDefault(null)
    }
  }

  suspend fun getLocationClient(): LocationClient {
    return runCatching {
      val locationCredentialsProvider: LocationCredentialsProvider = AuthHelper(context).authenticateWithCognitoIdentityPool("COGNITO_POOL_ID")
      locationCredentialsProvider.getLocationClient() ?: error("AWS:CLIENT Location Client Retrieval failed")
    }.getOrElse { e ->
      Timber.e(e, e.message)
      error("AWS:COGNITO authentication failed")
    }
  }

We are making a location suggestion call through the AWS SDK on the IO dispatcher. However, we see a crash inside the AWS SDK reporting that a network operation is being performed on the main thread, which crashes our app.

We are noticing the crash occasionally.

Regression Issue

Expected behavior

The app should not crash.

Current behavior

It is crashing the app.

    Caused by android.os.NetworkOnMainThreadException:
       at android.os.StrictMode$AndroidBlockGuardPolicy.onNetwork(StrictMode.java:1565)
       at com.android.org.conscrypt.Platform.blockGuardOnNetwork(Platform.java:424)
       at com.android.org.conscrypt.ConscryptFileDescriptorSocket$SSLOutputStream.write(ConscryptFileDescriptorSocket.java:608)
       at okio.OutputStreamSink.write(JvmOkio.kt:56)
       at okio.AsyncTimeout$sink$1.write(AsyncTimeout.kt:127)
       at okio.RealBufferedSink.flush(RealBufferedSink.kt:268)
       at okhttp3.internal.http2.Http2Writer.rstStream(Http2Writer.kt:144)
       at okhttp3.internal.http2.Http2Connection.writeSynReset$okhttp(Http2Connection.kt:357)
       at okhttp3.internal.http2.Http2Stream.close(Http2Stream.kt:258)
       at okhttp3.internal.http2.Http2Stream.cancelStreamIfNecessary$okhttp(Http2Stream.kt:557)
       at okhttp3.internal.http2.Http2Stream$FramingSource.close(Http2Stream.kt:539)
       at okio.ForwardingSource.close(ForwardingSource.kt:32)
       at okhttp3.internal.connection.Exchange$ResponseBodySource.close(Exchange.kt:324)
       at okio.RealBufferedSource.close(RealBufferedSource.kt:486)
       at aws.smithy.kotlin.runtime.http.engine.okhttp.InstrumentedSource.close(MetricsInterceptor.kt:96)
       at okio.RealBufferedSource.close(RealBufferedSource.kt:486)
       at okhttp3.internal._UtilCommonKt.closeQuietly(-UtilCommon.kt:302)
       at okhttp3.internal._ResponseBodyCommonKt.commonClose(-ResponseBodyCommon.kt:50)
       at okhttp3.ResponseBody.close(ResponseBody.kt:181)
       at aws.smithy.kotlin.runtime.http.engine.okhttp.OkHttpEngine.roundTrip$lambda$2$lambda$1(OkHttpEngine.kt:68)
       at kotlinx.coroutines.InvokeOnCompletion.invoke(JobSupport.kt:1534)
       at kotlinx.coroutines.JobSupport.notifyCompletion(JobSupport.kt:1625)
       at kotlinx.coroutines.JobSupport.completeStateFinalization(JobSupport.kt:316)
       at kotlinx.coroutines.JobSupport.finalizeFinishingState(JobSupport.kt:233)
       at kotlinx.coroutines.JobSupport.tryMakeCompletingSlowPath(JobSupport.kt:946)
       at kotlinx.coroutines.JobSupport.tryMakeCompleting(JobSupport.kt:894)
       at kotlinx.coroutines.JobSupport.cancelMakeCompleting(JobSupport.kt:727)
       at kotlinx.coroutines.JobSupport.cancelImpl$kotlinx_coroutines_core(JobSupport.kt:698)
       at kotlinx.coroutines.JobSupport.cancelInternal(JobSupport.kt:663)
       at kotlinx.coroutines.JobSupport.cancel(JobSupport.kt:648)
       at aws.smithy.kotlin.runtime.http.engine.CoroutineUtilsKt.attachToOuterJob$lambda$1(CoroutineUtils.kt:51)
       at kotlinx.coroutines.InvokeOnCancelling.invoke(JobSupport.kt:1571)
       at kotlinx.coroutines.JobSupport.notifyCancelling(JobSupport.kt:1604)
       at kotlinx.coroutines.JobSupport.tryMakeCancelling(JobSupport.kt:826)
       at kotlinx.coroutines.JobSupport.makeCancelling(JobSupport.kt:786)
       at kotlinx.coroutines.JobSupport.cancelImpl$kotlinx_coroutines_core(JobSupport.kt:702)
       at kotlinx.coroutines.JobSupport.parentCancelled(JobSupport.kt:668)
       at kotlinx.coroutines.ChildHandleNode.invoke(JobSupport.kt:1580)
       at kotlinx.coroutines.JobSupport.notifyCancelling(JobSupport.kt:1604)
       at kotlinx.coroutines.JobSupport.tryMakeCancelling(JobSupport.kt:826)
       at kotlinx.coroutines.JobSupport.makeCancelling(JobSupport.kt:786)
       at kotlinx.coroutines.JobSupport.cancelImpl$kotlinx_coroutines_core(JobSupport.kt:702)
       at kotlinx.coroutines.JobSupport.parentCancelled(JobSupport.kt:668)
       at kotlinx.coroutines.ChildHandleNode.invoke(JobSupport.kt:1580)
       at kotlinx.coroutines.JobSupport.notifyCancelling(JobSupport.kt:1604)
       at kotlinx.coroutines.JobSupport.tryMakeCancelling(JobSupport.kt:826)
       at kotlinx.coroutines.JobSupport.makeCancelling(JobSupport.kt:786)
       at kotlinx.coroutines.JobSupport.cancelImpl$kotlinx_coroutines_core(JobSupport.kt:702)
       at kotlinx.coroutines.JobSupport.parentCancelled(JobSupport.kt:668)
       at kotlinx.coroutines.ChildHandleNode.invoke(JobSupport.kt:1580)
       at kotlinx.coroutines.JobSupport.notifyCancelling(JobSupport.kt:1604)
       at kotlinx.coroutines.JobSupport.tryMakeCancelling(JobSupport.kt:826)
       at kotlinx.coroutines.JobSupport.makeCancelling(JobSupport.kt:786)
       at kotlinx.coroutines.JobSupport.cancelImpl$kotlinx_coroutines_core(JobSupport.kt:702)
       at kotlinx.coroutines.JobSupport.parentCancelled(JobSupport.kt:668)
       at kotlinx.coroutines.ChildHandleNode.invoke(JobSupport.kt:1580)
       at kotlinx.coroutines.JobSupport.notifyCancelling(JobSupport.kt:1604)
       at kotlinx.coroutines.JobSupport.tryMakeCancelling(JobSupport.kt:826)
       at kotlinx.coroutines.JobSupport.makeCancelling(JobSupport.kt:786)
       at kotlinx.coroutines.JobSupport.cancelImpl$kotlinx_coroutines_core(JobSupport.kt:702)
       at kotlinx.coroutines.JobSupport.parentCancelled(JobSupport.kt:668)
       at kotlinx.coroutines.ChildHandleNode.invoke(JobSupport.kt:1580)
       at kotlinx.coroutines.JobSupport.notifyCancelling(JobSupport.kt:1604)
       at kotlinx.coroutines.JobSupport.tryMakeCancelling(JobSupport.kt:826)
       at kotlinx.coroutines.JobSupport.makeCancelling(JobSupport.kt:786)
       at kotlinx.coroutines.JobSupport.cancelImpl$kotlinx_coroutines_core(JobSupport.kt:702)
       at kotlinx.coroutines.JobSupport.cancelInternal(JobSupport.kt:663)
       at kotlinx.coroutines.JobSupport.cancel(JobSupport.kt:648)
       at kotlinx.coroutines.flow.internal.ChannelFlowTransformLatest$flowCollect$3$1.emit(Merge.kt:25)
       at kotlinx.coroutines.flow.DistinctFlowImpl$collect$2.emit(Distinct.kt:73)
       at kotlinx.coroutines.flow.FlowKt__TransformKt$filterNotNull$$inlined$unsafeTransform$1$2.emit(Emitters.kt:50)
       at kotlinx.coroutines.flow.FlowKt__DelayKt$debounceInternal$1$3$1.invokeSuspend(Delay.kt:226)
       at kotlinx.coroutines.flow.FlowKt__DelayKt$debounceInternal$1$3$1.invoke(Delay.kt:9)
       at kotlinx.coroutines.flow.FlowKt__DelayKt$debounceInternal$1$3$1.invoke(Delay.kt:3)
       at kotlinx.coroutines.selects.SelectImplementation$ClauseData.invokeBlock(Select.kt:843)
       at kotlinx.coroutines.selects.SelectImplementation.complete(Select.kt:715)
       at kotlinx.coroutines.selects.SelectImplementation.doSelectSuspend(Select.kt:456)
       at kotlinx.coroutines.selects.SelectImplementation.access$doSelectSuspend(Select.kt:251)
       at kotlinx.coroutines.selects.SelectImplementation$doSelectSuspend$1.invokeSuspend(Select.kt:12)
       at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
       at kotlinx.coroutines.DispatchedTaskKt.resume(DispatchedTask.kt:221)
       at kotlinx.coroutines.DispatchedTaskKt.resumeUnconfined(DispatchedTask.kt:177)
       at kotlinx.coroutines.DispatchedTaskKt.dispatch(DispatchedTask.kt:149)
       at kotlinx.coroutines.CancellableContinuationImpl.dispatchResume(CancellableContinuationImpl.kt:470)
       at kotlinx.coroutines.CancellableContinuationImpl.completeResume(CancellableContinuationImpl.kt:591)
       at kotlinx.coroutines.selects.SelectKt.tryResume(Select.kt:870)
       at kotlinx.coroutines.selects.SelectKt.access$tryResume(Select.kt:1)
       at kotlinx.coroutines.selects.SelectImplementation.trySelectInternal(Select.kt:647)
       at kotlinx.coroutines.selects.SelectImplementation.trySelect(Select.kt:624)
       at kotlinx.coroutines.selects.OnTimeout$register$$inlined$Runnable$1.run(Runnable.kt:14)
       at android.os.Handler.handleCallback(Handler.java:883)
       at android.os.Handler.dispatchMessage(Handler.java:100)
       at android.os.Looper.loop(Looper.java:264)
       at android.app.ActivityThread.main(ActivityThread.java:7581)
       at java.lang.reflect.Method.invoke(Method.java)
       at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:492)
       at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:980)

Steps to Reproduce

We could not reproduce this on our end, but it is happening for a lot of our users, especially those with unreliable network connections.

Possible Solution

Nothing in our codebase indicates that additional network calls are made outside of our invocation that fetches suggestions on the IO dispatcher. However, it appears that the AWS SDK is initiating a separate network call on the main thread.
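
Reading the trace from the bottom up suggests another possible path than a separate network call: a debounce timer fires on the main looper, the flow cancels the previous collection, and that cancellation reaches a completion handler registered in OkHttpEngine.roundTrip, which closes the response body and makes OkHttp write an HTTP/2 RST_STREAM frame on the thread that called cancel. The sketch below is not SDK code. It only illustrates, with a plain kotlinx.coroutines Job and a hypothetical helper named handlerThreadOnCancel, that an invokeOnCompletion handler runs synchronously on whichever thread cancels the job. The thread name "fake-main" is an assumed stand-in for the Android main thread.

```kotlin
import kotlinx.coroutines.Job
import kotlin.concurrent.thread

// Hypothetical helper (not part of the SDK): cancels a plain Job from a
// thread with the given name and reports which thread ran the handler.
fun handlerThreadOnCancel(cancellingThreadName: String): String {
    // Stand-in for a per-call job; a plain Job() completes as soon as it is cancelled.
    val callJob = Job()
    var handlerThread = "<handler never ran>"

    // Stand-in for the handler that closes the response body.
    callJob.invokeOnCompletion { handlerThread = Thread.currentThread().name }

    // Cancel from another thread and wait for it to finish; join() also makes
    // the write to handlerThread visible to the caller.
    thread(name = cancellingThreadName) { callJob.cancel() }.join()
    return handlerThread
}

fun main() {
    println(handlerThreadOnCancel("fake-main")) // prints "fake-main"
}
```

If this reading is correct, withContext(Dispatchers.IO) governs where the request runs but not where a cancellation-triggered close runs. That would fit the crash appearing only occasionally and mostly on unreliable networks, where a response is more likely to still be in flight when the next query arrives.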

Context

Since we recently migrated from Google Places to AWS Location Service, many of our production users are affected by this crash.

Smithy-Kotlin version

aws.sdk.kotlin:location:1.2.62

Platform (JVM/JS/Native)

Native

Operating system and version

Android 13

0marperez commented 1 week ago

Hi, thanks for opening this issue. This is going to be tricky to reproduce. Are you able to get any more information from the customers who are seeing this issue? We'll also take a look at our code to search for any indication of network calls being made on the main thread.