stripe / stripe-terminal-android

Stripe Terminal Android SDK
https://stripe.dev/stripe-terminal-android/

OOM crash in ~30 seconds after Terminal.initTerminal invocation #436

Closed SergeyVinyar closed 7 months ago

SergeyVinyar commented 8 months ago

Summary

For our not-yet-published app, we upgraded the SDK from 2.16.0-b1 to 3.2.1 and started seeing crashes ~30 seconds after invoking Terminal.initTerminal(...), with the following stack trace:

Fatal Exception: java.lang.OutOfMemoryError: Failed to allocate a 1445090328 byte allocation with 11213247 free bytes and 181MB until OOM, target footprint 22426495, growth limit 201326592
       at com.squareup.tape2.QueueFile$ElementIterator.next(QueueFile.java:549)
       at com.squareup.tape2.QueueFile$ElementIterator.next(QueueFile.java:514)
       at com.stripe.jvmcore.batchdispatcher.collectors.QueueFileCollector$peek$2.invokeSuspend(QueueFileCollector.kt:1855)
       at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
       at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:108)
       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1167)
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641)
       at java.lang.Thread.run(Thread.java:923)

Often we can make a successful test payment with a card reader but sometimes the app falls into a loop:

  1. A user starts the app
  2. The app invokes Terminal.initTerminal(...) in Application.onCreate
  3. After 30 seconds - crash (no other Stripe SDK methods are invoked)

And it repeats until the app is removed and installed again. All exceptions in the loop have the same amount of bytes for allocation (1445090328 in this example).

Code to reproduce

Unfortunately, we couldn't find a scenario to reproduce it.

Android version

11, 12

Impacted devices (Android devices or readers)

Galaxy Tab A7 Lite, Galaxy Tab S6 Lite

SDK version

3.2.1

Also, when we got into this crash loop, we tried to downgrade the version to 3.0.0 without removing the app from the device, and it still crashed.

Other information

I'm sure this is the same problem as in https://github.com/stripe/stripe-terminal-android/issues/434 but in our case we didn't use NFC. Galaxy Tab A7 Lite doesn't have an NFC reader at all.

I suspect the reason is that QueueFile (https://github.com/square/tape/blob/master/tape/src/main/java/com/squareup/tape2/QueueFile.java) is not thread safe, while the stack trace shows it being accessed from coroutines, which may resume on a different thread after each suspension point. As a result, the temporary file that QueueFile creates sometimes gets corrupted.
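
To illustrate the suspected failure mode, here is a minimal sketch. It assumes a simplified record format (a 4-byte big-endian length followed by the payload), which is not tape2's actual on-disk layout, and the names readRecord and safeReadRecord are hypothetical. If a length header is corrupted, a naive reader tries to allocate whatever size the header claims. That would be consistent with the same allocation size appearing on every launch for as long as the same file stays on the device.

```kotlin
import java.nio.ByteBuffer

// Hypothetical simplified format: [4-byte big-endian length][payload bytes].
// This is NOT tape2's real file layout; it only illustrates the failure mode.

/** Naive reader: trusts the length header and allocates that many bytes. */
fun readRecord(buf: ByteBuffer): ByteArray {
    val length = buf.int              // a corrupted header can yield a huge value
    val payload = ByteArray(length)   // may throw OutOfMemoryError
    buf.get(payload)
    return payload
}

/** Defensive reader: rejects lengths that cannot fit in the remaining data. */
fun safeReadRecord(buf: ByteBuffer): ByteArray? {
    if (buf.remaining() < 4) return null
    val length = buf.int
    if (length < 0 || length > buf.remaining()) return null  // treat as corrupt
    val payload = ByteArray(length)
    buf.get(payload)
    return payload
}

fun main() {
    // A valid record: length 3, payload "abc".
    val good = ByteBuffer.allocate(7).putInt(3).put("abc".toByteArray())
    good.flip()
    println(safeReadRecord(good)?.decodeToString())

    // A corrupted record: the header claims 1445090328 bytes (the size from
    // the stack trace above), but only 3 bytes of payload follow.
    val bad = ByteBuffer.allocate(7).putInt(1445090328).put("abc".toByteArray())
    bad.flip()
    println(safeReadRecord(bad))
}
```

The defensive variant returns null instead of allocating, so a caller could discard or recreate the file rather than crash. Whether the SDK's fix works this way is not stated in this thread.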

We invoke:

jasells commented 8 months ago

I'm sure this is the same problem as in https://github.com/stripe/stripe-terminal-android/issues/434 but in our case we didn't use NFC. Galaxy Tab A7 Lite doesn't have an NFC reader at all.

We have seen it at times with BT, but thought it was a dependency issue, as after updating some of the other packages, it seemed to go away.

This thread from the tape2 repo seems to indicate it is a multi-thread contention issue, I think? Why it comes and goes between builds, I'm not sure.

@SergeyVinyar Have you succeeded in creating a simpler repro app?

jasells commented 8 months ago

  1. A user starts the app
  2. The app invokes Terminal.initTerminal(...) in Application.onCreate
  3. After 30 seconds - crash (no other Stripe SDK methods are invoked)

That also fits with #434, as we are calling Terminal.initTerminal(...) and then immediately connecting the NFC reader, so it looks like it is just the timing of ~30 sec after initTerminal()?

SergeyVinyar commented 8 months ago

@SergeyVinyar Have you succeeded in creating a simpler repro app?

No, it's sporadic, and the app also needs a token from the backend to make a payment (and it looks like we don't hit this issue without a payment), so I cannot just make a small app that shows the bug. I think that placing all SDK method calls into coroutines on the same single-thread dispatcher may help, but I haven't tried it yet.

SergeyVinyar commented 8 months ago

Something like:

val Dispatchers.Stripe: CoroutineDispatcher by lazy {
    Executors.newSingleThreadExecutor().asCoroutineDispatcher()
}

and then:

GlobalScope.launch(Dispatchers.Stripe) {
   Terminal.initTerminal(...)
}
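
The same thread-confinement idea can be sketched without the coroutines library, using only java.util.concurrent. NotThreadSafeQueue and ConfinedQueue are hypothetical stand-ins, not SDK classes. This only serializes the caller's own calls and cannot control threads the SDK uses internally, so it is a guess at a workaround rather than a confirmed fix.

```kotlin
import java.util.concurrent.Callable
import java.util.concurrent.Executors

// Hypothetical stand-in for a resource that is not thread safe.
class NotThreadSafeQueue {
    private val items = ArrayList<String>()
    fun add(item: String) { items.add(item) }
    fun size(): Int = items.size
}

// Wraps the resource so that every access runs on one dedicated thread.
class ConfinedQueue {
    private val executor = Executors.newSingleThreadExecutor()
    private val queue = NotThreadSafeQueue()

    // Fire-and-forget write, executed in submission order on the single thread.
    fun add(item: String) { executor.execute { queue.add(item) } }

    // Blocking read, also executed on the single thread.
    fun size(): Int = executor.submit(Callable { queue.size() }).get()

    fun shutdown() { executor.shutdown() }
}

fun main() {
    val confined = ConfinedQueue()
    // Four writer threads call add() concurrently; the executor serializes them.
    val writers = List(4) { t ->
        Thread { repeat(1000) { i -> confined.add("t$t-$i") } }
    }
    writers.forEach { it.start() }
    writers.forEach { it.join() }
    println(confined.size())  // prints 4000
    confined.shutdown()
}
```

Because every add and size call runs on the executor's single thread, the underlying list is never touched by two threads at once.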

jasells commented 8 months ago

That sounds like a good idea... I'll try that too, as soon as I get some time. I'll post any updates to my issue thread, and I'll watch for anything from you in case you get to it first.

ugochukwu-stripe commented 7 months ago

Hi @SergeyVinyar, we rolled out a fix to mitigate this problem in 3.3.1. Could you retry with the latest release and let us know if you still see this error?

SergeyVinyar commented 7 months ago

Hi @ugochukwu-stripe, thank you a lot. I'll check

SergeyVinyar commented 7 months ago

We've done smoke testing. Everything works. Thank you!