livefront / bridge

An Android library for avoiding TransactionTooLargeException during state saving and restoration
Apache License 2.0
257 stars 25 forks source link

ANR on saveInstanceState #76

Open a-blekot opened 1 year ago

a-blekot commented 1 year ago

Hi! We have ANRs in our Crashlytics. All devices running Android 12 and 13. 66% of them are Samsung devices ))

Could you please help to find the reason of these ANRs )

main (timed waiting):tid=1 systid=25492 
       at jdk.internal.misc.Unsafe.park(Unsafe.java)
       at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:234)
       at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1079)
       at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1369)
       at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:278)
       at com.livefront.bridge.BridgeDelegate.queueDiskWritingIfNecessary(BridgeDelegate.java:229)
       at com.livefront.bridge.BridgeDelegate.saveToMemoryAndDiskIfNecessary(BridgeDelegate.java:380)
       at com.livefront.bridge.BridgeDelegate.saveInstanceState(BridgeDelegate.java:348)
       at com.livefront.bridge.Bridge.saveInstanceState(Bridge.java:145)

systid=25492- it's different in each case. Other parts of stack-trace are the same.

we use following code to init Bridge

private fun initBridge() {
        Bridge.initialize(this, object : SavedStateHandler {
            override fun saveInstanceState(target: Any, state: Bundle) {
                (target as? StateSaver)?.saveState(state)
            }

            override fun restoreInstanceState(target: Any, state: Bundle?) {
                (target as? StateSaver)?.restoreState(state)
            }
        })
    }

Where StateSaver is interface which is implemented by our Fragments

// BaseFragment
override fun onSaveInstanceState(outState: Bundle) {
    Bridge.saveInstanceState(this, outState)
}
byencho commented 1 year ago

@a-blekot In order to ensure that all the desired saved state gets saved before the OS can kill an app, Bridge blocks (with a timeout) until all the data has been written to disk. The timeout only applies to the writing of each individual Bundle, though. So if there is a large amount of data spread out across many different Bundles, I can see how this could lead to ANRs. In a sense what Bridge is doing is trading a very high risk of a TransactionTooLargeException with a much lower risk of ANRs for large amounts of data. So that's a long way of saying that I'm not surprised that you would see these errors.

I've considered some options in the past for what to do about this. The best I've come up with is to support a setting that disables Bridge from blocking when writing its data. It would do its work on a background thread but the cost of this is that if the OS wants to kill the app while it's doing it, that will just happen and there will be incomplete saved state data when the app is restored. If that is of interest to you that's a feature I can look into supporting. Otherwise my advice would be to just look into other ways to reduce the amount of state you are trying to save to a Bundle. For example, I'd recommend manually persisting any data to disk that you can using Room rather than trying to place it in a Bundle and reserve the Bundle state for smaller bits of data that wouldn't make sense in a database.

a-blekot commented 1 year ago

@byencho thank you for quick feedback 🙂

I'm new dev in the current project, so I need some time to investigate which part of state must be preserved during configuration changes (or other reason when onSaveInstanceState gets called).

Maybe you could give some advice how to identify big state objects) What comes to my mind, is to measure Bundle objects when they are actually going to be saved. And save this info to analyze it later.

Mostly we have some simple primitives in our screen's State. I think only String objects could take a lot of space...

P.s. I agree that ANR is better than TransactionTooLargeException )) maybe we could handle it with ForegroundService? Or service with notification would not help in this case?

byencho commented 1 year ago

@a-blekot You can use something like TooLargeTool to try and monitor the data going in the Bundles. In your code, though, I'd look to see if you are saving any List<T> or Bitmap or something like that. Those can become large.

As for the Foreground service idea, that might be tricky. You wouldn't want every app using Bridge to have to display that kind of notification. Another idea is that we could try to offload the processing to something like WorkManager but then the jobs might not process before you'd need to read the Bundles again, which would not be great. There's not a ton of great options unfortunately. I could try to investigate more at some point though.

a-blekot commented 1 year ago

Thanks! I will try this tool tomorrow! Hope we don't save any bitmaps 🥲

Interesting detail is that all ANRs are on Android 12+. Maybe changes to file system restrictions affects saving data by Bridge?

byencho commented 1 year ago

@a-blekot That is an interesting detail. I'd have to look into that.

a-blekot commented 1 year ago

@byencho little update on this https://github.com/firebase/firebase-android-sdk/issues/3705#issuecomment-1120864786

ANR collection starts only on Crashlytics SDK version 18.2.4 ANR collection only works for devices with Android version 11+.

But in our case we see this ANRs for Android 12+, so maybe it has some connection)

byencho commented 1 year ago

OK good to know, thanks @a-blekot!