rive-app / rive-android

A runtime for interactive animations on Android
https://rive.app
MIT License

Crash Creating EGL Surface #324

Closed brianwernick closed 2 months ago

brianwernick commented 2 months ago

Description

In our production app, reports in the Google Play Console and Sentry (a third-party crash reporting tool) show that the app is crashing frequently due to segmentation faults that trace back to Rive. The common source function call in these reports is rive_android::EGLThreadState::createEGLSurface.

Trace 1
0   libutils.so                     0x7c7c4231d0        std::__1::__hash_table<T>::remove
1   libutils.so                     0x7c7c417edc        android::RefBase::incStrong
2   libGLES_mali.so                 0x79be6527d8        <unknown> + 522885343192
3   libGLES_mali.so                 0x79be666e8c        <unknown> + 522885426828
4   libGLES_mali.so                 0x79be6627bc        <unknown> + 522885408700
5   libGLES_mali.so                 0x79be666c2c        eglCreateWindowSurface
6   libEGL.so                       0x7c9313e700        eglGetFrameTimestampSupportedANDROID
7   libEGL.so                       0x7c9313e5dc        eglGetFrameTimestampSupportedANDROID
8   base.apk                        0x795339b8c4        rive_android::EGLThreadState::createEGLSurface
9   base.apk                        0x79533a3590        rive_android::SkiaWorkerImpl::SkiaWorkerImpl
10  base.apk                        0x79533a29a8        rive_android::WorkerImpl::Make
11  base.apk                        0x79533a2624        <unknown> + 521087362596
12  base.apk                        0x795339cec8        rive_android::WorkerThread::threadMain

Trace 2
0   libc.so                         0x7b5d4fb7d4        abort
1   libc.so                         0x7b5d4fd634        <unknown> + 529846490676
2   libc.so                         0x7b5d564988        <unknown> + 529846913416
3   libc.so                         0x7b5d564808        pthread_mutex_lock
4   libc++.so                       0x7b4c55feb0        std::__1::mutex::lock
5   libc++.so                       0x7b4c560da4        std::__1::__shared_mutex_base::lock_shared
6   libgui.so                       0x7b423215bc        android::Surface::hook_query
7   eglSubDriverAndroid.so          0x781d89e0b0        GetSubDriverVersion
8   eglSubDriverAndroid.so          0x781d8975c8        <unknown> + 515891623368
9   libGLESv2_adreno.so             0x7807a613bc        InitEsxProfile
10  libEGL.so                       0x7b3cfc3714        eglGetFrameTimestampSupportedANDROID
11  libEGL.so                       0x7b3cfc35e4        eglGetFrameTimestampSupportedANDROID
12  libEGL.so                       0x7b3cfc15d4        eglCreateWindowSurface
13  base.apk                        0x779d0ae8c4        rive_android::EGLThreadState::createEGLSurface
14  base.apk                        0x779d0b6590        rive_android::SkiaWorkerImpl::SkiaWorkerImpl
15  base.apk                        0x779d0b59a8        rive_android::WorkerImpl::Make
16  base.apk                        0x779d0b5624        <unknown> + 513735874084
17  base.apk                        0x779d0afec8        rive_android::WorkerThread::threadMain

Provide a Repro

Not necessarily repro steps, however we have the RiveAnimationView wrapped for Compose and it seems to be occurring when we transition from one "screen" (full screen Composable) to another where both have RiveAnimationViews that use the same rive animation asset.

AndroidView(
    factory = { context ->
        RiveAnimationView(context)
    }
) { animationView ->
    animationView.setRiveResource(
        resId = animationResId
    )
}

Expected behavior

The RiveAnimationView doesn't cause a seg fault / crash

Device & Versions (please complete the following information)

Additional context

umberto-sonnino commented 2 months ago

Hi @brianwernick thanks for the report!

> Not necessarily repro steps, however we have the RiveAnimationView wrapped for Compose and it seems to be occurring when we transition from one "screen" (full screen Composable) to another where both have RiveAnimationViews that use the same rive animation asset.

I'd like to take a look at this setup, can you share a sample project with the transition you describe?

brianwernick commented 2 months ago

@umberto-sonnino I've set up a sample app with a Composable named Issue324 that shows the setup for the screen and transitions. If you want to run this on a device, you can replace the SimpleExample() call in the MainActivity with Issue324().

brianwernick commented 2 months ago

@umberto-sonnino I've added a second Composable named Issue324V2 that includes updates based on our updated understanding of where the crash is coming from after one of our engineers encountered the issue at a different state in the screen transition. With this version we've been able to consistently reproduce the crash (takes up to a minute).

In our application, when we transition between these two screens, the first screen performs a size animation on the RiveAnimationView. That isn't ideal, since changing the size of a Composable will trigger recomposition, but the AndroidView wrapper keeps the RiveAnimationView around, so I didn't worry about it too much and was planning on cleaning it up later. However, reviewing the tombstones above and others leads me to believe that the underlying surface texture used by the TextureView is being replaced by a new one sized to the view's new size. If that happens at the screen's refresh rate (e.g. 60/90/120 times a second), we are likely hitting some boundary with memory or timing.

This leads me to 2 conclusions:

  1. This is something that we can solve ourselves by switching to a graphics layer scale of the RiveAnimationView so that we aren't resizing the view as frequently (the animation runs for 250 ms).
  2. It's worth validating the hypothesis that the TextureView is creating a new surface on each size update. It's been a while since I dug into the TextureView surface buffer, so it's also possible that this issue is coming from the Skia/GLES handling used by Rive.
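
One way the hypothesis in point 2 could be checked is to log the TextureView surface callbacks while the size animation runs. The following is a minimal sketch, not Rive's own tooling. It assumes that RiveAnimationView is a TextureView subclass (as the discussion above suggests) and that wrapping its existing listener does not interfere with Rive's handling. The extension name `logSurfaceEvents` and the log tag are hypothetical.

```kotlin
import android.graphics.SurfaceTexture
import android.util.Log
import android.view.TextureView

// Hypothetical helper: wraps whatever listener the view already has, so the
// original callbacks still run, and logs each surface lifecycle event.
fun TextureView.logSurfaceEvents(tag: String = "SurfaceEvents") {
    val delegate = surfaceTextureListener
    surfaceTextureListener = object : TextureView.SurfaceTextureListener {
        override fun onSurfaceTextureAvailable(surface: SurfaceTexture, width: Int, height: Int) {
            // identityHashCode distinguishes a brand-new SurfaceTexture from a reused one.
            Log.d(tag, "available id=${System.identityHashCode(surface)} ${width}x$height")
            delegate?.onSurfaceTextureAvailable(surface, width, height)
        }

        override fun onSurfaceTextureSizeChanged(surface: SurfaceTexture, width: Int, height: Int) {
            Log.d(tag, "sizeChanged id=${System.identityHashCode(surface)} ${width}x$height")
            delegate?.onSurfaceTextureSizeChanged(surface, width, height)
        }

        override fun onSurfaceTextureDestroyed(surface: SurfaceTexture): Boolean {
            Log.d(tag, "destroyed id=${System.identityHashCode(surface)}")
            // With no delegate, return true so the SurfaceTexture is released.
            return delegate?.onSurfaceTextureDestroyed(surface) ?: true
        }

        override fun onSurfaceTextureUpdated(surface: SurfaceTexture) {
            delegate?.onSurfaceTextureUpdated(surface)
        }
    }
}
```

If the log shows many `sizeChanged` entries per second during the transition, or repeated `destroyed`/`available` pairs with different ids, that would be consistent with the resize-driven theory. Note that a view may install its own listener when it is attached to a window, so the wrap might need to happen after attachment. That is an assumption to verify against the Rive source.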

umberto-sonnino commented 2 months ago

Hey @brianwernick, thanks again to you and your team. This clearly repros with the repo you shared! I'm looking into a fix. We might be able to optimize a couple of things that should allow you to still use the resizing animation.

brianwernick commented 2 months ago

I'm going to mark this as resolved (close) since 9.3.1 seems to resolve the crash that can be reproduced. Additionally we have updated our app to use a graphics layer scale for short-lived size changes which should also avoid the mass amount of resizes. @umberto-sonnino Thanks for jumping on this and getting a resolution out quickly
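
For readers looking for the workaround mentioned here, a graphics layer scale might look roughly like the sketch below. It is an illustration under assumptions, not the code from our app. The wrapper name `ScaledRiveAnimation`, the `expanded` flag, the 1.5x target scale, and the 200.dp base size are all hypothetical. Only the 250 ms duration and the `setRiveResource(resId = ...)` call come from this thread.

```kotlin
import androidx.annotation.RawRes
import androidx.compose.animation.core.animateFloatAsState
import androidx.compose.animation.core.tween
import androidx.compose.foundation.layout.size
import androidx.compose.runtime.Composable
import androidx.compose.runtime.getValue
import androidx.compose.ui.Modifier
import androidx.compose.ui.graphics.graphicsLayer
import androidx.compose.ui.unit.dp
import androidx.compose.ui.viewinterop.AndroidView
import app.rive.runtime.kotlin.RiveAnimationView

// Hypothetical wrapper: name, parameters, and sizes are illustrative only.
@Composable
fun ScaledRiveAnimation(
    @RawRes animationResId: Int,
    expanded: Boolean,
    modifier: Modifier = Modifier,
) {
    // Animate a scale factor instead of the layout size.
    val scale by animateFloatAsState(
        targetValue = if (expanded) 1.5f else 1f,
        animationSpec = tween(durationMillis = 250),
    )

    AndroidView(
        factory = { context -> RiveAnimationView(context) },
        modifier = modifier
            .size(200.dp) // the measured size stays fixed during the animation
            .graphicsLayer {
                // Reading `scale` inside this block defers the state read to the
                // layer update, so the view is not re-measured on every frame.
                scaleX = scale
                scaleY = scale
            },
    ) { animationView ->
        animationView.setRiveResource(resId = animationResId)
    }
}
```

Because only the layer transform changes, the underlying view keeps its size for the whole animation. The trade-off is that scaled-up content is rendered at the original resolution, which is usually acceptable for a short-lived transition.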

umberto-sonnino commented 2 months ago

> I'm going to mark this as resolved (close) since 9.3.1 seems to resolve the crash that can be reproduced. Additionally we have updated our app to use a graphics layer scale for short-lived size changes which should also avoid the mass amount of resizes. @umberto-sonnino Thanks for jumping on this and getting a resolution out quickly

Thank you @brianwernick and your team for providing us with a repro - that made everything easier!

evelant commented 1 month ago

We've continued to see this crash in rive-react-native 7.0.2 so I'm not sure this is completely fixed

Waltari10 commented 1 month ago

Our Android app is throwing this error quite frequently with react-native-rive version 7.0.2.

[split_config.arm64_v8a.apk!librive-android.so] rive_android::EGLThreadState::createEGLSurface(ANativeWindow*)

umberto-sonnino commented 1 month ago

Could you share a repro of this crash with one of your setups @evelant or @Waltari10? This should've been fixed in the latest Android version

evelant commented 1 month ago

Unfortunately we can't reproduce the crash locally. It seems to happen at random to users in production.

evelant commented 1 month ago

@umberto-sonnino This continues to happen with rive-react-native 7.0.4. Random crashes in production. Seems to be affecting a sizeable percentage of our users ☹️

OS Version: Android 11 (RPXS31.Q2-58-17-4-8)
Report Version: 104

Exception Type: Unknown (SIGABRT)

Application Specific Information:
Abort

Thread 0 Crashed:
0   libc.so                         0x7ee169930c        abort
1   libc.so                         0x7ee16fc348        <unknown> + 544948077384
2   libc.so                         0x7ee16fb944        <unknown> + 544948074820
3   libc.so                         0x7ee16fb79c        pthread_mutex_lock
4   libc++.so                       0x7edf2df3a8        std::__1::mutex::lock
5   libc++.so                       0x7edf2e017c        std::__1::__shared_mutex_base::lock_shared
6   libgui.so                       0x7ee29b940c        android::Surface::hook_query
7   libEGL.so                       0x7eddf5f7ac        eglGetFrameTimestampSupportedANDROID
8   libEGL.so                       0x7eddf5f71c        eglGetFrameTimestampSupportedANDROID
9   split_config.arm64_v8a.apk      0x7bdf3040ac        rive_android::EGLThreadState::createEGLSurface
10  split_config.arm64_v8a.apk      0x7bdf30be60        rive_android::SkiaWorkerImpl::SkiaWorkerImpl
11  split_config.arm64_v8a.apk      0x7bdf30b1f4        rive_android::WorkerImpl::Make
12  split_config.arm64_v8a.apk      0x7bdf30ae50        <unknown> + 532025486928
13  split_config.arm64_v8a.apk      0x7bdf3056b0        rive_android::WorkerThread::threadMain
14  split_config.arm64_v8a.apk      0x7bdf30548c        <unknown> + 532025463948
15  libc.so                         0x7ee16fac6c        <unknown> + 544948071532
16  libc.so                         0x7ee169b2c8        <unknown> + 544947679944

EOF

umberto-sonnino commented 1 month ago

> Unfortunately we can't reproduce the crash locally. It seems to happen at random to users in production.

Is there any particular way you're using Rive inside the app? Any chance it's being used inside a transition or something similar? If you could share an example of that, it would help us narrow things down.

evelant commented 1 month ago

Our app doesn't use any transitions between native views. The only thing we do is show/hide (render or not render) regular react native views. The Rive animations we have live at the root component of the app so they never get destroyed/moved/whatever after the app boots. The .riv files can be found here:

https://ymugbdbublahxiheenub.supabase.co/storage/v1/object/public/test/confettithv4.riv
https://ymugbdbublahxiheenub.supabase.co/storage/v1/object/public/test/taskherofabv5.riv

evelant commented 1 month ago

One possible thing of note: we use react-native-skia in our app. I'd assume that's a completely separate Skia instance from the one inside Rive, but I thought it worth mentioning in case some sort of conflict is possible.