element-hq / element-x-android

Android Matrix messenger application using the Matrix Rust Sdk and Jetpack Compose
GNU Affero General Public License v3.0
1.09k stars 156 forks source link

fix : protect some usages of client to avoid crashes #3886

Closed bmarty closed 14 hours ago

bmarty commented 4 days ago

Fixes https://github.com/element-hq/element-x-android-rageshakes/issues/3375 and other rageshakes.

github-actions[bot] commented 4 days ago

:iphone: Scan the QR code below to install the build (arm64 only) for this PR. QR code If you can't scan the QR code you can install the build via this link: https://i.diawi.com/TFQFgM

codecov[bot] commented 4 days ago

Codecov Report

Attention: Patch coverage is 56.52174% with 10 lines in your changes missing coverage. Please review.

Project coverage is 82.97%. Comparing base (e76f7fb) to head (0261739). Report is 5 commits behind head on develop.

Files with missing lines Patch % Lines
...ndroid/features/roomlist/impl/RoomListPresenter.kt 25.00% 5 Missing and 1 partial :warning:
...ement/android/appnav/loggedin/LoggedInPresenter.kt 62.50% 1 Missing and 2 partials :warning:
...id/libraries/matrix/api/sync/SlidingSyncVersion.kt 66.66% 1 Missing :warning:
Additional details and impacted files ```diff @@ Coverage Diff @@ ## develop #3886 +/- ## =========================================== - Coverage 82.98% 82.97% -0.02% =========================================== Files 1788 1789 +1 Lines 45208 45222 +14 Branches 5334 5337 +3 =========================================== + Hits 37516 37522 +6 - Misses 5818 5825 +7 - Partials 1874 1875 +1 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.


🚨 Try these New Features:

ganfra commented 4 days ago

Mmmm, this is weird, this means there is an old client still referenced?

bmarty commented 4 days ago

Mmmm, this is weird, this means there is an old client still referenced?

Nothing is done to stop SendQueues. This is maybe another way to fix the issue?

Edit actually the scope should be destroyed, so this should work

ganfra commented 4 days ago

Mmmm, this is weird, this means there is an old client still referenced?

Nothing is done to stop SendQueues. This is maybe another way to fix the issue?

It's bound to the LoggedInFlowNode lifecycle, so this should be stopped at this point :/

jmartinesp commented 3 days ago

According to the logs, the user was signed out but the app wasn't killed, so when the app was opened again it tried restoring the saved navigation state, which I guess either returned a new instance or retained one of our MatrixClient wrappers and then failed when it tried using the underlying Client:

2024-11-16T14:01:24.646079Z ERROR elementx: [Push/Notification/DefaultNotifiableEventResolver] Unable to resolve event: ----.
java.lang.IllegalStateException: NotificationClient object has already been destroyed
    at org.matrix.rustcomponents.sdk.NotificationClient.getNotification(SourceFile:147)
    at io.element.android.libraries.matrix.impl.notification.RustNotificationService$getNotification$2.invokeSuspend(SourceFile:55)
    at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(SourceFile:9)
    at kotlinx.coroutines.DispatchedTask.run(SourceFile:107)
    at com.google.android.gms.tasks.zzc.run(SourceFile:49)
    at kotlinx.coroutines.scheduling.TaskImpl.run(SourceFile:3)
    at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(SourceFile:93)
 | 
2024-11-16T14:01:24.647532Z  WARN elementx: [Push/PushHandler] Unable to get a notification data | 
2024-11-16T14:02:31.132845Z  WARN elementx: [MainActivity] onCreate, with savedInstanceState: true | 
2024-11-16T14:02:31.158780Z  WARN elementx: [MainActivity] onNewIntent | 
2024-11-16T14:02:31.159743Z  WARN elementx: [MainActivity] onResume | 
2024-11-16T14:02:31.223778Z DEBUG elementx: Current app migration version: 7. No migration needed. | 
2024-11-16T14:02:31.240437Z  WARN elementx: [MainActivity] onMainNodeInit | 
2024-11-16T14:02:31.241957Z DEBUG elementx: Restore state | 
2024-11-16T14:02:31.242295Z  WARN elementx: Restore with non-empty map | 
2024-11-16T14:02:31.242762Z DEBUG elementx: [Lifecycle] onCreate RootFlowNode | 
2024-11-16T14:02:31.246803Z DEBUG elementx: [Lifecycle] onCreate LoggedInFlowNode | 
2024-11-16T14:02:31.247212Z DEBUG elementx: [Navigation] Navigating to session ---, Current state: Root | 
2024-11-16T14:02:31.247596Z DEBUG elementx: [Navigation] Navigating to space !mainSpace:local. Current state: Session(owner=---, sessionId=---) | 
2024-11-16T14:02:31.248198Z DEBUG elementx: [Lifecycle] onCreate LoggedInAppScopeFlowNode | 
2024-11-16T14:02:31.248975Z DEBUG elementx: Navigating to Room(sessionId=---, roomId=---, threadId=null) | 
2024-11-16T14:02:31.252992Z DEBUG elementx: [Lifecycle] onCreate RoomListNode | 
2024-11-16T14:02:31.253392Z  INFO elementx: setAllSendQueuesEnabled(true) | 
2024-11-16T14:02:31.253413Z DEBUG elementx: [Lifecycle] onCreate LoggedInNode | 
2024-11-16T14:02:31.255261Z DEBUG elementx: [Lifecycle] onCreate RoomFlowNode | 
2024-11-16T14:02:31.256300Z DEBUG elementx: [Lifecycle] onCreate ComposableNode | 
2024-11-16T14:02:31.257910Z DEBUG elementx: Room membership: JOINED | 
2024-11-16T14:02:31.259225Z DEBUG elementx: [Lifecycle] onCreate JoinedRoomFlowNode | 
2024-11-16T14:02:31.260418Z DEBUG elementx: [Lifecycle] onCreate ComposableNode | 
2024-11-16T14:02:31.261355Z DEBUG elementx: Room factory is destroyed, returning null for --- | 
2024-11-16T14:02:31.262613Z DEBUG elementx: Save matrix session keys = [---] | 
2024-11-16T14:02:32.749333Z TRACE elementx: Uncaught exception: java.lang.IllegalStateException: Client object has already been destroyed | 
ganfra commented 3 days ago

So there is probably a scenario where MatrixClientsHolder is not updated correctly

ganfra commented 2 days ago

I've made more investigations on this, and there are some conditions where, while replacing the LoggedInFlowNode, we can get several calls to a destroyed client, the time the node get removed from the composition.

So we have 2 choices :

This crash has also emphasised another issue with the refresh token, we need more logs (https://github.com/matrix-org/matrix-rust-sdk/pull/4301)

Edit : the issue with the refresh token might have been fixed https://github.com/matrix-org/matrix-rust-sdk/pull/4304

sonarcloud[bot] commented 17 hours ago

Quality Gate Passed Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud