realm / realm-dart

Realm is a mobile database: a replacement for SQLite & ORMs.
Apache License 2.0
MultipleSyncAgents: Multiple sync agents attempted to join the same session #1708

Open Navil opened 3 months ago

Navil commented 3 months ago

What happened?

I am seeing crashes for my users related to Realm. I do not know how to reproduce it, but it is a crash that happens often enough to push my app above the Android Store threshold. This makes this topic relatively important for us.

Repro steps

-

Version

Flutter - Channel stable, 3.22.1

What Atlas Services are you using?

Both Atlas Device Sync and Atlas App Services

What type of application is this?

Flutter Application

Client OS and version

e.g. Redmi gale (Redmi 13C) Android 13 (SDK 33)

Code snippets

No response

Stacktrace of the exception/crash you're getting

terminating due to uncaught exception of type realm::MultipleSyncAgents: Multiple sync agents attempted to join the same session

*** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
pid: 0, tid: 16329 >>> at.navil.aupair <<<

backtrace:
  #00  pc 0x000000000008ce34  /apex/com.android.runtime/lib64/bionic/libc.so (abort+164)
  #01  pc 0x0000000000be1364  /data/app/~~vukLyGJKm3ZahZDwjFA-IQ==/at.navil.aupair-2PjF-p25ZVY8Q72AyUDHvg==/split_config.arm64_v8a.apk!librealm_dart.so (BuildId: 3d1db96d07fa0c70b9d537bf8d1503b9fda596d8)
  #02  pc 0x0000000000be0c58  /data/app/~~vukLyGJKm3ZahZDwjFA-IQ==/at.navil.aupair-2PjF-p25ZVY8Q72AyUDHvg==/split_config.arm64_v8a.apk!librealm_dart.so (BuildId: 3d1db96d07fa0c70b9d537bf8d1503b9fda596d8)
  #03  pc 0x0000000000be0b14  /data/app/~~vukLyGJKm3ZahZDwjFA-IQ==/at.navil.aupair-2PjF-p25ZVY8Q72AyUDHvg==/split_config.arm64_v8a.apk!librealm_dart.so (BuildId: 3d1db96d07fa0c70b9d537bf8d1503b9fda596d8)
  #04  pc 0x0000000000bfa0d4  /data/app/~~vukLyGJKm3ZahZDwjFA-IQ==/at.navil.aupair-2PjF-p25ZVY8Q72AyUDHvg==/split_config.arm64_v8a.apk!librealm_dart.so (__cxa_rethrow+220) (BuildId: 3d1db96d07fa0c70b9d537bf8d1503b9fda596d8)
  #05  pc 0x00000000006b94e0  /data/app/~~vukLyGJKm3ZahZDwjFA-IQ==/at.navil.aupair-2PjF-p25ZVY8Q72AyUDHvg==/split_config.arm64_v8a.apk!librealm_dart.so (BuildId: 3d1db96d07fa0c70b9d537bf8d1503b9fda596d8)
  #06  pc 0x00000000006c3dc0  /data/app/~~vukLyGJKm3ZahZDwjFA-IQ==/at.navil.aupair-2PjF-p25ZVY8Q72AyUDHvg==/split_config.arm64_v8a.apk!librealm_dart.so (BuildId: 3d1db96d07fa0c70b9d537bf8d1503b9fda596d8)
  #07  pc 0x00000000006b8b24  /data/app/~~vukLyGJKm3ZahZDwjFA-IQ==/at.navil.aupair-2PjF-p25ZVY8Q72AyUDHvg==/split_config.arm64_v8a.apk!librealm_dart.so (BuildId: 3d1db96d07fa0c70b9d537bf8d1503b9fda596d8)
  #08  pc 0x00000000006b9d88  /data/app/~~vukLyGJKm3ZahZDwjFA-IQ==/at.navil.aupair-2PjF-p25ZVY8Q72AyUDHvg==/split_config.arm64_v8a.apk!librealm_dart.so (BuildId: 3d1db96d07fa0c70b9d537bf8d1503b9fda596d8)
  #09  pc 0x00000000000fbbcc  /apex/com.android.runtime/lib64/bionic/libc.so (__pthread_start(void*)+204)
  #10  pc 0x000000000008e670  /apex/com.android.runtime/lib64/bionic/libc.so (__start_thread+64)

Relevant log output

No response

sync-by-unito[bot] commented 3 months ago

➤ PM Bot commented:

Jira ticket: RDART-1049

nirinchev commented 3 months ago

This exception gets thrown when you have multiple processes trying to open the same synchronized realm. This is not yet supported, though we do have work ongoing to enable it. My best guess here would be that you have a widget or a notification handler that's trying to open the Realm while the main app is also using it.

Navil commented 3 months ago

Thanks for the response. This is not the case in my app. I am just opening the realm once my app opens. Are you suggesting that I might call the Realm() constructor more than once?

nirinchev commented 3 months ago

Calling it in the same app/process multiple times is totally fine. If your app doesn't offer any multiprocess support, it's not obvious to me why you'd get that error. I'll need to check in with our android experts to see if they have an idea what could be going on.

Navil commented 3 months ago

@nirinchev I just remembered that there also was a similar ticket in realm-core (https://github.com/realm/realm-core/issues/7190). Not sure if it helps

serhii-k commented 2 months ago

Hello @Navil. Can you please do the following:

  1. Start your app and let it load
  2. Close your app with the "Back" button or the corresponding gesture
  3. Immediately start your app again (e.g. by pressing the app icon on the home screen)

Does it crash on the second open? If so, please check the debug logs to see if there was a MultipleSyncAgents exception.

Navil commented 2 months ago

Hey @serhii-k. My app does not close when I press the "Back" button. However I tried the following things:

  1. Multitask away and directly click the app icon
  2. Press the home button and directly click the app icon
  3. Close the app from the multitasking view and directly click the app icon

None of those led to the issue.

tanguy-penfen commented 2 months ago

Hello, I recently encountered this error while developing on a realme 8 phone. It happens any time I hot-reload Flutter while the phone doesn't have an internet connection. It doesn't happen when the phone is online and syncing normally.

serhii-k commented 2 months ago

Thank you @Navil for trying.

I see this error in a very simple use case. I'm creating a flexible sync realm at the beginning of the main() function, and immediately closing this realm (so there shouldn't be any process connected to the realm). Then the app simply shows an empty scaffold.

If I close this app with a "Back" button gesture, the app crashes with this error on the second run. I've tested on emulators with SDK 33 and 34.

Also, as @tanguy-penfen mentioned, this error is easily reproducible with the hot-reload feature. In my case it happens even with an internet connection.

However, if I add a call to realm.subscriptions.update(), followed by a call to waitForSynchronization(), this error doesn't occur on the second run.

Here is the code for anyone who can investigate this issue:

import 'package:flutter/material.dart';
import 'package:realm/realm.dart';
import 'package:realm_test/model/model.dart';

Future<void> main() async {
  WidgetsFlutterBinding.ensureInitialized();

  // Create the app.
  // Obtain the current user.

  final config = Configuration.flexibleSync(
    user,
    [Point.schema],
  );
  final realm = Realm(config);

  // If uncommented, there is no MultipleSyncAgents error.
  // realm.subscriptions.update((mutableSubscriptions) {});
  // await realm.subscriptions.waitForSynchronization();

  realm.close();

  runApp(const MaterialApp(home: Scaffold()));
}

nielsenko commented 2 months ago

@serhii-k Could I ask you to try building the app with this pattern:

final app = App.getById(appId) ?? App(AppConfiguration(appId, ...));

and report back.

tanguy-penfen commented 2 months ago

@nielsenko In my case the issue disappeared with your pattern.

nielsenko commented 2 months ago

@tanguy-penfen Can you add a print of the current isolate hash code just before? I.e.

// Isolate comes from dart:isolate, so the file needs: import 'dart:isolate';
print(Isolate.current.hashCode);
final app = App.getById(appId) ?? App(AppConfiguration(appId, ...));

and report the values traced?

tanguy-penfen commented 2 months ago

@nielsenko I'm sorry, I'm not sure what you are asking. This value changes at each hot reload.

serhii-k commented 2 months ago

@nielsenko Awesome! This pattern actually worked great. I couldn't reproduce the error any more.

As for the isolate hash, on my part it also changes for every run. Here are some values:

802758700
217320362
144389734
497655921
69002942

nielsenko commented 2 months ago

Thanks for confirming that the isolate changes. I would expect the same.

BTW, be aware that the trick of calling App.getById first is not really a good solution, as the cached app on the native side holds on to an object from the previous isolate (the HttpClient), and once that is needed (it is used during authentication) we are bound for trouble.

We will need some time to replicate and discuss this.

serhii-k commented 1 month ago

It seems to be related not only to Android. I've started testing Realm on an iOS device, and by restarting a simple Flutter project, the same issue arises (I'm pressing Ctrl+R in Android Studio while the app is already running).

It only happens with this setup: final app = App(AppConfiguration(appId));

The simple code is mentioned in this comment.

And the exception:

libc++abi: terminating with uncaught exception of type realm::MultipleSyncAgents: Multiple sync agents attempted to join the same session
* thread #32, stop reason = signal SIGABRT
    frame #0: 0x00000001bb854bbc libsystem_kernel.dylib`__pthread_kill + 8
libsystem_kernel.dylib`:
->  0x1bb854bbc <+8>:  b.lo   0x1bb854bd8               ; <+36>
    0x1bb854bc0 <+12>: stp    x29, x30, [sp, #-0x10]!
    0x1bb854bc4 <+16>: mov    x29, sp
    0x1bb854bc8 <+20>: bl     0x1bb85060c               ; cerror_nocancel
Target 0: (Runner) stopped.

dotjon0 commented 1 month ago

We have seen instances of this error too on hot restart (SHIFT + R) - will copy the stack trace when we see it again. We have to kill the app and 'flutter run' to start up from scratch to get past this error.

mcmah309 commented 1 month ago

Also get this error on hot restart.

final app = App.getById(appId) ?? App(AppConfiguration(appId));

Seems to fix it.

Navil commented 1 month ago

However, as mentioned before, this is not a good solution. But it might be better than the app crashing. @nielsenko Do you know what kind of trouble we will run into using the workaround?

nielsenko commented 1 month ago

The problem is that the Client passed with AppConfiguration (either explicitly or implicitly) belongs to the old isolate, but is still the one used by realm-core when doing auth-related stuff after the restart. It doesn't happen often, but it eventually will when it needs to refresh tokens. Hence I suspect trouble.

Remember that only the main isolate dies, not the process. This happens either when exiting and re-entering an app on Android, or when hot-restarting the app on any platform (which of course only happens for a developer, not a user). In particular, there is a lot of cached stuff that still exists in core, which is why we end up here.

My work-around might actually work if the old isolate can be kept alive under the hood, so that core can still call into it, but that will just be an ugly work-around, and I'm not sure it is even possible.

The correct way to fix this is in core. I know they have been looking into it, but no fix exists yet. There is a lot of ongoing work regarding smarter multiprocess sync that is overlapping with this.

nielsenko commented 1 month ago

@Navil That said, I'm fairly certain you can lower the frequency of crashes with my work-around. This might bring you on the right side of the Android Store threshold, which I assume has value in itself.

nielsenko commented 1 month ago

I spent some time today investigating when this regression was introduced. It turns out it was introduced with realm 2.1.0, which uses realm-core 14.5.2.

The actual commit in realm-core is: https://github.com/realm/realm-core/commit/255cb33d838122e23fae4685dc253d708aac435e

Now that we have pinpointed when the issue was introduced, I suspect a fix will land soon, but if you feel the need, you can downgrade to 2.0.0.

Due to a breaking change in sane_uuid 1.0.0 that is not reflected in the version number, downgrading is slightly complicated, but you can do it by adding the following dependency overrides to your pubspec.yaml:

dependency_overrides:
  sane_uuid: 1.0.0-alpha.5
  ejson: 0.3.0
  realm: 2.0.0
  realm_common: 2.0.0
  realm_dart: 2.0.0
  realm_generator: 2.0.0

Also, don't use a newer version of Dart than 3.4.4 (i.e. Flutter 3.22.3), as Dart 3.5 removed the deprecated UnmodifiableByteBufferView (see https://github.com/dart-lang/sdk/blob/main/CHANGELOG.md#darttyped_data).

nielsenko commented 1 month ago

Related to https://github.com/realm/realm-core/issues/7190

elijahloola commented 2 days ago

final app = App.getById(appId) ?? App(AppConfiguration(appId, ...));

Can you please illustrate with an example? I have the same problem. This could be the solution but I don't really understand how to do it. Thank you
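
For illustration, here is a minimal sketch of how the pattern quoted above might fit into main(). It is not an official fix. The appId value is a hypothetical placeholder, and anonymous login is an assumption made only to keep the sketch short; use whatever credentials your app already uses.

```dart
import 'package:flutter/material.dart';
import 'package:realm/realm.dart';

// Hypothetical placeholder: replace with your own App Services app ID.
const appId = 'your-app-id';

Future<void> main() async {
  WidgetsFlutterBinding.ensureInitialized();

  // Reuse the App instance cached on the native side if one survived an
  // isolate restart (hot restart, or leaving and re-entering the app on
  // Android). Only create a new App when no cached instance exists.
  final app = App.getById(appId) ?? App(AppConfiguration(appId));

  // Assumption for this sketch: anonymous authentication.
  final user = app.currentUser ?? await app.logIn(Credentials.anonymous());

  // From here on, open the realm as usual, for example with
  // Configuration.flexibleSync(user, [...your schemas...]).
  debugPrint('Logged in as ${user.id}');

  runApp(const MaterialApp(home: Scaffold()));
}
```

Keep in mind nielsenko's caveat earlier in this thread: the cached App still holds on to the HttpClient of the previous isolate, so this may only lower the crash frequency and could cause trouble later, for example when tokens need to be refreshed.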