realm / realm-core

Core database component for the Realm Mobile Database SDKs
https://realm.io
Apache License 2.0

Async Open should track when it is fully bootstrapped #6734

Open cmelchior opened 1 year ago

cmelchior commented 1 year ago

Async Open of synchronized Realms is an optimization that allows us to download and bootstrap a Realm the first time it is opened. Unfortunately, it behaves in a way that makes it hard for SDKs to use.

Some SDKs (at least Java, Kotlin, and Swift) use the presence of a Realm file to decide whether to use async open. This check breaks if the app crashes while async open is running and is then restarted, because it seems that a Realm file is created as soon as async open starts. This results in the following sequence of events:

  1. Open the app the first time.
  2. Check that no Realm file exists and start opening the Realm using Async Open.
  3. This will create the Realm file.
  4. The app crashes, interrupting Async Open. The file is still present.
  5. The app is restarted.
  6. We check that a Realm file exists and open it again normally, but it is still empty.
  7. This means that all data from the server will now be downloaded using normal operational transform, which can lead to delays of 10+ minutes if the amount of initial data is large.

Wanted solution: Some way for SDKs to check if Async Open was interrupted and restart it if possible.

Suggestions (others probably exist):

  1. Download the initial data into a differently named file and then swap it into place once the download is done. Perhaps appending ".downloading" to the file name.
  2. Have some metadata inside the Realm file that tracks whether or not async open was in progress and allow SDKs to react to it.
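
A minimal sketch of suggestion 1, written in Python for brevity rather than in any SDK's language. `download_initial_data`, `async_open`, and `needs_async_open` are hypothetical names for illustration, not Realm APIs. The idea is that the final path only appears once the download has finished, so the existing "does the file exist?" check stays valid after a crash.

```python
import os
import tempfile

DOWNLOADING_SUFFIX = ".downloading"  # suffix proposed in the issue text


def download_initial_data(path, fail=False):
    """Hypothetical stand-in for the bootstrap download (not a Realm API)."""
    with open(path, "wb") as f:
        f.write(b"partial")
        if fail:
            raise RuntimeError("simulated crash during async open")
        f.write(b" + rest of initial data")


def async_open(realm_path, fail=False):
    """Bootstrap into a side file and swap it into place only when done."""
    tmp_path = realm_path + DOWNLOADING_SUFFIX
    if os.path.exists(tmp_path):
        os.remove(tmp_path)  # leftover from an interrupted attempt
    download_initial_data(tmp_path, fail=fail)
    os.replace(tmp_path, realm_path)  # rename within the same directory


def needs_async_open(realm_path):
    """The existing SDK heuristic: async open only if no Realm file exists."""
    return not os.path.exists(realm_path)


def demo():
    """Crash once, restart, and report the heuristic after each attempt."""
    realm_path = os.path.join(tempfile.mkdtemp(), "default.realm")
    try:
        async_open(realm_path, fail=True)
    except RuntimeError:
        pass
    after_crash = needs_async_open(realm_path)
    async_open(realm_path)
    after_success = needs_async_open(realm_path)
    return after_crash, after_success
```

In this sketch the interrupted attempt leaves only the `.downloading` file behind, so the restart still chooses async open. The cost is that a partially downloaded file is discarded rather than resumed.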

Workaround: Currently, we advise users on Android to track the state in SharedPreferences: set a flag when opening the Realm for the first time and, if the app is restarted with this flag set, delete the Realm file and restart the download. This requires a small amount of boilerplate, and it can also cause the same data to be downloaded multiple times.
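
The workaround could look roughly like the following, again as a Python sketch rather than Android code. A marker file stands in for the SharedPreferences entry, and `download` and `open_local` are hypothetical callbacks standing in for async open and a normal open. Clearing the flag after a successful download is an assumption; the text above only says the flag is set on first open.

```python
import os
import tempfile


def open_realm_with_flag(realm_path, flag_path, download, open_local):
    """Bracket the first (async) open with a persisted flag."""
    if os.path.exists(flag_path) and os.path.exists(realm_path):
        os.remove(realm_path)  # previous bootstrap was interrupted: start over
    if not os.path.exists(realm_path):
        open(flag_path, "w").close()  # set the flag before the file is created
        download(realm_path)          # may crash, leaving the flag set
        os.remove(flag_path)          # assumed: clear the flag only on success
        return "async"
    open_local(realm_path)
    return "local"


def demo():
    """Crash during the first open, then restart twice."""
    d = tempfile.mkdtemp()
    realm = os.path.join(d, "default.realm")
    flag = os.path.join(d, "bootstrapping.flag")
    downloads = []

    def crashing_download(path):
        open(path, "wb").close()  # the Realm file appears immediately
        raise RuntimeError("simulated crash")

    def full_download(path):
        downloads.append(path)
        with open(path, "wb") as f:
            f.write(b"initial data")

    try:
        open_realm_with_flag(realm, flag, crashing_download, lambda p: None)
    except RuntimeError:
        pass
    second = open_realm_with_flag(realm, flag, full_download, lambda p: None)
    third = open_realm_with_flag(realm, flag, full_download, lambda p: None)
    return second, third, len(downloads)
```

Here the second launch deletes the empty file and bootstraps again, and the third launch opens the Realm normally. Anything the interrupted attempt had already fetched is downloaded again, which is the duplication cost mentioned above.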

tgoyne commented 1 year ago

We can probably use the existing sync progress information for this. We defer sending any upload messages until the initial download is complete, so progress_upload_client_version == 0 might work as a check to see if the initial download has ever completed.
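
As a rough illustration of that check (a Python sketch, not realm-core code): `SyncProgress` is a hypothetical container that models only the `progress_upload_client_version` value named above, and `should_use_async_open` is an assumed helper name.

```python
from dataclasses import dataclass


@dataclass
class SyncProgress:
    """Hypothetical view of the persisted sync progress (one field only)."""
    progress_upload_client_version: int = 0


def should_use_async_open(file_exists, progress):
    """Use async open if no file exists or no upload has ever been sent."""
    if not file_exists:
        return True
    # Uploads are deferred until the initial download completes, so a value
    # of 0 suggests the bootstrap never finished.
    return progress.progress_upload_client_version == 0
```

Note that this requires reading progress data out of an existing Realm file before deciding how to open it.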

danieltabacaru commented 1 year ago

That check is not 100% reliable. Download completion is marked by a MARK message, which is usually received before any UPLOAD message is sent (it also has to be ACK'd to persist between sessions). If the app is closed immediately after the MARK message is received, the next time the app starts it will open the realm using async open (instead of normally). But I think that's acceptable.

tgoyne commented 1 year ago

I think it's fine if we mark an async open as successfully completed very slightly later in the process than is strictly correct. Is it guaranteed that we will always eventually set the upload version even if there are no local writes, though?

danieltabacaru commented 1 year ago

There is going to be at least one write for the schema, so we can leverage that. IIUC opening a realm with no schema is not very common. EDIT: We could rely on the schema version instead (it is set at the end of async open), but the issue mentioned below still holds.

Side note: if we only want to use async open once when the realm is created, it will conflict with schema migrations where we require async open for phase one and the realm will most likely already exist.

tgoyne commented 1 year ago

No write will be performed to initialize the schema if it's an exact match for what the server sent us, as it'll already be initialized.

danieltabacaru commented 1 year ago

Right. We can probably find a different way. The issue with this kind of solution is that it requires opening the realm to read some data so we know how to open the realm 🙂