realm / realm-object-server

Tracking of issues related to the Realm Object Server and other general issues not related to the specific SDK's
https://realm.io
293 stars 42 forks source link

Long sync times after months of transactions. #328

Closed Jonsapps closed 6 years ago

Jonsapps commented 6 years ago

Goals

All of our apps use shared common synced realms between all our users. Some apps write regularly to the shared database creating a large amount of transactions. Because of the nature of our apps we need to have data synced before the app is ready to be used. Our current model is when the app is opened the synced realm is connected to and the initial sync begins, we monitor the sync notifications to trigger a progress bar and dismiss this once the sync is finished. In general this works very well and as expected if the app is opened regularly. However as time passes and more transactions are made against the database the initial data sync takes longer and longer when doing a clean install of the app. We are now at the point where our initial sync takes around 1 hour to complete, this is around 6 months worth of transactions.

If we create a new realm and copy over the data we find the sync is much quicker and so it appears to be relating to the sync of the entire transaction history as appose to the amount of data in the Realm.

Expected Results

Ultimately the sync shouldn't take so long, we can only force this becoming worse. It would be expected that only the most up to date transactions are synced.

Actual Results

The entire history of the realm is synced meaning over time this becomes larger and longer to sync.

Steps to Reproduce

Create a new Realm and continually write/delete/update data. Monitor the time taken for the initial sync of the Realm on a clean install of the app. Notice the time to sync increaes over time.

Code Sample

I can supply any samples, I'm unsure what would be relevant at this stage. I'm happy to supply any outputs from the ROS that might give insight into the transaction history.

Version of Realm and Tooling

Logs

systemctl status realm-object-server.service ● realm-object-server.service - Realm Object Server Loaded: loaded (/etc/systemd/system/realm-object-server.service; enabled; vendor preset: enabled) Active: active (running) since Fri 2017-11-03 12:21:54 UTC; 2 months 7 days ago Main PID: 1162 (realm-object-se) Tasks: 22 Memory: 43.2M CPU: 5min 48.739s CGroup: /system.slice/realm-object-server.service ├─1162 /usr/lib/realm-object-server-developer/node/bin/node /usr/bin/realm-object-server -c /etc/realm/configuration.yml └─1519 /usr/lib/realm-object-server-developer/node/bin/node /usr/lib/nodejs/realm-object-server-developer/node_modules/realm-functions/lib/wrapper Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailable.
sudo journalctl -u realm-object-server.service -- No entries --
cat /var/log/realm-object-server.log 2017-10-25T15:09:54.068Z - info: sync: HTTP Connection[1982]: Connection from 127.0.0.1:55100 2017-10-25T15:09:54.070Z - info: sync: HTTP Connection[1982]: Connection is closed after HTTP response. 2017-10-27T08:08:03.288Z - warn: auth.password: Handle request failed with: InvalidCredentials: The provided credentials are invalid or a user does not exist.: Given account: realm-admin 2017-10-31T11:55:16.595Z - warn: auth.password: Handle request failed with: InvalidCredentials: The provided credentials are invalid or a user does not exist.: Given account: realm-admin 2017-11-03T09:05:22.875Z - warn: auth.password: Handle request failed with: InvalidCredentials: The provided credentials are invalid or a user does not exist.: Given account: realm-admin 2017-11-03T12:21:22.159Z - info: Closing gracefully. Caught signal SIGTERM. 2017-11-03T12:21:56.902Z - info: Backup admin-Realm file written to /var/lib/realm/object-server/internal_data/oldAuth.realm.bak (in case of unexpected failures during the upgrade). 2017-11-03T12:21:57.197Z - info: sync: Realm sync server started ([realm-core-2.8.6], [realm-sync-1.10.5]) 2017-11-03T12:21:57.201Z - info: sync: Directory holding persistent state: /var/lib/realm/object-server/0/user_data 2017-11-03T12:21:57.201Z - info: sync: Listening on 127.0.0.1:44808 (sync protocol version 18) 2017-11-03T12:21:57.203Z - info: sync: Realm Object Server sync engine listening on 127.0.0.1:44808. 2017-11-03T12:21:57.210Z - info: Realm Object Server web server listening on 127.0.0.1:27080. 2017-11-03T12:21:57.213Z - info: proxy: http proxy listening on :::9080. 2017-11-03T12:21:57.228Z - info: permissions: Seed permission-Realms 2017-11-03T12:21:57.241Z - info: permissions: Completed seeding permission-Realms 2017-11-03T12:21:57.258Z - info: sync: HTTP Connection[1]: Connection from 127.0.0.1:40806 2017-11-03T12:21:57.265Z - info: sync: HTTP Connection[1]: Received: Sync HTTP request(protocol_version=18) 2017-11-03T12:21:57.265Z - info: sync: Sync Connection[1]: Session[1]: Session initiated (session_ident=1). 2017-11-03T12:21:57.266Z - info: sync: Sync Connection[1]: Session[1]: Received: BIND(server_path=/__admin, signed_user_token='...CyhxKXzObaG5eTwzCfDPpNhTefHQ==', need_file_ident_pair=0) 2017-11-03T12:21:57.266Z - info: sync: Sync Connection[1]: Session[1]: Received: IDENT(server_file_ident=1, client_file_ident=2, client_file_ident_secret=6727323928319869236, scan_server_version=14, scan_client_version=35, latest_server_version=14, latest_server_session_ident=4857784446302601620) 2017-11-03T12:21:57.266Z - info: sync-client: Opening Realm file: /var/lib/realm/object-server/internal_data/auth.realm 2017-11-03T12:21:57.266Z - info: sync-client: Connection[1]: Session[1]: Starting session for '/var/lib/realm/object-server/internal_data/auth.realm' 2017-11-03T12:21:57.267Z - info: sync-client: Connection[1]: Resolving '127.0.0.1:44808' 2017-11-03T12:21:57.267Z - info: sync-client: Connection[1]: Connecting to endpoint '127.0.0.1:44808' (1/1) 2017-11-03T12:21:57.271Z - info: sync-client: Connection[1]: Connected to endpoint '127.0.0.1:44808' (from '127.0.0.1:40806') 2017-11-03T12:21:57.271Z - info: sync-client: Connection[1]: Session[1]: Sending: BIND(server_path='/__admin', signed_user_token_size=469, need_file_ident_pair=0) 2017-11-03T12:21:57.271Z - info: sync-client: Connection[1]: Session[1]: Sending: IDENT(server_file_ident=1, client_file_ident=2, client_file_ident_secret=6727323928319869236, scan_server_version=14, scan_client_version=35, latest_server_version=14, latest_server_session_ident=4857784446302601620) 2017-11-03T12:21:57.271Z - info: sync-client: Opening Realm file: /var/lib/realm/object-server/tmp/realm-object-server/listener/__admin.realm 2017-11-03T12:21:57.271Z - info: sync-client: Connection[2]: Session[2]: Starting session for '/var/lib/realm/object-server/tmp/realm-object-server/listener/__admin.realm' 2017-11-03T12:21:57.272Z - info: sync-client: Connection[2]: Resolving ':::9080' 2017-11-03T12:21:57.272Z - info: sync-client: Connection[2]: Connecting to endpoint ':::9080' (1/1) 2017-11-03T12:21:57.277Z - info: sync-client: Connection[2]: Connected to endpoint ':::9080' (from '::1:58318') 2017-11-03T12:21:57.286Z - info: sync-client: Opening Realm file: /var/lib/realm/object-server/internal_data/permission/__auth.realm 2017-11-03T12:21:57.286Z - info: sync-client: Connection[3]: Session[3]: Starting session for '/var/lib/realm/object-server/internal_data/permission/__auth.realm' 2017-11-03T12:21:57.286Z - info: sync-client: Connection[3]: Resolving ':::9080' 2017-11-03T12:21:57.286Z - info: sync-client: Connection[3]: Connecting to endpoint ':::9080' (1/1) 2017-11-03T12:21:57.286Z - info: sync-client: Connection[3]: Connected to endpoint ':::9080' (from '::1:58320') 2017-11-03T12:21:57.287Z - info: sync-client: Opening Realm file: /var/lib/realm/object-server/internal_data/permission/__global.realm 2017-11-03T12:21:57.288Z - info: sync-client: Connection[4]: Session[4]: Starting session for '/var/lib/realm/object-server/internal_data/permission/__global.realm' 2017-11-03T12:21:57.288Z - info: sync-client: Connection[4]: Resolving ':::9080' 2017-11-03T12:21:57.288Z - info: sync-client: Connection[4]: Connecting to endpoint ':::9080' (1/1) 2017-11-03T12:21:57.288Z - info: sync-client: Connection[4]: Connected to endpoint ':::9080' (from '::1:58322') 2017-11-03T12:21:57.318Z - info: sync-client: Opening Realm file: /var/lib/realm/object-server/tmp/realm-object-server/listener/realms/__admin.realm 2017-11-03T12:21:57.319Z - info: sync-client: Connection[5]: Session[5]: Starting session for '/var/lib/realm/object-server/tmp/realm-object-server/listener/realms/__admin.realm' 2017-11-03T12:21:57.319Z - info: sync-client: Connection[5]: Resolving ':::9080' 2017-11-03T12:21:57.319Z - info: sync-client: Connection[5]: Connecting to endpoint ':::9080' (1/1) 2017-11-03T12:21:57.319Z - info: sync-client: Connection[5]: Connected to endpoint ':::9080' (from '::1:58324') 2017-11-03T12:21:57.376Z - info: sync: HTTP Connection[2]: Connection from 127.0.0.1:40818 2017-11-03T12:21:57.376Z - info: sync: HTTP Connection[3]: Connection from 127.0.0.1:40820 2017-11-03T12:21:57.376Z - info: sync: HTTP Connection[4]: Connection from 127.0.0.1:40822 2017-11-03T12:21:57.376Z - info: sync: HTTP Connection[5]: Connection from 127.0.0.1:40824 2017-11-03T12:21:57.380Z - info: sync: HTTP Connection[2]: Received: Sync HTTP request(protocol_version=18) 2017-11-03T12:21:57.381Z - info: sync: HTTP Connection[3]: Received: Sync HTTP request(protocol_version=18) 2017-11-03T12:21:57.381Z - info: sync: HTTP Connection[4]: Received: Sync HTTP request(protocol_version=18) 2017-11-03T12:21:57.381Z - info: sync: HTTP Connection[5]: Received: Sync HTTP request(protocol_version=18) 2017-11-03T12:21:57.394Z - info: sync-client: Connection[2]: Session[2]: Sending: BIND(server_path='/__admin', signed_user_token_size=469, need_file_ident_pair=0) 2017-11-03T12:21:57.394Z - info: sync-client: Connection[2]: Session[2]: Sending: IDENT(server_file_ident=1, client_file_ident=3, client_file_ident_secret=7118852584400628523, scan_server_version=14, scan_client_version=2, latest_server_version=14, latest_server_session_ident=4857784446302601620) 2017-11-03T12:21:57.394Z - info: sync-client: Connection[3]: Session[3]: Sending: BIND(server_path='/__auth/__permission', signed_user_token_size=469, need_file_ident_pair=0) 2017-11-03T12:21:57.394Z - info: sync-client: Connection[3]: Session[3]: Sending: IDENT(server_file_ident=1, client_file_ident=2, client_file_ident_secret=5634577793068629223, scan_server_version=10, scan_client_version=20, latest_server_version=10, latest_server_session_ident=8860255566073234533) 2017-11-03T12:21:57.396Z - info: sync-client: Connection[4]: Session[4]: Sending: BIND(server_path='/__permission', signed_user_token_size=469, need_file_ident_pair=0) 2017-11-03T12:21:57.396Z - info: sync-client: Connection[4]: Session[4]: Sending: IDENT(server_file_ident=1, client_file_ident=2, client_file_ident_secret=8412823719753081995, scan_server_version=8, scan_client_version=17, latest_server_version=8, latest_server_session_ident=1073487229620907151) 2017-11-03T12:21:57.397Z - info: sync-client: Connection[5]: Session[5]: Sending: BIND(server_path='/__admin', signed_user_token_size=469, need_file_ident_pair=0) 2017-11-03T12:21:57.397Z - info: sync-client: Connection[5]: Session[5]: Sending: IDENT(server_file_ident=1, client_file_ident=4, client_file_ident_secret=7884013206642083830, scan_server_version=14, scan_client_version=0, latest_server_version=14, latest_server_session_ident=4857784446302601620) 2017-11-03T12:21:57.398Z - info: sync: Sync Connection[2]: Session[2]: Session initiated (session_ident=2). 2017-11-03T12:21:57.399Z - info: sync: Sync Connection[2]: Session[2]: Received: BIND(server_path=/__admin, signed_user_token='...CyhxKXzObaG5eTwzCfDPpNhTefHQ==', need_file_ident_pair=0) 2017-11-03T12:21:57.399Z - info: sync: Sync Connection[2]: Session[2]: Received: IDENT(server_file_ident=1, client_file_ident=3, client_file_ident_secret=7118852584400628523, scan_server_version=14, scan_client_version=2, latest_server_version=14, latest_server_session_ident=4857784446302601620) 2017-11-03T12:21:57.399Z - info: sync: Sync Connection[3]: Session[3]: Session initiated (session_ident=3). 2017-11-03T12:21:57.399Z - info: sync: Sync Connection[3]: Session[3]: Received: BIND(server_path=/__auth/__permission, signed_user_token='...CyhxKXzObaG5eTwzCfDPpNhTefHQ==', need_file_ident_pair=0) 2017-11-03T12:21:57.412Z - info: sync: Sync Connection[3]: Session[3]: Received: IDENT(server_file_ident=1, client_file_ident=2, client_file_ident_secret=5634577793068629223, scan_server_version=10, scan_client_version=20, latest_server_version=10, latest_server_session_ident=8860255566073234533) 2017-11-03T12:21:57.415Z - info: sync: Sync Connection[4]: Session[4]: Session initiated (session_ident=4). 2017-11-03T12:21:57.415Z - info: sync: Sync Connection[4]: Session[4]: Received: BIND(server_path=/__permission, signed_user_token='...CyhxKXzObaG5eTwzCfDPpNhTefHQ==', need_file_ident_pair=0) 2017-11-03T12:21:57.416Z - info: sync: Sync Connection[5]: Session[5]: Session initiated (session_ident=5). 2017-11-03T12:21:57.416Z - info: sync: Sync Connection[5]: Session[5]: Received: BIND(server_path=/__admin, signed_user_token='...CyhxKXzObaG5eTwzCfDPpNhTefHQ==', need_file_ident_pair=0) 2017-11-03T12:21:57.417Z - info: sync: Sync Connection[4]: Session[4]: Received: IDENT(server_file_ident=1, client_file_ident=2, client_file_ident_secret=8412823719753081995, scan_server_version=8, scan_client_version=17, latest_server_version=8, latest_server_session_ident=1073487229620907151) 2017-11-03T12:21:57.417Z - info: sync: Sync Connection[5]: Session[5]: Received: IDENT(server_file_ident=1, client_file_ident=4, client_file_ident_secret=7884013206642083830, scan_server_version=14, scan_client_version=0, latest_server_version=14, latest_server_session_ident=4857784446302601620) 2017-11-03T12:21:57.461Z - info: sync-client: Closing Realm file: /var/lib/realm/object-server/internal_data/permission/__global.realm 2017-11-03T12:21:57.461Z - info: sync-client: Connection[4]: Session[4]: Sending: UNBIND 2017-11-03T12:21:57.466Z - info: sync: Sync Connection[4]: Session[4]: Session terminated (session_ident=4). 2017-11-03T12:21:57.466Z - info: sync: Sync Connection[4]: Connection closed by client: End of input 2017-11-03T12:22:37.918Z - info: sync-client: Closing Realm file: /var/lib/realm/object-server/internal_data/permission/__auth.realm 2017-11-03T12:22:37.919Z - info: sync-client: Connection[3]: Session[3]: Sending: UNBIND 2017-11-03T12:22:37.920Z - info: sync: Sync Connection[3]: Session[3]: Session terminated (session_ident=3). 2017-11-03T12:22:37.923Z - info: sync: Sync Connection[3]: Connection closed by client: End of input
bigfish24 commented 6 years ago

Can you enable log compaction on this server following these instructions: https://github.com/realm/realm-object-server/issues/127#issuecomment-349630076

Apologize it isn’t documented and I will rectify that.

Jonsapps commented 6 years ago

I'm not seeing any change by adding this to my index.js. The sync time after adding the following line maxDownloadSize: 10000000 Is still at 56 mins.

Am I using this correctly?

bigfish24 commented 6 years ago

Try setting it to an even larger value, this is 10MB, but your dataset has a log that might exceed this amount, so continue to try larger values and see if that improves things, say 100MB then 1GB.

Jonsapps commented 6 years ago

When I set the maxDownloadSize to 10MB it was around 56Mins I also tried 2GB which took longer to start and seemed to download quickly then spent the time (I assume) processing. End to end this took 1 hour and 1 min. I have tried varying sizes between 10Mb and 2GB which appear to add or remove a few mins but no big benefit. What I did notice was that if I set the limit to be very high, such as 5GB the server would run out of memory and grind to a halt. Am I right in thinking ROS compacts in memory up to the max value then sends down the change sets to the client for it to add to it's local Realm? This is how I arrived at 2GB as my max value because I have 3GB of RAM on my AWS instance, which meant it didn't use up all the system RAM when the sync starts as it did when I set it to 5GB.

The other thing I don't understand is due to my issues migrating my data (Issue 338) I have resorted to copy the data from my old realm to my new realm which should mean the transaction log is minimal, the realm file once the sync is complete is 469MB yet the network traffic to the device during the sync reached 2GB.

Ultimately I am seeing very little real world benefit from this setting at the moment. If it helps I can provide connection details to my ROS since it is hosted on AWS.

simonask commented 6 years ago

This is highly valuable information, that you for reporting it!

Is there anything you can tell us about your data model and how you use Realm? For example, log compaction is able to eliminate redundant transaction log entries, but if your Realm file really does contain many, many small objects that are still alive, and that aren't modified so frequently, there isn't much it can achieve.

In general, we expect a minimal transaction log stream to be about 1.5x the size of the data it creates in the Realm file when all redundancy is eliminated. If your complete Realm file is 469 MB, I would expect at least 700 MB of data to be transferred, but certainly not 2 GB (fuzz factors could be things like lots of large binary data objects).

Even so, 700 MB is a lot over a mobile connection. Is it feasible to split your dataset into multiple Realm files?

Jonsapps commented 6 years ago

We use shared realms for our projects and treat Realm as a replacement for a traditional database. This means all our users write to the same realm. We have objects in our realm that have images so this could account for the larger size. Objects are added and read but they are very rarely modified.

Splitting the data might be something I need to explore more but I have been unable to think up a solution that works. The way we use realm means we need to sync all the data before the user can use the app because we manage user accounts and permissions as realm objects (so users can't even login until the realm has synced and the permissions are available to check) Is it even possible to sync multiple realms?

simonask commented 6 years ago

Is it even possible to sync multiple realms?

Yes, that is definitely possible. :-) It may be a little inconvenient if your app is written to rely heavily on the "default" Realm though. Please consult your relevant platform SDK docs.

In general, I think we discourage putting large static binary things like images in the Realm file. We haven't spent a lot of effort optimizing that use case because it is usually easy enough to put such files in a dedicated storage like S3 and store URLs in the Realm file instead.

One reason why large binary objects are less than ideal is that Realm transfers updates linearly (no "out-of-order" object transfers), so binary objects are transferred in the same order they were put in the Realm file. But it's likely that the user is more interested in receiving the app data first and then things like pictures on-demand. The end result could be a very sluggish experience, since the priority of data transfers may not match the priorities of the UI.

We have considered adding a "prioritized" data transfer mode and I think it is still something we want to do eventually, but it is relatively complex without sacrificing transactional integrity in some way, and also risks complicating app code by introducing asynchronicity.

Jonsapps commented 6 years ago

Ok this gives me some optimisation to try. Because I am having to copy data from my old realm to my new realm it gives me the opportunity to manipulate the data before it is transferred. I will do that and see what difference I see in the sync time. The Realm.AsyncOpen (swift) function is what I'm having to use now because without it and opening the realm in the normal manner causes ROS to crash for some reason. Is there any way to monitor sync progress when using AsyncOpen and is it right that AsyncOpen is an all or nothing sync, in that if I close the app half way through the initial sync it will start again from the beginning on the next launch until it's completed?

bigfish24 commented 6 years ago

asyncOpen is all or nothing. To get progress you have to open normally then add a progress notification callback. It would be great if you could provide the logs of what happens to ROS when you do this and it crashes.

Jonsapps commented 6 years ago

Sure thing, where are the logs stored?

Jonsapps commented 6 years ago

I have ran the transfer of data from my old realm to my new instance. While copying the data I set all Data properties to nil. The sync still took the same amount of time, just shy of 1 hour, yet the local realm when the sync finished is only 269MB instead of 469MB I would expect the sync to have been much quicker if the end realm size is so much smaller.

I ran this with the maxDownloadSize set at 2GB and 1GB. Since removing the image data I appear to be able to run a standard sync (not using AsyncOpen) without killing the server. Although a standard sync isn't firing the progress notification (I raised that issue here Issue 5454) so I'm unable to see the total number of bytes expected for the sync.

Jonsapps commented 6 years ago

From what I can see the download of the data is pretty quick but the majority of the time is spent with the console displaying messages like the following

2018-01-26 00:31:16.829780+0000 Hub[13913:13266238] Sync: Connection[1]: Session[1]: table = group->get_table(table_for_class_name("User"));

What task is being preformed there? Is that when it is writing the downloaded change set into the local realm?

bigfish24 commented 6 years ago

How are you running the server (to get the logs)?

simonask commented 6 years ago

The sync still took the same amount of time, just shy of 1 hour, yet the local realm when the sync finished is only 269MB instead of 469MB I would expect the sync to have been much quicker if the end realm size is so much smaller.

Yes, unfortunately we don't persist the compacted history on the server yet, so it just stores the full stream of updates (including large binary objects that get overwritten anyway). Realm works like Git in this sense.

We are definitely planning to let the server be more clever about this in the future, but it hasn't been the highest priority so far given that it mainly becomes a serious problem when there are many large binary objects in the database. We are also currently working on other features that partially mitigate the issue by letting clients download only exact the necessary amount of data. :-)

Jonsapps commented 6 years ago

Shouldn't I have a clean transaction history because of the way I am migrating my data? I am copying values from a local realm file which is a compacted realm from my ROS V1 before they are copied to the ROS V2 the binary objects are removed and the new object inserted into ROS V2. So shouldn't I have a ROS V2 that is unaware of the binary objects and has no history of them?

Jonsapps commented 6 years ago

I am running my ROS V2 instance using "pm2 start /src/index.js"

simonask commented 6 years ago

Shouldn't I have a clean transaction history because of the way I am migrating my data? I am copying values from a local realm file which is a compacted realm from my ROS V1 before they are copied to the ROS V2 the binary objects are removed and the new object inserted into ROS V2. So shouldn't I have a ROS V2 that is unaware of the binary objects and has no history of them?

Ah, gotcha, I misunderstood, sorry about that. Yes you are right, you should see a minimal transaction log history in the new Realm file.

Jonsapps commented 6 years ago

Which suggests that because the transaction history is now much smaller, the reason for the sync taking so long ins't due to downloading the data and is due to the time it's taking to process that data once it's downloaded. Could the design of my realm schema cause this?

simonask commented 6 years ago

Yes, I agree with your analysis. The schema can definitely have an impact. Do you see high CPU usage when waiting for the download to be processed? Are you able to share the schema? Feel free to send it to us at help@realm.io if you are not able to share it publicly. :-)

Jonsapps commented 6 years ago

Yes, after the download (when the network activity stops) I see high CPU usage. How can I best share the schema with you? Would you need the swift classes? Or I can use realm studio to output the classes to a different language?

simonask commented 6 years ago

Swift models are fine :) It's also very useful if you can say anything about how the database is most frequently modified - for example, if you prepend items to collections a lot, that might have something like quadratic performance.

Jonsapps commented 6 years ago

Thank you, I have emailed over my schema

simonask commented 6 years ago

The issue turned out to be a very aggressive log level, which caused the client to spend a lot of time just printing log messages while integrating changes coming from the server. Closing this. :-)