realm / realm-js

Realm is a mobile database: an alternative to SQLite & key-value stores
https://realm.io
Apache License 2.0
5.62k stars 558 forks source link

React Native realm bulk insert to local database is slower than expected #2154

Open HughBerriez opened 5 years ago

HughBerriez commented 5 years ago

Expected Results

Branching off of #2009

To create and insert 1000 documents, and in the long run, 35000+ documents in to a local realm db (currently an android device) in a quick timeframe using Realm's JavaScript, React Native api layer.

This all call and mapping will only happen on initial app login & load in order to support an offline first methodology.

Actual Results

Took ~47 seconds to create and insert 1000 objects into realm local db when performing a benchmark.

I've found the following https://stackoverflow.com/questions/29214236/how-to-add-1-milion-items-in-realm-correctly question on StackOverflow - upon looking in to the proposed solutions and through some of the other Realm language's API docs, such as Java, it looks like there are methods exposed that the JavaScript layer doesn't have access to - could this be my problem?

I was hoping to use this over an SQLite localDB solution b/c zero-copy sounds extremely beneficial long term.

Any suggestions on how I would speed this up? Thanks!

Steps to Reproduce

The code sample below will show my:

Code Sample

Instantiation

import Realm from 'realm';
import { ListingsAllCall } from '../utils';

// Only 2 non optional properties: lid, pid, lat && ldt
const ListingsAllCallSchema = {
  name: ListingsAllCall,
  properties: {
    ad1: 'string?',
    age: 'string?',
    agp: 'string?',
    agt: 'string?',
    aid: 'string?',
    bbc: 'string?',
    bed: 'int?',
    bkg: 'string?',
    bld: 'string?',
    bmt: 'bool?',
    bth: 'int?',
    cpl: 'bool?',
    cty: 'string?',
    dom: 'string?',
    dty: 'string?',
    esl: 'string?',
    est: 'string?',
    hfe: 'int?',
    hoa: 'bool?',
    hsl: 'string?',
    img: { type: 'string[]', optional: true },
    lat: 'float',
    ldt: 'string',
    lid: 'string?',
    lon: 'float?',
    lsz: 'int?',
    ltp: 'string?',
    lvl: 'int?',
    mls: 'string?',
    mod: 'string?',
    msl: 'string?',
    num: 'string?',
    opr: 'int?',
    pid: 'string',
    pkc: 'int?',
    pkg: 'int?',
    pks: 'int?',
    pol: 'bool?',
    pps: 'float?',
    pri: 'int?',
    ptp: 'string?',
    rmk: 'string?',
    sdp: 'int?',
    sds: 'string?',
    sft: 'int?',
    sid: 'int?',
    ste: 'string?',
    str: 'int?',
    sub: 'string?',
    syn: 'bool?',
    xad: 'bool?',
    yrb: 'int?',
    zip: 'string?'
  }
};

const realmConfigs = {
  schema: [ListingsAllCallSchema]
};

export default new Realm(realmConfigs);

Implementation


import realm from '../configs/realm';

export const handleListingsAllCall = async data => {
  console.log('Map all call');
  console.log(data);

  try {
   //Items is an array of objects. The ListingsAllCall in the realm.create method is the exact schema to be mapped to
  const { items } = data;

  realm.write(() => {
    console.log('ListingAllCall map to realm start.');

    // The following takes 47 seconds to complete.
    items.forEach(obj => {
      realm.create(ListingsAllCall, obj);
     });
   });

   let t1 = performance.now();
   console.log(`Finished writing in ${(t1 - t0) / 1000} seconds...`);
  } catch (err) {
    console.log(err);
  }
};

Redux Async Action Dispatch

export const initialListingsAllCall = () => {
  return async dispatch => {
    const result = await listingsActiveAllCallAPI();
    await handleListingsAllCall(result);
    dispatch({ type: INITIAL_LISTINGS_ALL_CALL });
  };
};

Version of Realm and Tooling

bmunkholm commented 5 years ago

You got a bunch of properties there, and it's not entirely clear how big the strings and image is and such. Did you try to benchmark this with SQLite as well?

HughBerriez commented 5 years ago

@bmunkholm Yeah, I feel that.. 53 is more than a few for sure. That being said, I just benchmarked it today with SQLite, and compared to Realm.create's 47 secs, SQLite took 0.7 seconds.

bmunkholm commented 5 years ago

Wauu yeah that looks like we messed up something! In general all other SDKs beats or equals SQLite, so it looks like there is room for improvement! :-) It would be extremely helpful if you could provide us the full project including the SQLite comparison so we could profile iwith your data.

HughBerriez commented 5 years ago

@bmunkholm Haha awesome to hear! And I'd love to help - dunno if my company will allow this though as it's not my code and endpoints I'd be providing - you all may be on your own there.

Thanks again for all the help and good luck!

SogoGolf commented 5 years ago

i just came across this when snooping around to see if anyone else was having reallllly slowwww realmjs api performance. we've got a nodejs/realmjs api data upload for a migration that is currently underway (to a ROS based server solution) that's gone from ~ 10min max (~ 250,000 rows with lots of transforms, fields etc) to now about 30-40min+. no code changes but we just bumped the realmjs version to 2.21.0

im not 100% sure if the realmjs api is the reason yet, but thought id note it here. will revert to a prior version, do some checks and post back here when done

bmunkholm commented 5 years ago

Thanks for the note @SogoGolf - that will be very useful to know if we introduced a regression.

HughBerriez commented 5 years ago

@bmunkholm Hey, quick update - looks like the SQLite insert transaction of 28k records took 0.7 secs to insert in memory, then upon close of the transaction where it writes to the actual .db, it takes over 60 seconds. Apologies for this oversight.

Still, as I dug deeper in to this issue, it seems optimization of this write time is possible in SQLite3 by quite a drastic amount, sadly, again, those methods aren't exposed for me to utilize at the JS NPM package level from what I can yet tell.

Hoping to give Realm another chance if you think there's a chance the 47 sec write may drop in future updates. Thanks again!

This link is the way to natively make SQLite3 a workhorse - sadly, I'm unable to do the following suggestions at the JS binding level

https://stackoverflow.com/questions/1711631/improve-insert-per-second-performance-of-sqlite?rq=1

bmunkholm commented 5 years ago

@HughBerriez Thanks for the update! In general Both Realm and SQlite are quite optimized, and I expect them to roughly equally fast for writes as that will most likely be dominated by fsync'ing to the physical filesystem.

But curious to hear if @SogoGolf can confirm a regression?

HughBerriez commented 5 years ago

@bmunkholm Gotcha - much appreciated!

HughBerriez commented 5 years ago

@bmunkholm I hate to say it... I made a mistake. I forgot that the benchmark I performed for Realm was only on 1000 records, and took 47 secs. It took SQLite about 68 secs on average to insert 28,000 records. Rediscovered this the hard way. Let me know if you find anything in the near future for why this may be!

bmunkholm commented 5 years ago

@HughBerriez Ahh - you disappointed me there... Well it would be super much simpler if you could share your benchmark. I can't see how you are using SQLite or Realm so it's hard to spot any (other :-) ) things that could explain the difference. If you could share an anonymized version of your benchmark that would make it more likely that we can look into it sooner.

waltermvp commented 5 years ago

@SogoGolf what version did you bump from? Im running into the same. It takes about 7 seconds to save 30 lcoations on an iphone 7,

export class FSLocation {
  static schema = {
    name: "FSLocation",
    properties: {
      id: "string",
      name: "string",
      age: { type: "int", default: 0 },
      lat: { type: "double", default: 0 },
      lon: { type: "double", default: 0 },
      hereNowCount: { type: "int", default: 0 },
      verified: { type: "bool", default: 0 }
    },
    primaryKey: "id"
  };
const storeLocations = (response, callback) => {
  console.log("Will open");
  Realm.open({ schema: [FSLocation] })
    .then(realm => {
      console.log("opened realm");
      // Create Realm objects and write to local storage
      realm.write(() => {
        response.venues.map(venue => {
          const location = realm.create(
            "FSLocation",
            {
              id: venue.id,
              name: venue.name,
              lat: venue.location.lat,
              lon: venue.location.lng,
              hereNowCount: venue.hereNow.count,
              verified: venue.verified
            },
            true // Auto uniquing via primaryKey which is `id`
          );
          console.log("did save");
        });
      });
    })
    .catch(error => {
      console.log(error);
    });
};
waltermvp commented 5 years ago

@bmunkholm i can share more if you let me know what you need, in my case im just feeding it an array of Fourswaure venue objects.

These are my deps let me know if it helps :)

  "dependencies": {
    "aws-amplify": "^1.1.18",
    "aws-amplify-react-native": "^2.1.6",
    "babel-preset-es2015": "^6.24.1",
    "buffer": "^5.0.7",
    "lodash": "^4.17.11",
    "mime-type": "^3.0.5",
    "native-base": "^2.10.0",
    "path": "^0.12.7",
    "prop-types": "^15.6.0",
    "react": "16.0.0-alpha.12",
    "react-devtools": "^2.5.0",
    "react-foursquare": "^1.0.3",
    "react-native": "0.47.1",
    "react-native-elements": "^0.15.0",
    "react-native-fetch-blob": "^0.10.8",
    "react-native-prompt": "^1.0.0",
    "react-native-uuid": "^1.4.9",
    "react-native-vector-icons": "^4.3.0",
    "react-navigation": "1.0.0-beta.12",
    "realm": "^2.21.1"
  },
bmunkholm commented 5 years ago

The fastest for us is if you can share a repo that shows a benchmark emulating the issue. One thing to consider is that opening/closing the realm is an expensive operation. Some will do that for each object, which surely is not recommended. We can't for instance see in the above what you are timing.

waltermvp commented 5 years ago

@bmunkholm i can share a demo momentarily. But given the code i shared:

maxammann commented 5 years ago

Hello, just wanted to let you know that I have the same problem. I'm inserting about 30 simple objects and it takes 1,7seconds. Just testing various databases currently. Here is the code though: https://github.com/Integreat/integreat-react-native-app/blob/2ddb9888c3d3f39d4030d7caf0b4ad263c64d26e/src/modules/endpoint/sagas/fetchCategories.js#L91

waltermvp commented 5 years ago

@maxammann beat me to it, thanks

maxammann commented 5 years ago

This is probably not a good example for a clean benchmark, just testing it with my project. Are you guys sure that this is a bug of realm/realm-js? Currently desperatly looking for a good database for react-native and the only other option would be to write plain-old SQL for SQLite. So it would be very cool if realm-js would have the same performance as any SQLite ORM :)

bmunkholm commented 5 years ago

Ok I think I know what you are running into. You are testing this is a debugger/browser? realm-js is terrible slow when debugged. But when you run it normally - it's fast.

const Realm = require('realm')

Realm.open({
    schema: [{
        name: 'Page',
        properties: {
            id: 'int'
        }
    }]
}).then(realm => {
    console.time('nothing')
    console.timeEnd('nothing')

    console.time('warmup')
    realm.write(() => {
        for (let i = 0; i < 10; i++) {
            realm.create('Page', {id: 5})
        }
    })
    console.timeEnd('warmup')

    for (let objects=0; objects<=1000; objects+=100) {
        let test = 'Objects ' + objects
        console.time(test)
        realm.write(() => {
            for (let i = 0; i < 100; i++) {
                realm.create('Page', {id: 5})
            }
        });
        console.timeEnd(test)
    }

    process.exit()
});

If I run this in node (on an old macbook) I get this:

nothing: 0.058ms
warmup: 0.946ms
Objects 0: 1.361ms
Objects 100: 0.810ms
Objects 200: 1.023ms
Objects 300: 1.201ms
Objects 400: 1.019ms
Objects 500: 0.964ms
Objects 600: 0.778ms
Objects 700: 0.693ms
Objects 800: 0.710ms
Objects 900: 0.686ms
Objects 1000: 0.816ms

so around 1 ms for 100 objects.

Hope this helps you :-)

bmunkholm commented 5 years ago

But @waltermvp you say it's on an iphone? How exactly do you measure this? Could you cut it completely down to a selfcontained mini app so it can be replicated?

waltermvp commented 5 years ago

@bmunkholm yes its an iPhone, will test with a non debug mode, does android experience the same in debug?

waltermvp commented 5 years ago

didn't create a project since @maxammann shared an example.

waltermvp commented 5 years ago

@bmunkholm MUCH faster using a test flight build (not debug) that was it thanks!

bmunkholm commented 5 years ago

@HughBerriez Did you also test in debug mode/setup somehow?

waltermvp commented 5 years ago

@HughBerriez yes in debug its very slow. Once I make a non debug build its MUCH faster

maxammann commented 5 years ago

Can confirm that on a release build on android the time drops from ~1700ms to <2ms. We definitely should find the source of this problem though. Developing with those delays will be very annoying.

My first guess would be that react-native chooses different js bundles of realm-js based on the compilation mode (relase, debug).

maxammann commented 5 years ago

Not sure whether the reason is that a debugger is attached or it is because of the debug compilation mode. Can not test this currently though: https://github.com/facebook/react-native/issues/22956

This issue if probably a duplicate of https://github.com/realm/realm-js/issues/491

bmunkholm commented 5 years ago

Yes if Chrome debugger is connected it is the same issue. I recommend not using that, but use the other tools mentioned. I'll close this then.

maxammann commented 5 years ago

fyi here is the detailed answer: https://github.com/realm/realm-js/issues/491#issuecomment-350718316

maxammann commented 5 years ago

@bmunkholm what are the other tools?

bmunkholm commented 5 years ago

Safari for one. It's referenced in the readme: https://github.com/realm/realm-js#issues-with-debugging

ferrannp commented 5 years ago

Even if this is fast, wouldn't it be better to have a bulk method? I understand every time we call react.create we are accessing the RN<->Native bridge? Wouldn't it be more optimal just to send everything to native at once?

Correct me if I am wrong (maybe you are optimizing this already).

bmunkholm commented 5 years ago

@ferrannp There would surely be ways to optimize. But without a "fixed" format, you would need to serialize/deserialize in some format - that's also usually expensive.

nirinchev commented 5 years ago

@fealebenpae do you think that exposing a bulk insert would offer meaningful performance difference in js? My impression was that the js-native transitions were super cheap (unlike in Java/.NET), so saving these calls would not make a huge difference, especially since native has to then fetch the object properties one by one anyway.

fealebenpae commented 5 years ago

We don't use the ReactNative bridge so we don't incur that cost, but the JavaScriptCore engine APIs we do use to integrate have a little overhead, especially if the data has a lot of strings that need to be copied over, and I believe that ReactNative Android is running a really old JavaScriptCore version where the C APIs are much slower than the modern version that comes with iOS.

I believe that for a dataset of a certain shape (many records and/or lots of string fields) there could likely be a performance win if there is a Realm.create overload that accepts a JSON string where the payload is a single object or array of objects that match the provided schema.

ferrannp commented 5 years ago

And maybe same for bulk delete? I am just saying but if you say you do not use the bridge as it is, maybe there are no performance wins. Regarding Android old JSC @fealebenpae, that will be updated in 0.59.0 so real soon.

fealebenpae commented 5 years ago

You can already call Realm.delete() on a Realm.Results instance that contains multiple objects, such as a query like realm.objects(...).filtered(...). In any case, deleting a single object is fast enough because there’s no data transfer from JavaScript to C++.