⚡️ 🍉 Performance, flickering, bisynchronicity - update&RFC

radex commented 4 years ago

An update on what I've been working on recently, and my plans for the upcoming weeks and months. This is a request for comments, too, so please feel free to comment with your thoughts.

Performance

First of all: I've been working on making 🍉 really, really fast. Lazy loading of data, which keeps app launch time fast, has been a selling point for WatermelonDB from day one. But some areas sucked. For example: adding massive amounts of data at once has not been very fast on iOS and Android.

I've made huge progress in 0.15, making sync time 5x faster on web and 23x faster on iOS + a lot of incremental improvements here and there. Android is not yet fast. More on that later.

Flickering

From the very beginning, 🍉 has had a fully asynchronous API, based on Promises and Observables.

Long story short, this is partly due to necessity: in 2017-2018 there was no good, easy, or sanctioned way to make synchronous Native Modules in React Native. (Being asynchronous has also been a big selling point for people coming from Realm RN, which is really annoying to use with the Chrome debugger because Realm is designed as fully synchronous and the remote debugger is not.) It's also partly due to a belief that, since databases are heavy, harnessing parallelism/multithreading (on React Native, and on the web using web workers) would make our app a lot faster. And there are a few more potentially powerful features that an async API enables (you could make a network-based database adapter! 😱).

Buuuuuut. There's just this thing: data fetched in React components that doesn't come back synchronously means that the component will always render twice: first blank, then with content. This leads to a lot of flickering. Bad, ugly glitches, leading to poor UX.

The idea was that React Suspense is just around the corner, and it will make asynchronous data fetching and rendering really simple and awesome, and in the meantime we can use prefetching to make sure we load all necessary data ahead of time so that it's already cached by the time it's needed.

A year or more has passed, and React Suspense is still around the corner — and while amazing, it's not a magic bullet (more on that later).

And prefetching has not worked super great for us, because it's a really fragile solution. And we've never fully documented how to do this, so I suspect most 🍉 users just deal with glitches and flicker.

Synchronicity to the rescue (or is it?)

The "simple" solution to flickering is to just avoid asynchronicity and make data come back to the component immediately.

Multithreading is great, but it's not a silver bullet. Without a great, reliable prefetching strategy, it may cause more problems than it solves. There are two reasons for this:

Without prefetching, you're ordering data only when it's needed - and by that point… well… it's needed now. So you're not really getting a lot of benefit from parallelism.
Our experience says that databases are fast, and React/React Native/DOM are slow. So you're adding a lot of overhead on main thread, while only moving the minority of work to a separate thread
Without a strategy to avoid flickering, you're causing A LOT more rendering passes by asynchronous operation, which are expensive.

And so we've been experimenting with using WatermelonDB synchronously to get rid of flickering and to improve performance.

As of v0.15, I recommend the useWebWorker: false and experimentalUseIncrementalIndexedDB: true options, i.e. new LokiJSAdapter({ ..., useWebWorker: false, experimentalUseIncrementalIndexedDB: true }). In theory this should be worse, because DB operations now block the main thread. But for our app, the result is MUCH better: there are no glitches, performance is better, and memory usage is much lower!

As of the v0.16.0-0 alpha version, you can use a synchronous SQLite adapter (iOS only) by adding the experimental { synchronous: true } option to the adapter constructor. This option may be removed in a future release.

What about Android? I'll explain later.

JSI Adapter

I've been working for a while now on rewriting the entire SQLiteAdapter for iOS and Android as a single C++ implementation based on React Native's JSI (JavaScript Interface). This is really challenging, and it took me many attempts to figure out how to do it, because JSI is not well documented and almost nobody outside the React Native Core Team has used it directly.

You can track my progress on this effort in this PR: https://github.com/Nozbe/WatermelonDB/pull/541/files (as of writing this, an iOS playground works; Android is not yet supported - but a proof of concept of that is here: https://github.com/Nozbe/WatermelonDB/pull/490).

Here are the goals of this project:

the adapter is going to be a lot faster than previous implementation (it skips a lot of overhead of React Native Bridge; it's even more low-level than TurboModules)
this should be 2-4x on iOS, and more on Android
synchronous operation (initially; then - by default)
single implementation for all React Native platforms
initially, when JSI is still in flux and people use remote Chrome Debugger (this will change when React Native Fabric comes), JSI will be opt-in, with current implementations as fallback.

I'm not currently planning to support synchronous SQLiteAdapter on Android before it's replaced with the JSI implementation.

The bisynchronous future

So opt-in synchronicity is an important goal for now because we want to avoid flickering, and it just seems easier and better for performance, for now.

But hold on. Asynchronicity is not going away! We don't want 🍉 to be just synchronous. Nope!

We want to keep the capability of making asynchronous database adapters. That allows network adapters, and adapters on platforms that don't have synchronous native module capabilities
React Suspense is coming. It's not a silver bullet, no, but it does allow you to build things that appear synchronously, even though the data source is asynchronous. This solves the flickering problem. What's better, it allows you to vary React and WatermelonDB behavior based on device speed. You want flicker-free experience on fast devices. But if the device is just too slow to render content in, say, 250ms, you do want progressive rendering for the perception of speed.
When we have that, we can start thinking about multithreading again. If we have a better solution for prefetching, we could realistically parallelize a non-trivial amount of database work, with performance benefits
Another thing that's coming is React Scheduler. We want to be able to hook into it -- get data immediately (synchronously) when it's needed now, but be able to fetch data with lower priority asynchronously — for UI-only data updates (for example, updating counters), or for pre-rendering off-screen content (list virtualization).

I'm calling it "bisynchronicity" (I just made this word up) — meaning, WatermelonDB must be able to support both synchronous and asynchronous operation.

Aaaaand back to present

So this is great for the future, but we need good UX now, hence the work on synchronous operations.

There's only one catch: as of writing this, they're not really synchronous, because the entire WatermelonDB API is based on Promises and async functions, and Promises, by design, cannot resolve synchronously. So even if there's no multithreading, IO, or other delay, the response is scheduled in the next microtask on the run loop.

This means that React components still render twice: first with empty content, and then again once the promise resolves. This is not perceivable by the user, because the microtask queue blocks browser/RN rendering (so the component renders properly before anything is painted on screen). But it has real overhead, since components go through the React machinery many times.
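To make the Promise behavior concrete, here is a minimal sketch in plain TypeScript with no WatermelonDB involved. `fakeCount` is a hypothetical stand-in for an adapter call whose value is already in memory; even so, the `.then` callback only runs after the current synchronous code has finished, which is what forces the second render.

```typescript
// Hypothetical stand-in for a DB call whose result is already available.
function fakeCount(): Promise<number> {
  return Promise.resolve(3)
}

const events: string[] = []

events.push('render #1 (no data yet)')
fakeCount().then(value => {
  // Runs in a microtask, after the synchronous code below has finished.
  events.push(`render #2 (count = ${value})`)
})
events.push('synchronous code finished')

// Resolves after the microtask above, so the final order can be inspected.
const settled: Promise<string[]> = Promise.resolve().then(() => events)
```

At the point where the synchronous code ends, `events` holds only the first render and the "finished" marker; the second render is appended later, from the microtask queue.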

I've developed a proof of concept today to measure this overhead. You can check it out here: https://github.com/Nozbe/WatermelonDB/pull/575/files . I've improved interaction time of switching between views in Nozbe Teams by 10% by ensuring find, fetch, count are ACTUALLY synchronous. This is a pretty huge difference.

So to support bisynchronicity, I'm thinking about how to go about refactoring internal APIs so that they can resolve synchronously.

Promise is always async, so it doesn't work
a fake BisyncPromise thenable implementation could allow synchronous resolution, but it's just begging to be used with async/await syntax, and it's not going to be transpiled correctly, so that doesn't work
Observables can emit both synchronously and asynchronously, but I don't like the idea of a 100% Rx-based API, because Observables are too broad and don't explain their intention (will this resolve synchronously and then complete, or will this emit a number of items asynchronously?); and besides - I'm planning to get rid of Rx internally (leaving only external APIs like .observe() and .observeCount()), because profiling is telling me that Rx has a non-trivial performance overhead
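A rough illustration of the thenable problem, under stated assumptions: `SyncThenable` below is a made-up class, not the BisyncPromise mentioned above, and the sketch shows native `await` semantics rather than the transpilation issue. Calling `.then()` directly can run the callback synchronously, but as soon as the same object is used with `await`, the continuation is deferred to a later microtask.

```typescript
// Hypothetical thenable that is already resolved and calls back immediately.
class SyncThenable<T> {
  private readonly value: T

  constructor(value: T) {
    this.value = value
  }

  then<R>(onResolved: (value: T) => R): SyncThenable<R> {
    // Unlike Promise.prototype.then, the callback runs right here.
    return new SyncThenable(onResolved(this.value))
  }
}

const log: string[] = []

// 1. Direct .then(): the callback runs before the next statement.
new SyncThenable(5).then(v => {
  log.push(`then: ${v}`)
})
log.push('after then')

// 2. await: the engine defers the continuation, even for this thenable.
async function viaAwait(): Promise<void> {
  const v = await new SyncThenable(5)
  log.push(`await: ${v}`)
}
const awaited: Promise<void> = viaAwait()
log.push('after await call')
```

So a thenable only stays synchronous as long as nobody writes `await` in front of it, which is hard to guarantee in a public API.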

And so I'm thinking of plain old callbacks, like this:

count(...args, callback: Result<number> => void): void

where:

type Result<T> = { value: T } | { error: Error }
// (Result is to be treated like a standard monad, with helper functions like `mapResult`, `mapError`, `flatMapResult`)

I don't like this at all, because callbacks are really delicate and easy to screw up. But for now, I don't have a better idea that would be very lightweight, simple, and allow methods to resolve both synchronously and asynchronously.
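For discussion, a minimal sketch of how such a callback API could look. Everything except the `Result` type and the name `mapResult` is an assumption made up for illustration (`countSync`, `countAsync`, the `rows` array); it is not the actual internal API.

```typescript
type Result<T> = { value: T } | { error: Error }
type ResultCallback<T> = (result: Result<T>) => void

// Hypothetical helper: transform the value, pass errors through untouched.
function mapResult<T, U>(result: Result<T>, fn: (value: T) => U): Result<U> {
  return 'value' in result ? { value: fn(result.value) } : result
}

// Hypothetical in-memory "table" standing in for the database.
const rows: string[] = ['a', 'b', 'c']

// Synchronous source: the callback fires before countSync returns.
function countSync(callback: ResultCallback<number>): void {
  callback({ value: rows.length })
}

// Asynchronous source: same signature, but the callback fires later.
function countAsync(callback: ResultCallback<number>): void {
  Promise.resolve().then(() => callback({ value: rows.length }))
}

let seen = 'nothing yet'
countSync(result => {
  const doubled = mapResult(result, n => n * 2)
  seen = 'value' in doubled ? `doubled count: ${doubled.value}` : doubled.error.message
})
// `seen` is already updated at this point.

let seenAsync = 'nothing yet'
countAsync(result => {
  seenAsync = 'value' in result ? `count: ${result.value}` : result.error.message
})
// `seenAsync` is still 'nothing yet' here and updates in a later microtask.
```

The caller cannot tell from the signature which of the two behaviors it gets, which is exactly the property wanted here, and also the reason callbacks are easy to misuse.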

WDYT?

diegolmello commented 4 years ago

That's great news! Can we expect this to be released in Q1/2020?

Based on the pre-requisites, callback is a good approach. Thanks.

radex commented 4 years ago

Can we expect this to be released in Q1/2020?

Which part?

Synchronous adapters - already out, except for Android
Really synchronous adapter - yep, working on this now
more performance improvements - definitely
JSI adapter - probably
next-gen async features (suspense integration, scheduling, multithreading) - no

diegolmello commented 4 years ago

@radex hahaha I'm sorry. I was talking about the whole sync part, but it's nice to see there are even more exciting improvements for the short term.

OtacilioN commented 4 years ago

Awesome news ❤️ I think this is a big step to WatermelonDB 🚀

kilbot commented 4 years ago

Hi @radex, I like that you are pushing the library forward, and you have the experience of using WatermelonDB in production so I trust any direction you choose ...

but ... 😁

Don't you think this is a major change to implement when Suspense is so close? It seems that you could get rid of the flickering problem right now using the experimental release of React.

I guess it just seems like a step backwards rather than working on things like Suspense integration and multithreading now, and have WatermelonDB ready on day one when Suspense finally does land as a stable release.

radex commented 4 years ago

@kilbot Perhaps you're right and I should focus on that first. But the two things are not in conflict. Suspense works best if your data is prefetched. Otherwise you still run into the problem and inefficiency of going through two renders (the first errored out with no data, the second good), just without the intermediate state being visible to the user. And there are other advantages to being able to run things synchronously (some of them listed in the post above).

Right now, my main focus is performance. But if I can also get rid of flickering months in advance of Suspense being production-ready, while preparing the framework to take the best possible advantage of it — great!

kilbot commented 4 years ago

I should admit that I have a couple of biases as well:

Database queries (and filtering, sorting etc) 'feel' like they are expensive, so it 'feels' right that they are non-blocking. But real world performance doesn't care about my feelings 😛
I didn't know anything about RxJS until I started using WatermelonDB ... but now I've started to like the operators and I've started incorporating it into other parts of my project, eg: for ajax requests.

Aside: If you have time I would be interested to hear a little more about your experience with RxJS and what you are using for async side effects like calls to the server.

Having said that, if I came to WatermelonDB fresh, without these biases, then I probably would have found a synchronous callback API much easier to pick up and use, so 🤷‍♂ ... perhaps bisynchronicity is the best of both worlds so long as it doesn't make maintenance of the library super confusing.

radex commented 4 years ago

I didn't know anything about RxJS until I started using WatermelonDB ... but now I've started to like the operators and I've started incorporating it into other parts of my project, eg: for ajax requests.

You shouldn't worry about that. The external API of Watermelon won't change and will still be Rx. This is about allowing Rx observers to get the initial value from DB synchronously, not just asynchronously
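As a rough picture of what "getting the initial value synchronously" means for an observer, here is a hand-rolled sketch. `TinyObservable` is a hypothetical, heavily simplified stand-in and not RxJS or WatermelonDB's implementation; the point is only that `subscribe` delivers the current value before it returns.

```typescript
type Listener<T> = (value: T) => void

// Hypothetical, heavily simplified stand-in for an observable query.
class TinyObservable<T> {
  private current: T
  private listeners: Listener<T>[] = []

  constructor(initial: T) {
    this.current = initial
  }

  // Emits the current value synchronously, then forwards later changes.
  subscribe(listener: Listener<T>): () => void {
    this.listeners.push(listener)
    listener(this.current)
    return () => {
      this.listeners = this.listeners.filter(l => l !== listener)
    }
  }

  next(value: T): void {
    this.current = value
    this.listeners.forEach(l => l(value))
  }
}

const taskCount = new TinyObservable(2)
const received: number[] = []

const unsubscribe = taskCount.subscribe(n => received.push(n))
// The initial value has already arrived here, so a component could
// render with data on its first pass.

taskCount.next(3) // delivered to the subscriber
unsubscribe()
taskCount.next(4) // ignored: nobody is subscribed any more
```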

barbalex commented 3 years ago

Asynchronicity is a problem when using WatermelonDB with hooks. Example:

import React from 'react'
import { useDatabase } from '@nozbe/watermelondb/hooks'
import { useObservableState } from 'observable-hooks'

import Herkunft from './Herkunft'
import ErrorBoundary from '../../shared/ErrorBoundary'

const HerkunftDataProvider = ({ id }) => {
  const db = useDatabase()
  const herkunft = useObservableState(
    db.collections.get('herkunft').findAndObserve(id),
    null,
  )

  // TODO:
  // findAndObserve can throw an error
  // if the url points to a dataset whose data was not yet loaded
  // can't await or catch the error above because it is inside a hook
  // need to catch it with ErrorBoundary
  return (
    <ErrorBoundary>
      <Herkunft id={id} row={herkunft} />
    </ErrorBoundary>
  )
}

export default HerkunftDataProvider

The trouble is: inside the useObservableState hook I can neither await the result of findAndObserve nor catch the error emitted when no dataset is found.

I'm patching this with an error boundary that returns null right now, but that seems like a pretty bad hack.

likern commented 3 years ago

@radex 😄 A synchronous API is very important. I also experienced flickering. I'm using TypeORM, which provides its own Promise-based API.

But since it's not observable, I could work around it. I handle state and reactivity with Recoil, which makes it easy to subscribe to state updates in a granular fashion.

Because it's an embedded database and I fully control it, I use "optimistic updates". First I update the state, and right after that (in a promise scheduled to execute later) I call TypeORM. I do not call then or await on the promise result.

I did that for exactly the same reason: flickering. Even the smallest possible request is enough to see it. Very bad user experience. My approach works very well, but a lot of code has to be written manually.

I think this can be done in 🍉. Instead of treating Watermelon 🍉 as just a database, add React state functionality, like Redux.

If I insert a value, for example, save it in memory and return it immediately to all subscribed components. Then do all the required heavy lifting to actually save the data in the database later.

This makes requests asynchronous internally, but immediate to the user. For unknown data or first-time fetching (where this trick will not work), use Suspense. I tried it and now it works great with Recoil.
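A minimal sketch of the optimistic-update idea described above, in plain TypeScript. All names here (`OptimisticList`, `slowWrite`, `todos`) are hypothetical, and the rollback on failure is an addition for illustration that is not part of the approach as described: subscribers are notified synchronously from memory, and the write to the database happens afterwards.

```typescript
type Subscriber<T> = (items: readonly T[]) => void

class OptimisticList<T> {
  private items: T[] = []
  private subscribers: Subscriber<T>[] = []
  private readonly persist: (item: T) => Promise<void>

  constructor(persist: (item: T) => Promise<void>) {
    this.persist = persist
  }

  subscribe(subscriber: Subscriber<T>): void {
    this.subscribers.push(subscriber)
    subscriber(this.items) // current state, delivered synchronously
  }

  // Update memory and notify first; write to the database afterwards.
  insert(item: T): Promise<void> {
    this.items = [...this.items, item]
    this.subscribers.forEach(s => s(this.items))
    return this.persist(item).catch(() => {
      // Assumed rollback policy: drop the item again if the write failed.
      this.items = this.items.filter(i => i !== item)
      this.subscribers.forEach(s => s(this.items))
    })
  }
}

// Hypothetical slow database write that completes in a later microtask.
const saved: string[] = []
const slowWrite = (item: string): Promise<void> =>
  Promise.resolve().then(() => {
    saved.push(item)
  })

const todos = new OptimisticList<string>(slowWrite)
const renders: string[] = []
todos.subscribe(items => renders.push(items.join(',')))

const pending: Promise<void> = todos.insert('buy melon')
// Subscribers have already seen the new item; the write is still pending.
```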

radex commented 3 years ago

@likern Hey, just use the synchronous option on native and useWebWorker: false on web to enable synchronous operation and avoid any flickering. No extra layer of state management required.

barbalex commented 3 years ago

@radex

to enable synchronous operation

What exactly does synchronous mean?

I suppose it does not mean that instead of:

const herkunfts = await db.collections.get('herkunft').fetch()

I could use:

const herkunfts = db.collections.get('herkunft').fetch()

?

Because that would be such a pleasure.

radex commented 3 years ago

What exactly does synchronous mean?

It means that Query observation resolves synchronously, so as long as you build your UI on top of Query observation (with withObservables or using .experimental* methods), all will be rendered in one microtask

I suppose it does not mean that instead of:

alas no.

henrymoulton commented 3 years ago

Hey @radex I really liked this write up, some really interesting stuff here! Your comments about moving from Promises to a callback API reminded me of a write up on optimising AsyncStorage: https://medium.com/@Sendbird/extreme-optimization-of-asyncstorage-in-react-native-b2a1e0107b34

The Promise pattern is another main cause of performance drawbacks to using AsyncStorage. According to our experimental control, we found that using Promise is costly compared to not using it. Our experiment shows that Promise leads to slower processing times even when the process doesn’t involve I/O operations. After purging Promise from the implementation and, instead, using callback, we achieved a 10–12x performance boost overall.

I'm interested in learning a bit more about React Native profiling and thought perhaps with the complex path that you've mentioned:

the whole path, from JS, through V8/Hermes, JSI, our C++ adapter, to SQLite

is there anything I can do to jump in, learn and perhaps help with this?

One idea I had would be to start with maybe adding or updating examples?

henrymoulton commented 3 years ago

I also recently came across two projects that use JSI for data persistence:

https://github.com/mrousavy/react-native-mmkv https://github.com/greentriangle/react-native-leveldb

henrymoulton commented 3 years ago

Kicked off an update to the native example https://github.com/henrymoulton/WatermelonDB/tree/fix/new-example/examples/native63

henrymoulton commented 3 years ago

Hey @radex I noticed https://github.com/mrousavy/react-native-multithreading by @mrousavy has reached 1.0 and thought that it might alleviate some of the complexity in enabling multithreading for WatermelonDB.

Curious to know if you think it might help.

I did read your thoughts here on it not being a silver bullet though!

Without prefetching, you're ordering data only when it's needed - and by that point… well… it's needed now. So you're not really getting a lot of benefit from parallelism. Our experience says that databases are fast, and React/React Native/DOM are slow. So you're adding a lot of overhead on main thread, while only moving the minority of work to a separate thread. Without a strategy to avoid flickering, you're causing A LOT more rendering passes by asynchronous operation, which are expensive.

henrymoulton commented 3 years ago

I think there's also some discussion about the state of Prefetching - is it worth adding some Docs? https://gist.github.com/radex/9759dc1ea23a25628b80ed06f466264f is 3 years old now, perhaps I can look into Prefetching https://github.com/Nozbe/withObservables/issues/10 ?

radex commented 3 years ago

@henrymoulton rn-multithreading is very cool but I'm not sure if this is the right (or necessary, or sufficient) tool for 🍉. I currently plan to look into multithreading (in JSI adapter only) in the coming weeks/months - but I don't want to promise anything.

For all my use cases, it's only really an optimization, nothing ground breaking - all my profiles show that RN & JS is the bottleneck, not 🍉. But if you have an app where 🍉 really is a bottleneck, please send profiles from chrome/hermes/safari profiler

henrymoulton commented 3 years ago

Thanks that makes a lot of sense!

likern commented 3 years ago

@henrymoulton rn-multithreading is very cool but I'm not sure if this is the right (or necessary, or sufficient) tool for 🍉. I currently plan to look into multithreading (in JSI adapter only) in the coming weeks/months - but I don't want to promise anything.

For all my use cases, it's only really an optimization, nothing ground breaking - all my profiles show that RN & JS is the bottleneck, not 🍉. But if you have an app where 🍉 really is a bottleneck, please send profiles from chrome/hermes/safari profiler

@radex Hello! Do you have plans to separate out JSI bindings part of WatermelonDB? It would be awesome for me to be able to use work which is already done, instead of reinventing wheels.

I already use pure SQLite and would like to utilize your work: native bindings to SQLite through JSI. Am I correct that the JSI bindings are something similar to https://github.com/ospfranco/react-native-quick-sqlite?

radex commented 3 years ago

@likern I have no such plans, but the native-JS interface is relatively stable. So you can add WatermelonDB to your project, but not import it in JS - only interface with JSI yourself

stale[bot] commented 2 years ago

Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

Yarkhan commented 2 years ago

Suspense has been released https://reactnative.dev/blog/2022/06/21/version-069

PEZO19 commented 1 year ago

@radex regarding https://github.com/Nozbe/WatermelonDB/issues/576#issuecomment-746358899

all will be rendered in one microtask

Maybe obvious for others, but just want to make sure I get it correctly: does that also mean that:

"all will be RErendered in one microtask?"

Eg. when multiple Query "subscriptions" (of a screen) depend on the same set of tables(/collections?), so when these tables are updated, the Query subscriptions should emit "at the same time" (in same microtask?) to have consistent data on the React layer.
