vlcn-io / js

Components to build apps in JavaScript atop cr-sqlite
MIT License

Perf Discussion -- [original title: Support `PRAGMA journal_mode=WAL;`?] #27

Open AlexErrant opened 9 months ago

AlexErrant commented 9 months ago

Using the latest https://github.com/vlcn-io/live-examples as my playground, I added

  console.log(await db.execO("PRAGMA journal_mode=memory;"));

which correctly yields [ { "journal_mode": "memory" } ]

However,

  console.log(await db.execO("PRAGMA journal_mode=WAL;"));

yields [ { "journal_mode": "delete" } ]

Notably, PRAGMA journal_mode=WAL; works on https://rhashimoto.github.io/wa-sqlite/demo/

If it matters, I'm just trying WAL mode because it apparently may improve perf.

rhashimoto commented 9 months ago

Notably, PRAGMA journal_mode=WAL; works on https://rhashimoto.github.io/wa-sqlite/demo/

It works in the wa-sqlite demo if you use the default in-memory VFS, but not with any other VFS. WAL requires a shared memory implementation in the VFS to work, which no wa-sqlite example VFS has.

It's possible to implement the shared memory calls (I think) but (1) I don't know if sharing via SharedArrayBuffer or message passing will be particularly performant, and (2) no browser persistence API currently allows simultaneous reads and writes that maximize WAL benefits (though there's a proposal to change this for OPFS).

See also this SQLite forum thread.

AlexErrant commented 9 months ago

Ulgh I just kinda ignored Unix / Standard because well... why would I be on anything different... and of course that would indicate the root problem.

Thanks for the info dump! I look forward to OPFS becoming tenable for wa-sqlite someday.

rhashimoto commented 9 months ago

I look forward to OPFS becoming tenable for wa-sqlite someday.

wa-sqlite does have two example OPFS VFSes, OriginPrivateFileSystemVFS and AccessHandlePoolVFS. Both appear on the demo and benchmarks pages. If you're after performance, AccessHandlePoolVFS is the fastest wa-sqlite VFS on those benchmarks, except for write transaction overhead.

tantaman commented 9 months ago

@AlexErrant -

I've been brainstorming with the ElectricSQL folks (here: https://discord.com/channels/933657521581858818/1157258041318703114) on how to improve perf for reactive/live queries.

One idea is that live queries can be optimistically updated synchronously, without round-tripping to SQLite. Obviously there are tons of gotchas with that approach.

Another idea is to compile an extension into SQLite which knows of all currently active live queries. On write, we'll update all live queries within the WASM module itself without having to make round-trips from WASM to JS for each live query.

The second idea is the more practical of the two: it doesn't come with any gotchas and will likely get us most of what we need.


Outside of those, I'm working on a new project to address this stuff from the ground up.

tantaman commented 9 months ago

@AlexErrant - practically speaking though, what sort of perf issues are you hitting? Maybe there's some simple things we can do now.

AlexErrant commented 9 months ago

Hm, I don't really follow the first part of your comment; I'm not currently using reactive/live queries (though perhaps in the future! I'm using SolidJS, after all.)

In my app there's an MS Excel-style spreadsheet that may have 10k+ rows that a user may search/sort on; so we virtualize it and query for data in blocks of 100 rows. Being Excel-style, you can arbitrarily scroll any distance down the spreadsheet. OFFSET has terrible perf when skipping 15k+ rows, and users might want to jump to the bottom of the spreadsheet. I handle this in two steps:

  1. Return the first 100 results of a query ASAP.
  2. In a nonblocking manner, add all the relevant ids for that query to a temp table.

When a user tires of the first 100 results and starts randomly scrolling, I can use cursor pagination with the temp table's rowid (WHERE rowid > 15000 ORDER BY rowid LIMIT 100), taking advantage of the fact that the spreadsheet's row number is now the same as the temp table's rowid. Building this cache is relatively expensive; it takes ~10 seconds on my devbox with ~15k rows. However, jumping to the bottom of that result set takes ~200ms, so I'm happy with it.
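Roughly sketched, the scheme above looks like this (table and column names are hypothetical, and `db.execO` is assumed from the live-examples API):

```javascript
// Sketch of the two-step cache described above; names are made up.

// Step 2: snapshot the full result set's ids into a temp table.
// Its implicit rowid then matches the spreadsheet's row number.
const buildCacheSql = (where, orderBy) =>
  `CREATE TEMP TABLE searchCache AS
   SELECT id FROM notes WHERE ${where} ORDER BY ${orderBy};`;

// Random access: fetch 100 rows starting after spreadsheet row `start`,
// paging on the temp table's rowid instead of OFFSET.
const pageSql = (start) =>
  `SELECT n.* FROM searchCache c
   JOIN notes n ON n.id = c.id
   WHERE c.rowid > ${start}
   ORDER BY c.rowid LIMIT 100;`;

// e.g. await db.execO(pageSql(15000));  // jump near the bottom
```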

The problem occurs when a user sees something interesting in the first 100 results and clicks it, generating a read query to get details. The cache write is blocking the detail query. I wondered if WAL mode would be an easy way of getting concurrent read/writes, hence this issue.

This isn't blocking me, so please don't feel obliged to make this issue any kind of priority; there's a reason I closed it :) A workaround is to build the cache incrementally... which I'll happily procrastinate on until I hate myself enough to take action (or the W3C TAG solves my problem, which I give a 0.000001% chance of happening).
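The incremental workaround could be as simple as batching the cache inserts and yielding to the event loop between batches; `exec` here stands in for whatever async statement runner is available (a sketch, not tested against cr-sqlite):

```javascript
// Hypothetical incremental cache build: write ids in small batches,
// yielding between batches so a detail query can run in the gaps.
async function buildCacheIncrementally(exec, ids, batchSize = 500) {
  for (let i = 0; i < ids.length; i += batchSize) {
    const batch = ids.slice(i, i + batchSize);
    const placeholders = batch.map(() => "(?)").join(", ");
    await exec(`INSERT INTO searchCache (id) VALUES ${placeholders}`, batch);
    // Yield so queued reads get a turn before the next batch.
    await new Promise((resolve) => setTimeout(resolve, 0));
  }
}
```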

rhashimoto commented 9 months ago

The problem occurs when a user sees something interesting in the first 100 results and clicks it, generating a read query to get details. The cache write is blocking the detail query. I wondered if WAL mode would be an easy way of getting concurrent read/writes, hence this issue.

I don't think that WAL mode would be a big win here. IIUC both your cache query and your user-initiated follow-up query only read from the main database, so this usage only benefits from concurrent reads, not concurrent reads and writes. Writing to a temp table doesn't change this.

You can get concurrent reads with wa-sqlite, but two things need to be true, both of which I'm guessing are currently not true:

  1. You must execute each read transaction from a separate WebAssembly instance, i.e. wa-sqlite must be initialized multiple times, ideally using separate JavaScript contexts. This is because WASM SQLite doesn't allow multiple calls to be in flight simultaneously in the same WASM instance.
  2. You must be using an IndexedDB VFS that supports shared read locks. I believe that vlcn currently uses the stock IDBBatchAtomicVFS class, which uses only exclusive locks.

Your plan to build a cache with multiple smaller queries that you can preempt sounds like a reasonable alternative approach.

Of course, the catch-all tactic for improving SQLite performance is to increase the page cache size, if you haven't done that already.
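For reference, that looks like a one-line PRAGMA per connection (the 16 MiB figure below is an arbitrary example, not a recommendation):

```sql
-- Negative cache_size is in KiB; positive values are page counts.
-- ~16 MiB page cache for this connection:
PRAGMA cache_size = -16384;
```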

@tantaman Does cr-sqlite provide a way to stream query result rows? That seems like a generally useful feature in a higher level API.

tantaman commented 9 months ago

Does cr-sqlite provide a way to stream query result rows? That seems like a generally useful feature in a higher level API.

@rhashimoto - I currently don't expose step but that would be a good thing to expose.

OFFSET has terrible perf when skipping 15k+ rows

@AlexErrant - Yeah, OFFSET always ends up doing a scan the size of the offset. A better alternative is to use a cursor composed of the column being ordered by plus the primary key, which is what you're doing with your cache. The cache is a clever workaround for scrolling through arbitrary query results.
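Spelled out, such a cursor pages on the (ordered column, primary key) pair of the last row seen (names hypothetical; the row-value comparison requires SQLite 3.15+):

```sql
-- Keyset pagination: resume after the previous page's last row
-- instead of scanning past it with OFFSET.
SELECT id, sortCol
FROM notes
WHERE (sortCol, id) > (:lastSortCol, :lastId)
ORDER BY sortCol, id
LIMIT 100;
```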

it takes ~10 seconds on my devbox with ~15k rows

That seems odd. I'd expect subsecond time to populate an in-memory temp table, even for 15k rows. Have you tried it in the native SQLite CLI to see what kind of perf you get?

E.g.,

  -- .timer on
  create table foo (a primary key, b, c);
  insert into foo with recursive cte(a,b,c) AS (
    SELECT random(), random(), random()
    UNION ALL
    SELECT random(), random(), random() FROM cte
    LIMIT 15000
  ) SELECT a,b,c FROM cte;
  -- Run Time: real 0.025 user 0.024497 sys 0.000232
  create temp table bar as select * from foo;
  -- Run Time: real 0.006 user 0.005454 sys 0.000305

rhashimoto commented 9 months ago

I currently don't expose step but that would be a good thing to expose.

Exposing step is sort of a lower level approach. I was thinking that supplying a ReadableStream source, i.e. the object you pass to a ReadableStream constructor, for output rows might be a nice way to go.

The only API you would need to define is how to create the source - after that everything is specified by the standard Streams API. So there's not much external design and documentation to do, and then you get Streams API features for free, like buffering, back pressure, and async iteration.
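As a sketch of what such a source might look like, here is one wrapping a hypothetical `stmt.step()`/`stmt.row()` statement handle (those names are assumptions, not cr-sqlite's actual API):

```javascript
// A ReadableStream source over query rows. pull() is only invoked when
// the consumer asks for more, which provides back pressure for free.
function rowStream(stmt) {
  return new ReadableStream({
    async pull(controller) {
      if (await stmt.step()) {
        controller.enqueue(stmt.row());
      } else {
        controller.close();
      }
    },
    cancel() {
      // Consumer bailed early; release the statement if possible.
      stmt.finalize?.();
    },
  });
}
```

Consumers would then get buffering and async iteration from the standard Streams API, e.g. `for await (const row of rowStream(stmt)) { ... }` in environments where ReadableStream is async-iterable.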