Closed carlsverre closed 9 months ago
First, SQLSync can trivially track which pages are changed by each mutation by coordinating with storage and the underlying storage journal. This gives us a "changed page set".
For now, we just want to know when a table has changed - not the precise pages that changed in it. So, we need to map the "changed page set" to a "changed root page set".
This can be achieved (in a sub-optimal way) using the dbstat table:
sqlite> select * from dbstat;
name path pageno pagetype ncell payload unused mx_payload pgoffset pgsize
------------- ----- ------ -------- ----- ------- ------ ---------- -------- ------
sqlite_schema / 1 leaf 4 192 3780 55 0 4096
foo / 2 internal 52 0 3644 0 4096 4096
foo /000/ 4 leaf 602 1204 1 2 12288 4096
foo /001/ 5 leaf 584 1168 0 2 16384 4096
foo /002/ 6 leaf 584 1168 0 2 20480 4096
foo /003/ 7 leaf 584 1168 0 2 24576 4096
foo /004/ 8 leaf 584 1168 0 2 28672 4096
foo /005/ 9 leaf 584 1168 0 2 32768 4096
foo /006/ 10 leaf 584 1168 0 2 36864 4096
foo /007/ 11 leaf 584 1168 0 2 40960 4096
foo /008/ 12 leaf 584 1168 0 2 45056 4096
...
But a better solution is to enable incremental vacuum which will cause SQLite to maintain ptrmap pages. Storage only needs to read these pages to fully map every changed page back to its root. This is much more optimal than dbstat which scans every page (although it does support name level filter pushdown).
Now that we know which table roots have changed we simply need to track which roots a query depends on. We can use explain to do this:
sqlite> explain select * from foo,foo2,bar where foo.id > 1;
addr opcode p1 p2 p3 p4 p5 comment
---- ----------- -- -- -- ------ -- ---------------------
0 Init 0 16 0 0 Start at 16
1 OpenRead 4 57 0 k(2,,) 0 root=57 iDb=0; foo_id
2 OpenRead 3 2 0 1 0 root=2 iDb=0; foo
3 OpenRead 2 3 0 1 0 root=3 iDb=0; bar
4 Integer 1 1 0 0 r[1]=1
5 SeekGT 4 15 1 1 0 key=r[1]
6 Rewind 3 15 0 0
7 Rewind 2 15 0 0
8 Column 4 0 2 0 r[2]=foo.id
9 Column 3 0 3 0 r[3]=foo.id
10 Column 2 0 4 0 r[4]=bar.id
11 ResultRow 2 3 0 0 output=r[2..4]
12 Next 2 8 0 1
13 Next 3 7 0 1
14 Next 4 6 0 0
15 Halt 0 0 0 0
16 Transaction 0 0 4 0 1 usesStmtJournal=0
17 Goto 0 1 0 0
Notice that the three OpenRead operations give us all the information we need to know that this query depends on changes to the btrees rooted at page 57, 2, and 3. (we can ignore table names).
Need to plumb query subscriptions into SQLSync core rather than hacking it at the top level using events.