tantaman opened this issue 1 year ago
A simple first pass is to use lookup tables for large values rather than change the clock
table structure.
- `pk_lookup`: pks... | integer
- `siteid_lookup`: site_id | integer
These lookup tables are local to each database. Real values are returned to send over the wire. We don't need a lookup for column index since both databases should have the same schema and same column indices. We will, however, map column index to column name via the table_info
structures when doing insertions as there are various caveats around computed columns that require this step.
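A minimal sketch of what the site id side of this could look like (the table, column, and function names here are hypothetical and better-sqlite3 is used purely for illustration; this is not the actual cr-sqlite schema or API). `pk_lookup` would work the same way:

```ts
// Illustrative sketch only -- not the actual cr-sqlite schema or API.
import Database from "better-sqlite3";

const db = new Database(":memory:");
db.exec(`
  CREATE TABLE IF NOT EXISTS siteid_lookup (
    id INTEGER PRIMARY KEY,      -- small (1-4 byte) local reference
    site_id BLOB NOT NULL UNIQUE -- original 16 byte site id
  );
`);

// Intern a 16 byte site_id locally, returning the small id to store in
// clock rows. The insert is a no-op if the site_id is already interned.
function internSiteId(siteId: Buffer): number {
  db.prepare(
    "INSERT INTO siteid_lookup (site_id) VALUES (?) ON CONFLICT DO NOTHING"
  ).run(siteId);
  const row = db
    .prepare("SELECT id FROM siteid_lookup WHERE site_id = ?")
    .get(siteId) as { id: number };
  return row.id;
}

// Resolve a local reference back to the real site_id before sending a
// change over the wire.
function resolveSiteId(id: number): Buffer {
  const row = db
    .prepare("SELECT site_id FROM siteid_lookup WHERE id = ?")
    .get(id) as { site_id: Buffer };
  return row.site_id;
}
```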
Size reduction:
- 16 byte primary key -> 1-4 bytes
- 16 byte site id -> 1-4 bytes
- 5-10 byte column name -> 1 byte

42 bytes worst case -> 9 bytes worst case (4 + 4 + 1)
Other structures could get us further compression, but we may lose too much query power.
brainstorm alt structure:
pks | packed_col_versions | packed_db_versions | packed_site_id_refs | packed_seqs
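Roughly, one row of that brainstormed shape might look like the following sketch (field names taken from the line above; how the per-column values get packed into each blob is left undecided):

```ts
// Sketch of the brainstormed "one clock row per table row" shape.
// Each packed_* blob would hold one entry per column of the row.
interface PackedClockRow {
  pks: Uint8Array;                 // primary key value(s) identifying the row
  packed_col_versions: Uint8Array; // per-column col_version values, packed
  packed_db_versions: Uint8Array;  // per-column db_version values, packed
  packed_site_id_refs: Uint8Array; // per-column refs into siteid_lookup
  packed_seqs: Uint8Array;         // per-column seq values, packed
}
```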
Doing re-insertion made me think a bit more about all the metadata being tracked. It looks rather excessive and I think we can cut it down a bunch by using lookup tables.
@jeromegn, @Azarattum - curious if either of you have thoughts on:
These changes should be internal only and transparent to the user except where column names
are replaced by 1 byte column indices.
If someone has more than 256 columns in a table... well they'll be out of luck.
I was thinking of ways to compress deltas when sending them over the wire. The primary optimisation I came up with (apart from lookup tables) is to skip repeated values across consecutive changes. For example, when we change a bunch of stuff in a single table, we only mention the table name once and all the later changes assume the previous name. This works great for transmitting changes; I'm not sure how fast/reliable this approach would be for storing them.
This:

| col | table | pk | val |
|---|---|---|---|
| likes | users | 1 | 42 |
| NULL | NULL | 2 | 42 |
| NULL | NULL | 3 | 42 |
| perms | roles | 1 | 1337 |
| rights | NULL | NULL | 1337 |
is equivalent to this:

| col | table | pk | val |
|---|---|---|---|
| likes | users | 1 | 42 |
| likes | users | 2 | 42 |
| likes | users | 3 | 42 |
| perms | roles | 1 | 1337 |
| rights | roles | 1 | 1337 |
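A rough sketch of that order-dependent encoding (the types and function names here are hypothetical, not the actual wire format): on encode, any field equal to the previous change is nulled out; on decode it is restored from the running previous change.

```ts
// Sketch only -- not the actual cr-sqlite wire format. Fields equal to the
// previous change are nulled out on encode and restored on decode.
interface Change {
  table: string;
  col: string;
  pk: number;
  val: number;
}

// null means "same as the previous change"
type PackedChange = {
  table: string | null;
  col: string | null;
  pk: number | null;
  val: number;
};

function encode(changes: Change[]): PackedChange[] {
  const out: PackedChange[] = [];
  let prev: Change | null = null;
  for (const c of changes) {
    out.push({
      table: prev && c.table === prev.table ? null : c.table,
      col: prev && c.col === prev.col ? null : c.col,
      pk: prev && c.pk === prev.pk ? null : c.pk,
      val: c.val,
    });
    prev = c;
  }
  return out;
}

function decode(packed: PackedChange[]): Change[] {
  const out: Change[] = [];
  let prev: Change | undefined;
  for (const p of packed) {
    // The first packed change is always fully populated, so prev! is only
    // dereferenced when a previous change actually exists.
    const full: Change = {
      table: p.table ?? prev!.table,
      col: p.col ?? prev!.col,
      pk: p.pk ?? prev!.pk,
      val: p.val,
    };
    out.push(full);
    prev = full;
  }
  return out;
}
```

(One caveat of this exact sketch: it can't tell "same as previous" apart from a value that is genuinely NULL, so a real encoding would need a distinct marker or a presence bitmap.)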
The major downside is that we rely on order. It is fine within a single sync packet, but might not be great for long-term storage. What do you think?
> If someone has more than 256 columns in a table... well they'll be out of luck.
256 columns per table is fine, imo
Skipping over cells with the same value is a good idea. Easy in the network code, as you point out. For the persisted data... I'd have to figure out how much this hurts performance, as queries for changes would require scanning back several rows to find a value.
We can actually encode column names as varints without creating a lookup table by leveraging the internal `TableInfo` data structure of cr-sqlite:
https://github.com/vlcn-io/cr-sqlite/blob/main/core/src/tableinfo.h#L14-L35
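For illustration only (not cr-sqlite's actual encoding): the column's position in `TableInfo` is what would get varint-encoded, e.g. as an unsigned LEB128-style varint so that small indices cost a single byte.

```ts
// Illustrative only: encode a column index (its position in TableInfo)
// as an unsigned LEB128-style varint. Indices < 128 take a single byte.
function encodeVarint(n: number): Uint8Array {
  const bytes: number[] = [];
  do {
    let b = n & 0x7f;
    n >>>= 7;
    if (n !== 0) b |= 0x80; // high bit set = more bytes follow
    bytes.push(b);
  } while (n !== 0);
  return Uint8Array.from(bytes);
}

// Decodes the first varint found at the start of the buffer.
function decodeVarint(buf: Uint8Array): number {
  let result = 0;
  let shift = 0;
  for (const b of buf) {
    result |= (b & 0x7f) << shift;
    if ((b & 0x80) === 0) break;
    shift += 7;
  }
  return result;
}

// e.g. if a table's columns were ["id", "likes", "perms"], "likes" would go
// over the wire as the single byte 0x01 instead of a 5 byte column name.
```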
This poses some issues during schema migrations, however, since the indices in `TableInfo` will not be stable when users add columns to the middle of their table definitions.
To deal with this we will need to:
I like this tradeoff since:
We can vastly compress the table format (5x in most cases) by encoding all metadata for a row in a single clock row.
Encoding in a single row, however, loses quite a bit of expressiveness when selecting changes. An alternate route that still cuts quite a bit of fat is to move large values into lookup tables. See comment below.