tantaman opened this issue 1 year ago
A simple first pass is to use lookup tables for large values rather than change the clock
table structure.
- `pk_lookup`: pks... | integer
- `siteid_lookup`: site_id | integer
These lookup tables are local to each database. Real values are returned to send over the wire. We don't need a lookup for column index since both databases should have the same schema and same column indices. We will, however, map column index to column name via the table_info
structures when doing insertions as there are various caveats around computed columns that require this step.
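A minimal sketch of what the site id side of this could look like (the table, column, and function names here are hypothetical and better-sqlite3 is used purely for illustration; this is not the actual cr-sqlite schema or API). `pk_lookup` would work the same way:

```ts
// Illustrative sketch only -- not the actual cr-sqlite schema or API.
import Database from "better-sqlite3";

const db = new Database(":memory:");
db.exec(`
  CREATE TABLE IF NOT EXISTS siteid_lookup (
    id INTEGER PRIMARY KEY,      -- small (1-4 byte) local reference
    site_id BLOB NOT NULL UNIQUE -- original 16 byte site id
  );
`);

// Intern a 16 byte site_id locally, returning the small id to store in
// clock rows. The insert is a no-op if the site_id is already interned.
function internSiteId(siteId: Buffer): number {
  db.prepare(
    "INSERT INTO siteid_lookup (site_id) VALUES (?) ON CONFLICT DO NOTHING"
  ).run(siteId);
  const row = db
    .prepare("SELECT id FROM siteid_lookup WHERE site_id = ?")
    .get(siteId) as { id: number };
  return row.id;
}

// Resolve a local reference back to the real site_id before sending a
// change over the wire.
function resolveSiteId(id: number): Buffer {
  const row = db
    .prepare("SELECT site_id FROM siteid_lookup WHERE id = ?")
    .get(id) as { site_id: Buffer };
  return row.site_id;
}
```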
Size reduction:
- 16 byte primary key -> 1-4 bytes
- 16 byte site id -> 1-4 bytes
- 5-10 byte column name -> 1 byte

42 bytes worst case -> 9 bytes worst case (4 + 4 + 1)
Other structures could get us further compression, but we may lose too much query power.
brainstorm alt structure:
pks | packed_col_versions | packed_db_versions | packed_site_id_refs | packed_seqs
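Roughly, one row of that brainstormed shape might look like the following sketch (field names taken from the line above; how the per-column values get packed into each blob is left undecided):

```ts
// Sketch of the brainstormed "one clock row per table row" shape.
// Each packed_* blob would hold one entry per column of the row.
interface PackedClockRow {
  pks: Uint8Array;                 // primary key value(s) identifying the row
  packed_col_versions: Uint8Array; // per-column col_version values, packed
  packed_db_versions: Uint8Array;  // per-column db_version values, packed
  packed_site_id_refs: Uint8Array; // per-column refs into siteid_lookup
  packed_seqs: Uint8Array;         // per-column seq values, packed
}
```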
Doing re-insertion made me think a bit more about all the metadata being tracked. It looks rather excessive and I think we can cut it down a bunch by using lookup tables.
@jeromegn, @Azarattum - curious if either of you have thoughts on:
These changes should be internal only and transparent to the user except where column names
are replaced by 1 byte column indices.
If someone has more than 256 columns in a table... well they'll be out of luck.
I was thinking of ways to compress deltas when sending them over the wire. The primary optimisation I came up with (apart from lookup tables) is to skip repeated values across consecutive changes. For example, when we change a bunch of stuff in a single table, we only mention the table name once and all the later changes assume the previous name. This works great for transmitting changes; I'm not sure how fast/reliable this approach would be for storing them.
This:

| col | table | pk | val |
|---|---|---|---|
| likes | users | 1 | 42 |
| NULL | NULL | 2 | 42 |
| NULL | NULL | 3 | 42 |
| perms | roles | 1 | 1337 |
| rights | NULL | NULL | 1337 |
is equivalent to this:

| col | table | pk | val |
|---|---|---|---|
| likes | users | 1 | 42 |
| likes | users | 2 | 42 |
| likes | users | 3 | 42 |
| perms | roles | 1 | 1337 |
| rights | roles | 1 | 1337 |
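A rough sketch of that order-dependent encoding (the types and function names here are hypothetical, not the actual wire format): on encode, any field equal to the previous change is nulled out; on decode it is restored from the running previous change.

```ts
// Sketch only -- not the actual cr-sqlite wire format. Fields equal to the
// previous change are nulled out on encode and restored on decode.
interface Change {
  table: string;
  col: string;
  pk: number;
  val: number;
}

// null means "same as the previous change"
type PackedChange = {
  table: string | null;
  col: string | null;
  pk: number | null;
  val: number;
};

function encode(changes: Change[]): PackedChange[] {
  const out: PackedChange[] = [];
  let prev: Change | null = null;
  for (const c of changes) {
    out.push({
      table: prev && c.table === prev.table ? null : c.table,
      col: prev && c.col === prev.col ? null : c.col,
      pk: prev && c.pk === prev.pk ? null : c.pk,
      val: c.val,
    });
    prev = c;
  }
  return out;
}

function decode(packed: PackedChange[]): Change[] {
  const out: Change[] = [];
  let prev: Change | undefined;
  for (const p of packed) {
    // The first packed change is always fully populated, so prev! is only
    // dereferenced when a previous change actually exists.
    const full: Change = {
      table: p.table ?? prev!.table,
      col: p.col ?? prev!.col,
      pk: p.pk ?? prev!.pk,
      val: p.val,
    };
    out.push(full);
    prev = full;
  }
  return out;
}
```

(One caveat of this exact sketch: it can't tell "same as previous" apart from a value that is genuinely NULL, so a real encoding would need a distinct marker or a presence bitmap.)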
The major downside is that we rely on order. It is fine within a single sync packet, but might not be great for long-term storage. What do you think?
> If someone has more than 256 columns in a table... well they'll be out of luck.
256 columns per table is fine, imo
Skipping over cells with the same value is a good idea. Easy in the network code, as you point out. For the persisted data... I'd have to figure out how much this hurts performance, as queries for changes would require scanning back several rows to find a value.
We can actually encode column names as varints without creating a lookup table by leveraging the internal `TableInfo` data structure of cr-sqlite:
https://github.com/vlcn-io/cr-sqlite/blob/main/core/src/tableinfo.h#L14-L35
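For illustration only (not cr-sqlite's actual encoding): the column's position in `TableInfo` is what would get varint-encoded, e.g. as an unsigned LEB128-style varint so that small indices cost a single byte.

```ts
// Illustrative only: encode a column index (its position in TableInfo)
// as an unsigned LEB128-style varint. Indices < 128 take a single byte.
function encodeVarint(n: number): Uint8Array {
  const bytes: number[] = [];
  do {
    let b = n & 0x7f;
    n >>>= 7;
    if (n !== 0) b |= 0x80; // high bit set = more bytes follow
    bytes.push(b);
  } while (n !== 0);
  return Uint8Array.from(bytes);
}

// Decodes the first varint found at the start of the buffer.
function decodeVarint(buf: Uint8Array): number {
  let result = 0;
  let shift = 0;
  for (const b of buf) {
    result |= (b & 0x7f) << shift;
    if ((b & 0x80) === 0) break;
    shift += 7;
  }
  return result;
}

// e.g. if a table's columns were ["id", "likes", "perms"], "likes" would go
// over the wire as the single byte 0x01 instead of a 5 byte column name.
```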
This poses some issues during schema migrations, however, since the indices in `TableInfo` will not be stable when users add columns to the middle of their table definitions.
To deal with this we will need to:
I like this tradeoff since:
We can vastly compress the table format (5x in most cases) by encoding all metadata for a row in a single clock row.
Encoding in a single row, however, loses quite a bit of expressiveness when selecting changes. An alternate route that still cuts quite a bit of fat is to move large values into lookup tables. See comment below.