vaticle / typedb

TypeDB: the polymorphic database powered by types
https://typedb.com
Mozilla Public License 2.0

Database thing statistics #7064

Closed flyingsilverfin closed 1 month ago

flyingsilverfin commented 1 month ago

Usage and product changes

We implement the architecture, and most of the code, required for tracking data statistics. The statistics will primarily be used for query planning. We achieved several design goals:

1) Not scanning the entire storage to update statistics.
2) Allowing access to statistics for old versions, so that even very old time-travel/MVCC usage does not suffer too much performance degradation.
3) Not writing statistics to the storage layer in RocksDB, since updating the statistics keys on every transaction can degrade performance, and statistics can (and should) be a primarily in-memory structure.

However, we accept the trade-off that the statistics are not always up to date. The update frequency is a parameter we can optimise.

In the end, there is a single database-wide Statistics struct, which is immutable and replaced periodically. We update it by scanning the data WAL records written since the last update and summing their count deltas, then atomically replacing the Statistics struct held by the database. The statistics are checkpointed into the WAL, which also allows us to time-travel to older snapshots and find a relatively accurate statistics entry from near that version.
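The update cycle described above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the code from this PR: `CommitRecord`, `count_deltas`, `updated_with` and `refresh_statistics` are hypothetical names, and the `RwLock<Arc<_>>` swap is just one possible way to replace the struct atomically.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

/// Hypothetical commit record as it might be read back from the WAL.
pub struct CommitRecord {
    pub sequence_number: u64,
    /// Per-type change in instance count made by this commit (may be negative).
    pub count_deltas: HashMap<String, i64>,
}

/// Immutable snapshot of counts, valid as of `sequence_number`.
#[derive(Clone, Debug, Default, PartialEq)]
pub struct Statistics {
    pub sequence_number: u64,
    pub counts: HashMap<String, i64>,
}

impl Statistics {
    /// Build a new snapshot by summing the deltas of all records newer than this one.
    /// The existing snapshot is never mutated.
    pub fn updated_with(&self, records: &[CommitRecord]) -> Statistics {
        let mut next = self.clone();
        for record in records.iter().filter(|r| r.sequence_number > self.sequence_number) {
            for (type_label, delta) in &record.count_deltas {
                *next.counts.entry(type_label.clone()).or_insert(0) += delta;
            }
            next.sequence_number = next.sequence_number.max(record.sequence_number);
        }
        next
    }
}

/// Hypothetical database handle holding the current snapshot.
pub struct Database {
    statistics: RwLock<Arc<Statistics>>,
}

impl Database {
    pub fn new() -> Self {
        Database { statistics: RwLock::new(Arc::new(Statistics::default())) }
    }

    /// Readers get a cheap handle to whichever snapshot is current.
    pub fn statistics(&self) -> Arc<Statistics> {
        self.statistics.read().unwrap().clone()
    }

    /// Periodic update: compute the next snapshot, then swap it in.
    pub fn refresh_statistics(&self, records: &[CommitRecord]) {
        let next = Arc::new(self.statistics().updated_with(records));
        *self.statistics.write().unwrap() = next;
    }
}
```

Readers that obtained a snapshot before a refresh keep a consistent view, because the old `Arc<Statistics>` stays alive until they drop it.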

This means we are solidifying the requirement that WAL cleaning and MVCC compaction are tied to the same time-scale: both are required for going back to previous data versions correctly.
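To illustrate why the two are coupled, here is a small sketch of looking up a checkpointed statistics entry near an old version. The types and method names (`StatisticsCheckpoint`, `CheckpointLog`, `nearest_at_or_before`, `clean_before`) are hypothetical, and an in-memory `BTreeMap` stands in for the checkpoints stored in the WAL.

```rust
use std::collections::BTreeMap;

/// Hypothetical statistics checkpoint, keyed by the sequence number it was taken at.
#[derive(Clone, Debug, PartialEq)]
pub struct StatisticsCheckpoint {
    pub sequence_number: u64,
    pub total_count: u64,
}

/// Stand-in for the statistics checkpoints kept in the WAL.
#[derive(Default)]
pub struct CheckpointLog {
    checkpoints: BTreeMap<u64, StatisticsCheckpoint>,
}

impl CheckpointLog {
    pub fn record(&mut self, checkpoint: StatisticsCheckpoint) {
        self.checkpoints.insert(checkpoint.sequence_number, checkpoint);
    }

    /// Latest checkpoint taken at or before `version`, if one is still retained.
    pub fn nearest_at_or_before(&self, version: u64) -> Option<&StatisticsCheckpoint> {
        self.checkpoints.range(..=version).next_back().map(|(_, checkpoint)| checkpoint)
    }

    /// Simulates WAL cleaning: drop every checkpoint older than `oldest_retained`.
    pub fn clean_before(&mut self, oldest_retained: u64) {
        self.checkpoints = self.checkpoints.split_off(&oldest_retained);
    }
}
```

In this sketch, once `clean_before(20)` has run, a time-travel read at version 15 no longer finds any statistics entry, even if MVCC still retains the data for that version.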

Future work

We could find that reading from the WAL to update statistics is a bottleneck. We can solve several problems at once by extracting the commit data "cache" from the IsolationManager into the DurabilityClient, which can then be shared across isolation and statistics operations.
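One possible shape for such a shared cache is sketched below. It is speculative and not part of this PR: the `DurabilityClient` here is a toy stand-in with hypothetical methods (`append`, `commits_after`), the "WAL" is an in-memory vector, and sequence numbers are assumed to be contiguous and increasing.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};

#[derive(Clone, Debug, PartialEq)]
pub struct CommitData {
    pub sequence_number: u64,
    pub count_delta: i64,
}

/// Toy durability layer: a full log plus a bounded cache of recent commits.
pub struct DurabilityClient {
    wal: Mutex<Vec<CommitData>>,         // stands in for the on-disk WAL
    recent: Mutex<VecDeque<CommitData>>, // in-memory cache shared by all callers
    capacity: usize,
}

impl DurabilityClient {
    pub fn new(capacity: usize) -> Arc<Self> {
        Arc::new(DurabilityClient {
            wal: Mutex::new(Vec::new()),
            recent: Mutex::new(VecDeque::new()),
            capacity,
        })
    }

    /// Append a commit to the log and remember it in the bounded cache.
    pub fn append(&self, data: CommitData) {
        self.wal.lock().unwrap().push(data.clone());
        let mut recent = self.recent.lock().unwrap();
        recent.push_back(data);
        if recent.len() > self.capacity {
            recent.pop_front();
        }
    }

    /// Commits newer than `after`, plus a flag saying whether the cache alone could answer.
    pub fn commits_after(&self, after: u64) -> (Vec<CommitData>, bool) {
        {
            let recent = self.recent.lock().unwrap();
            // The cache suffices only if it still holds the first commit after `after`.
            let covers = recent.front().map_or(false, |oldest| oldest.sequence_number <= after + 1);
            if covers {
                let hits: Vec<CommitData> =
                    recent.iter().filter(|c| c.sequence_number > after).cloned().collect();
                return (hits, true);
            }
        }
        // Otherwise fall back to reading the (simulated) WAL.
        let wal = self.wal.lock().unwrap();
        let all: Vec<CommitData> =
            wal.iter().filter(|c| c.sequence_number > after).cloned().collect();
        (all, false)
    }
}
```

Both an isolation check and a statistics update could hold a clone of the same `Arc<DurabilityClient>` and call `commits_after`, so recent commits would be served from memory for both.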

Implementation

Architecture


UX

We create a more consistent and comprehensive error structure for the cases where any of storage, WAL, or checkpoint is not present on bootup.

The presence or absence of the storage directory is irrelevant to bootup/recovery: both cases follow the same path. When it is present, recovery is simply faster, since we have to copy fewer files from the checkpoints.
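A rough sketch of what such a bootup decision might look like is given below. Everything in it is an assumption made for illustration: the names (`OnDisk`, `Bootup`, `BootupError`, `plan_bootup`) are hypothetical, and the specific rules for which missing component produces which error are invented, since the description does not spell them out. The only behaviour taken from the text is that the storage directory does not change the recovery path.

```rust
/// What was found on disk at bootup (hypothetical).
pub struct OnDisk {
    pub storage: bool,
    pub wal: bool,
    pub checkpoint: bool,
}

#[derive(Debug, PartialEq)]
pub enum Bootup {
    /// Nothing on disk: start a fresh database (assumed rule).
    CreateNew,
    /// Recover from checkpoint and WAL; existing storage files only save copying.
    Recover { reuse_storage_files: bool },
}

#[derive(Debug, PartialEq)]
pub enum BootupError {
    WalMissing,        // assumed rule
    CheckpointMissing, // assumed rule
}

pub fn plan_bootup(found: OnDisk) -> Result<Bootup, BootupError> {
    match (found.wal, found.checkpoint) {
        (false, false) if !found.storage => Ok(Bootup::CreateNew),
        (false, _) => Err(BootupError::WalMissing),
        (true, false) => Err(BootupError::CheckpointMissing),
        // Same recovery path whether or not the storage directory exists.
        (true, true) => Ok(Bootup::Recover { reuse_storage_files: found.storage }),
    }
}
```

With these assumed rules, `plan_bootup` returns `Recover` both with and without a storage directory, and only the `reuse_storage_files` flag differs.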
