MinaProtocol / mina

Mina is a cryptocurrency protocol with a constant size blockchain, improving scaling while maintaining decentralization and security.
https://minaprotocol.com
Apache License 2.0

Document Mina daemon's databases #15767

Open georgeee opened 4 months ago

georgeee commented 4 months ago

Produce a document describing the databases that the daemon uses.

The expected result of this task is a table of databases and their descriptions.

Table's rows, I imagine, would be:

Relates to #13971

georgeee commented 4 months ago

This document is needed to analyze usage of RocksDB in block production/processing.

Major focus should be on frontier and ledger-related DBs.

georgeee commented 4 months ago

AFAIU all DBs we have are key-value storages.

And we probably have more than one key-value space within some of these DBs, so let's have one "key-value space" per row in the resulting table.

To start with the task, I suggest launching a Mina node that connects to mainnet and then checking what you have in .mina-config. Ideally we'd be able to inspect every DB with a manual tool and check that no keys are unaccounted for, but that might be too tedious, so just investigating usages in the codebase would probably be alright.
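As a starting point for that inventory, here is a minimal sketch that walks a config directory and guesses the storage engine of each subdirectory from its marker files. It relies on RocksDB directories containing a CURRENT file plus MANIFEST-* files, and on LMDB environments containing data.mdb. The names classify and scan are hypothetical helpers, not part of the Mina codebase.

```ocaml
(* Hypothetical helper, not part of the Mina codebase: guess which storage
   engine owns a directory from the files that engine normally creates. *)
type engine = Rocksdb | Lmdb | Unknown

let has_prefix ~prefix s =
  String.length s >= String.length prefix
  && String.sub s 0 (String.length prefix) = prefix

(* RocksDB keeps a CURRENT file and MANIFEST-* files; LMDB keeps data.mdb. *)
let classify (files : string list) : engine =
  if List.mem "data.mdb" files then Lmdb
  else if
    List.mem "CURRENT" files
    && List.exists (has_prefix ~prefix:"MANIFEST-") files
  then Rocksdb
  else Unknown

let engine_to_string = function
  | Rocksdb -> "rocksdb"
  | Lmdb -> "lmdb"
  | Unknown -> "unknown"

(* Walk [root] recursively and return (path, engine) for every directory
   that looks like a database. Unreadable directories are skipped. *)
let rec scan (root : string) : (string * engine) list =
  let entries =
    try Array.to_list (Sys.readdir root) with Sys_error _ -> []
  in
  let here =
    match classify entries with Unknown -> [] | e -> [ (root, e) ]
  in
  let subdirs =
    List.filter
      (fun name ->
        try Sys.is_directory (Filename.concat root name)
        with Sys_error _ -> false)
      entries
  in
  here @ List.concat_map (fun d -> scan (Filename.concat root d)) subdirs

(* Print one "path <TAB> engine" line per detected database. *)
let report (root : string) : unit =
  List.iter
    (fun (path, e) -> Printf.printf "%s\t%s\n" path (engine_to_string e))
    (scan root)
```

Calling report ".mina-config" (adjusting the path to wherever the node keeps its config directory) would list the candidate databases. This only identifies the engine; it says nothing about the key-value spaces inside each DB.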

What I see on a mainnet's node:

Geometer1729 commented 4 months ago

root

The root/snarked_ledger location is defined here: https://github.com/MinaProtocol/mina/blob/4495af5caea5e1bb2f98f92592c065f93a586ade/src/lib/transition_frontier/persistent_root/persistent_root.ml#L40

This path is given to Ledger.Db, where Ledger is Mina_ledger.Ledger. So the snarked_ledger Db type is defined here: https://github.com/MinaProtocol/mina/blob/4495af5caea5e1bb2f98f92592c065f93a586ade/src/lib/mina_ledger/ledger.ml#L108-L120

which calls this functor: https://github.com/MinaProtocol/mina/blob/4495af5caea5e1bb2f98f92592c065f93a586ade/src/lib/merkle_ledger/database.ml#L1

On this module: https://github.com/MinaProtocol/mina/blob/4495af5caea5e1bb2f98f92592c065f93a586ade/src/lib/mina_ledger/ledger.ml#L89-L106

The Kvdb.t type it takes seems to be a generic database type with Bigstring.t for both key and value. But from calls to Kvdb like this: https://github.com/MinaProtocol/mina/blob/4495af5caea5e1bb2f98f92592c065f93a586ade/src/lib/merkle_ledger/database.ml#L101-L102 it looks like the keys are serialized Location.t values, which I think is this enum: https://github.com/MinaProtocol/mina/blob/4495af5caea5e1bb2f98f92592c065f93a586ade/src/lib/merkle_ledger/location.ml#L54-L55

and the value appears to be the Token_id.Set.t type from the inputs. I'm not sure where Token_id comes from; there are a few opens at the top of the file.

The root/root location is defined here and seems to store a hash of the genesis state.

Geometer1729 commented 4 months ago

trust

src/lib/trust_system/peer_trust.ml

The key is Peer_id.t, which I believe refers to this: https://github.com/MinaProtocol/mina/blob/10a0bf9b8b5c27407b349f02ecaeea964e14690e/src/lib/trust_system/peer_trust.ml#L252-L256

and the value is Record.t, which I believe refers to this: https://github.com/MinaProtocol/mina/blob/10a0bf9b8b5c27407b349f02ecaeea964e14690e/src/lib/trust_system/record.ml#L5-L12

Geometer1729 commented 4 months ago

frontier

It looks like this module: https://github.com/MinaProtocol/mina/blob/4495af5caea5e1bb2f98f92592c065f93a586ade/src/lib/transition_frontier/persistent_frontier/database.ml#L212

calls this functor: https://github.com/MinaProtocol/mina/blob/4495af5caea5e1bb2f98f92592c065f93a586ade/src/lib/rocksdb/serializable.ml#L53

With this argument: https://github.com/MinaProtocol/mina/blob/4495af5caea5e1bb2f98f92592c065f93a586ade/src/lib/transition_frontier/persistent_frontier/database.ml#L29

I think that results in the key type being this private Enum: https://github.com/MinaProtocol/mina/blob/4495af5caea5e1bb2f98f92592c065f93a586ade/src/lib/transition_frontier/persistent_frontier/database.ml#L49-L56

Here Root_data.Minimal.Stable.V2 refers to this type: https://github.com/MinaProtocol/mina/blob/4495af5caea5e1bb2f98f92592c065f93a586ade/src/lib/transition_frontier/frontier_base/root_data.ml#L114

and I think this defines something like a type family, where the value type depends on the key: https://github.com/MinaProtocol/mina/blob/4495af5caea5e1bb2f98f92592c065f93a586ade/src/lib/transition_frontier/persistent_frontier/database.ml#L74-L86
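To illustrate what a key type that determines its value type can look like in OCaml, here is a minimal GADT sketch. Everything in it is hypothetical: the constructor names, the payload types, the in-memory Hashtbl standing in for RocksDB, and the use of Marshal for serialization are all assumptions made for illustration, not Mina's actual schema or encoding.

```ocaml
(* Hypothetical schema: the type parameter of a key says what type of value
   is stored under it. Names and payload types are illustrative only. *)
type _ key =
  | Db_version : int key
  | Root_hash : string key
  | Block : string -> string key (* keyed by a block hash *)
  | Children : string -> string list key (* child hashes of a block *)

(* Untyped backing store standing in for RocksDB: bytes in, bytes out. *)
let store : (string, string) Hashtbl.t = Hashtbl.create 16

(* Each key serializes to a distinct byte string; the prefix tag separates
   the key-value spaces that share the single underlying store. *)
let encode_key : type a. a key -> string = function
  | Db_version -> "v"
  | Root_hash -> "r"
  | Block h -> "b:" ^ h
  | Children h -> "c:" ^ h

(* The value type [a] is fixed by the key, so callers cannot store a value
   of the wrong type under a given key. Marshal is a stand-in serializer. *)
let set : type a. a key -> a -> unit =
 fun k v -> Hashtbl.replace store (encode_key k) (Marshal.to_string v [])

let get : type a. a key -> a option =
 fun k ->
  match Hashtbl.find_opt store (encode_key k) with
  | None -> None
  | Some bytes -> Some (Marshal.from_string bytes 0)

let () =
  set Db_version 2 ;
  set (Children "abc") [ "def"; "ghi" ]
```

With a layout like this, each constructor of the key type would correspond to one "key-value space" row in the table this issue asks for.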

Geometer1729 commented 4 months ago

mina_net2/blocksdb

Based on the language server hovers in this code: https://github.com/MinaProtocol/mina/blob/4495af5caea5e1bb2f98f92592c065f93a586ade/src/lib/block_storage/block_storage.ml#L113-L118 it seems like the key here is Blake2.t and the values are Mina_block.Body.t. I can't find where the path for this database is defined, so I'm only inferring that this is the right module from the fact that it uses LMDB and has a fitting name.

Geometer1729 commented 4 months ago

genesis

This comment seems to confirm that the ledger here is the genesis ledger in RocksDB: https://github.com/MinaProtocol/mina/blob/5dcc6777f10c940aae446cbb7877ca20edf0dcd9/src/lib/genesis_ledger_helper/genesis_ledger_helper.ml#L292-L308

The language server links that commit to this: https://github.com/MinaProtocol/mina/blob/5dcc6777f10c940aae446cbb7877ca20edf0dcd9/src/lib/merkle_mask/masking_merkle_tree.ml#L617-L640

The database part of that seems to be Base.set_batch parent account_data ; which, again following the language server, seems to come from here: https://github.com/MinaProtocol/mina/blob/5dcc6777f10c940aae446cbb7877ca20edf0dcd9/src/lib/merkle_mask/inputs_intf.mli#L29-L40

I'm not entirely sure what's going on, but this seems to point to the same code as root/snarked_ledger. So I checked whether the dumped contents of the two databases look comparable, hoping that they have the same types, which would make sense given that both databases represent snapshots of the ledger. The outputs are extremely similar. Last few lines of genesis_ledger:

$tid!0x3BD4E3A4BEC457DC36730A0295CB9542CF01AEDC6EC0F3A03240314649D90CF6 ==> rb5V̛,IA~:
$tid!0x3BD55C98C4DF679BF75D58287C49464CF382C94E0F493339E2C3AD996EEAF377 ==>
Q       1Hz>\~)4jW<8m
$tid!0x3BD611554C36EADB0C2FF4FD7831B035C1150534CE07515C901125CE1D7F82E8 ==> :o<p~e_Lg=($k<
$tid!0x3BD8498E090010ED1FCE99004B9D9E3B0A8E7C91F71B9283B333277236A54EFA ==> s-@_N6    R)pMI~
id!0x3BDA4F97891922FDEB5018F7F1C1C8756191778C3B07DDC685B4D9321263A305 ==> O3{W2~
"X"i;dV]        K6
$tid!0x3BDA9A6DFF6496A75B097EB7BB7949CB27A2374D505E4D1E3B44D032B4FEACA0 ==> 4)x|Au]y_6$fLa-
$tid!0x3BDAF9FF99C601137912EA4191D6EF2E3DCAD5A4F508B3584A8972CAC1A8B04D ==> fh0y)yUxP0

Last few lines of snarked_ledger:

$tid!0x3BD4E3A4BEC457DC36730A0295CB9542CF01AEDC6EC0F3A03240314649D90CF6 ==> rb5V̛,IA~:
$tid!0x3BD55C98C4DF679BF75D58287C49464CF382C94E0F493339E2C3AD996EEAF377 ==>
Q       1Hz>\~)4jW<8m
$tid!0x3BD611554C36EADB0C2FF4FD7831B035C1150534CE07515C901125CE1D7F82E8 ==> :o<p~e_Lg=($k<
$tid!0x3BD8498E090010ED1FCE99004B9D9E3B0A8E7C91F71B9283B333277236A54EFA ==> s-@_N6    R)pMI~
id!0x3BDA4F97891922FDEB5018F7F1C1C8756191778C3B07DDC685B4D9321263A305 ==> O3{W2~
"X"i;dV]        K6
$tid!0x3BDA9A6DFF6496A75B097EB7BB7949CB27A2374D505E4D1E3B44D032B4FEACA0 ==> 4)x|Au]y_6$fLa-
$tid!0x3BDAF9FF99C601137912EA4191D6EF2E3DCAD5A4F508B3584A8972CAC1A8B04D ==> fh0y)yUxP0

The two dumps are not completely identical according to diff, but with several lines in common it feels like a safe bet that the types are the same. So the key should be Location.t and the value should be Token_id.Set.t, as in root/snarked_ledger.
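A rough way to quantify that overlap is to compare the key sets of the two dumps rather than eyeballing the diff. The sketch below assumes each record starts a line shaped like "KEY ==> VALUE", as in the excerpts above, and treats any line without the separator as spillover from a binary value. The helper names are hypothetical. Note that it compares keys only, so it supports the claim about the key type and leaves the value type unverified.

```ocaml
module S = Set.Make (String)

(* Find the first occurrence of [sep] in [line], if any. *)
let find_sub (line : string) (sep : string) : int option =
  let n = String.length line and m = String.length sep in
  let rec go i =
    if i + m > n then None
    else if String.sub line i m = sep then Some i
    else go (i + 1)
  in
  go 0

(* Keep only the key of lines shaped like "KEY ==> VALUE". Lines without
   the separator are assumed to be continuation bytes of a binary value. *)
let keys_of_dump (lines : string list) : S.t =
  List.fold_left
    (fun acc line ->
      match find_sub line " ==>" with
      | Some i -> S.add (String.sub line 0 i) acc
      | None -> acc)
    S.empty lines

(* Returns the number of keys that are (shared, only in a, only in b). *)
let compare_dumps (a : string list) (b : string list) : int * int * int =
  let ka = keys_of_dump a and kb = keys_of_dump b in
  ( S.cardinal (S.inter ka kb)
  , S.cardinal (S.diff ka kb)
  , S.cardinal (S.diff kb ka) )
```

If the first component dominates the other two, the two databases very likely share a key format. Because values containing " ==>" or newlines could confuse this line-based parsing, the counts should be read as an estimate.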

Geometer1729 commented 4 months ago

wallets/store/<..>

AFAICT there are no databases in here, just keys in individual files. Based on this, it looks like the private keys are stored separately from the public keys, and the filenames of the private keys depend on the public keys: https://github.com/MinaProtocol/mina/blob/5dcc6777f10c940aae446cbb7877ca20edf0dcd9/src/lib/secrets/wallets.ml#L18-L30

Here it seems to parse the key by just reading plain lines from a file: https://github.com/MinaProtocol/mina/blob/5dcc6777f10c940aae446cbb7877ca20edf0dcd9/src/lib/secrets/wallets.ml#L58

I was able to run a node with a wallet, but the wallets/store/ directory remained empty for at least an hour or two. I assume I would need to actually produce a block for this code to run, so I haven't been able to confirm that this is the actual behavior of a node.