cartesi / dave

Cartesi fraud-proof system
Apache License 2.0

Dave Rollups Node #42

Open stephenctw opened 1 month ago

stephenctw commented 1 month ago

Based on design diagram: https://miro.com/app/board/uXjVKKcSTdw=/

stephenctw commented 1 month ago

@GCdePaula I figured it's not too hard to use SQLite in the StateManager (building on the work Sofia did earlier), so I abandoned the InMemory version of it. Does the schema make sense to you? Some parts are still missing, but I think we can fill in the blanks later as we keep working toward the node. What do you think?

1. Constants Table:
- key: TEXT, NOT NULL, PRIMARY KEY
- value: TEXT, NOT NULL
2. Epochs Table:
- epoch_number: INTEGER, NOT NULL, PRIMARY KEY
- sealed: INTEGER
- settled: BOOLEAN, NOT NULL
3. Inputs Table:
- epoch_number: INTEGER, NOT NULL
- input_index: INTEGER, NOT NULL
- input: BLOB, NOT NULL
- Composite PRIMARY KEY (epoch_number, input_index)
4. States Table:
- epoch_number: INTEGER, NOT NULL
- input_index: INTEGER, NOT NULL
- state: BLOB, NOT NULL
- Composite PRIMARY KEY (epoch_number, input_index)
5. Snapshots Table:
- epoch_number: INTEGER, NOT NULL
- input_index: INTEGER, NOT NULL
- path: TEXT, NOT NULL
- Composite PRIMARY KEY (epoch_number, input_index)
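
To make the proposal easier to review, here is a minimal sketch of the five tables as SQLite DDL, created through Python's standard sqlite3 module. The table and column names come from the list above; the lowercase table names, the helper open_db, and the sample rows are assumptions for illustration, not the StateManager's actual code.

```python
import sqlite3

# DDL transcribed from the proposed schema. "key" is quoted because KEY is
# also an SQL keyword. This is a sketch, not the real migration.
SCHEMA = """
CREATE TABLE constants (
    "key" TEXT NOT NULL PRIMARY KEY,
    value TEXT NOT NULL
);
CREATE TABLE epochs (
    epoch_number INTEGER NOT NULL PRIMARY KEY,
    sealed       INTEGER,
    settled      BOOLEAN NOT NULL
);
CREATE TABLE inputs (
    epoch_number INTEGER NOT NULL,
    input_index  INTEGER NOT NULL,
    input        BLOB NOT NULL,
    PRIMARY KEY (epoch_number, input_index)
);
CREATE TABLE states (
    epoch_number INTEGER NOT NULL,
    input_index  INTEGER NOT NULL,
    state        BLOB NOT NULL,
    PRIMARY KEY (epoch_number, input_index)
);
CREATE TABLE snapshots (
    epoch_number INTEGER NOT NULL,
    input_index  INTEGER NOT NULL,
    path         TEXT NOT NULL,
    PRIMARY KEY (epoch_number, input_index)
);
"""


def open_db(path=":memory:"):
    """Open a database (hypothetical helper) and create the tables."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


conn = open_db()
# Sample rows: an unsealed, unsettled epoch 0 holding a single input.
conn.execute("INSERT INTO epochs VALUES (0, NULL, 0)")
conn.execute("INSERT INTO inputs VALUES (0, 0, ?)", (b"\x01",))

tables = sorted(
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
print(tables)
```

The composite primary keys mean a second insert with the same (epoch_number, input_index) pair is rejected with sqlite3.IntegrityError.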

GCdePaula commented 1 month ago

Great work! It's progressing quite nicely :)

Some comments on the SQL:

1. Constants Table:

This is the only table I'm unsure about, mainly because of its weak/loose typing. Let's discuss this one more carefully. In the diagram, we have:

- Machine initial state snapshot
- Block created
- Contract addresses

Could the machine initial state snapshot go in the snapshots table? I don't know. For the contract addresses, does it make sense to have a dedicated table, like an address book? Also, these configurations are supplied when the process starts, probably in a JSON/TOML file, so now I'm wondering whether we need to replicate them at all. But I also like the idea of having them in a table.

2. Epochs Table:

I think sealed should also be NOT NULL: this table should only contain epochs that have a boundary. I also think we can use longer names, like block_sealed or end_block.

3. Inputs Table:

The input index here is the index of the input within its epoch. I think we should add that to the column name to be clear, and think about how we'll get that index in the blockchain_reader.

4. States Table:

I'd extend the name of this table to include something like "state", "hashes", "machine", or "execution": maybe machine_state_hashes, execution_state_hashes, or just state_hashes. I'd also rename the column state to state_hash or something similar. As with the Inputs table, the input index here is the index within the epoch.


I've never used SQL much. Is there a way to tell SQLite that some indices are monotonic? Intuitively, that should speed up the queries. For tables whose primary key is made of two indices, is there a way to tell SQLite there's an order, sorting first by one key and then by the other?
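
On the ordering question: as far as I know SQLite has no way to declare a column monotonic, but a composite PRIMARY KEY on an ordinary table is backed by a B-tree index sorted by the first key column and then by the second. The sketch below uses the Inputs table from the proposal with made-up rows inserted out of order, then asks SQLite for its query plan. The exact plan text varies between SQLite versions, so treat the printed string as indicative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE inputs (
        epoch_number INTEGER NOT NULL,
        input_index  INTEGER NOT NULL,
        input        BLOB NOT NULL,
        PRIMARY KEY (epoch_number, input_index)
    )
    """
)

# Made-up rows, deliberately inserted out of order.
rows = [(1, 2, b"c"), (0, 1, b"b"), (1, 0, b"a"), (0, 0, b"d"), (1, 1, b"e")]
conn.executemany("INSERT INTO inputs VALUES (?, ?, ?)", rows)

QUERY = "SELECT input_index FROM inputs WHERE epoch_number = ? ORDER BY input_index"

# The last column of each EXPLAIN QUERY PLAN row is the human-readable detail.
plan = " | ".join(
    row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + QUERY, (1,))
)
indices = [row[0] for row in conn.execute(QUERY, (1,))]

print(plan)     # expected to mention a SEARCH on the primary-key index
print(indices)  # input indices of epoch 1, in ascending order
```

If the plan shows a SEARCH on the primary-key index and no temporary B-tree for the ORDER BY, the rows are read straight out of the index in (epoch_number, input_index) order, so no extra hint is needed for this access pattern.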

stephenctw commented 1 month ago

I agree that the contents of the Constants table can just as well be supplied through command-line arguments or a configuration file. There's also a value that may not fit into any of the tables above: latest_processed_block. I'm also not sure how we should use the metacounter.

GCdePaula commented 1 month ago

We can change the metacounter field of the snapshot to just epoch_number and input_number. To recover the metacounter from input_number we just do input_number << (a+b).

The metacounter contains info on "where we are": which input, which big instruction, which micro-instruction. This machine runner won't touch the uarch, so all those bits are zero. To simplify, we should also only take snapshots after a whole input has been processed, so all those bits are also zero.
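
The relationship between input_number and the metacounter can be sketched as below. This assumes a and b are the bit widths of the big-instruction and micro-instruction fields, laid out as input | big instruction | micro-instruction. The widths 48 and 16 and all function names are hypothetical, chosen only to make the example runnable.

```python
# Hypothetical field widths; the real values are not given in this thread.
A_BITS = 48  # "a": assumed width of the big-instruction field
B_BITS = 16  # "b": assumed width of the micro-instruction field


def to_metacounter(input_number: int) -> int:
    """Metacounter at an input boundary: big and micro fields are zero."""
    return input_number << (A_BITS + B_BITS)


def from_metacounter(metacounter: int):
    """Split a metacounter into (input_number, big_step, micro_step)."""
    micro_step = metacounter & ((1 << B_BITS) - 1)
    big_step = (metacounter >> B_BITS) & ((1 << A_BITS) - 1)
    input_number = metacounter >> (A_BITS + B_BITS)
    return input_number, big_step, micro_step


# A snapshot taken after input 5 round-trips with both lower fields at zero.
snapshot_counter = to_metacounter(5)
print(from_metacounter(snapshot_counter))
```

Because snapshots are only taken at input boundaries, storing input_number alone loses no information: the lower a+b bits are always zero.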

Thinking about it now, we could even drop the metacounter from this part of the code; it's only truly needed in dispute resolution.

GCdePaula commented 1 month ago

I think we should skip the full state snapshots for now and leave it for later.

GCdePaula commented 1 month ago

I think we can delete the settled (BOOLEAN, NOT NULL) column too. Not sure.

stephenctw commented 1 month ago

I think I need some help on the machine-runner/machine-bindings front. I'm not sure if the break/yield reasons are handled properly.

GCdePaula commented 1 month ago

Ok! I've been working on PR #28, and I think I might have changed how this is handled. I'll clean it up and push it.