essential-contributions / essential-node

Derive current head from essential builder and sync state between nodes
Apache License 2.0

feat: Add cascading db for dry runs #116

Closed · freesig closed this 1 month ago

freesig commented 1 month ago

Dry running validate requires the block to be in the database, but we don't want to put the dry-run block in the actual database. This PR allows adding the block to an in-memory database that cascades to the on-disk DB if no state is found. Fixes #114

freesig commented 1 month ago

Is the worry here anything other than performance? This is not something we currently provide an API for. If we do, it will probably be unavailable on the deployed nodes anyway, so I'm not sure performance really matters, or that this is really that much slower than the other, more complex approach.

The cascade approach lets us use any SQL queries we write in the future for free. It's equivalent to a transaction: when we write a block to the on-disk DB it first goes into an in-memory transaction. The only difference is that this one lives a bit longer, so the memory is held a bit longer. The only other overhead is creating the tables, but I really don't think that's very slow, and it's constant overhead that isn't linked to the block size.
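A minimal sketch of the read path described above, assuming a `rusqlite`-style connection per database; the names (`CascadeDb`, `query_state`, and the `state` table schema) are illustrative, not the PR's actual API:

```rust
use rusqlite::{Connection, OptionalExtension};

/// Illustrative cascading store: dry-run writes land in `memory`,
/// reads fall through to `disk` when the key isn't found in memory.
struct CascadeDb {
    memory: Connection, // discarded once the dry run finishes
    disk: Connection,   // the node's real DB; never written during a dry run
}

impl CascadeDb {
    fn query_state(&self, contract: &[u8], key: &[u8]) -> rusqlite::Result<Option<Vec<u8>>> {
        const SQL: &str = "SELECT value FROM state WHERE contract = ?1 AND key = ?2";
        // Check the in-memory overlay first.
        if let Some(value) = self
            .memory
            .query_row(SQL, (contract, key), |row| row.get(0))
            .optional()?
        {
            return Ok(Some(value));
        }
        // Cascade to the on-disk database if no state was found.
        self.disk
            .query_row(SQL, (contract, key), |row| row.get(0))
            .optional()
    }
}
```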

The builder validation would not work for the validation stream, so if we create another crate for validation it's only going to be shared between the dry run and the builder. Then we'd have to maintain two separate validation functions within the node, which really need to have the same behavior.

I think unless we have evidence that this approach is drastically slower, it's not worth the extra complexity vs this very simple implementation. The other path this leaves open is the ability to switch to a SQL DB on disk that gets cleaned up after validation. For very large blocks on resource-constrained devices this might be pretty useful.

mitchmindtree commented 1 month ago

> The builder validation would not work for the validation stream

Partly what I'm suggesting is that it could work for the validation stream too.

We already read the whole block out just to iterate over the solutions for the node during validation - it'd be trivial to convert the block currently being validated into a `Mutations` instance and then re-use the same solution-validation approach between the node, dry-run and builder.
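For concreteness, a hypothetical sketch of that suggestion: fold a block's solutions into one in-memory overlay and hand it to the shared validation path. `Mutations`, `Solution`, and the field names here are stand-ins, since the thread doesn't pin down the actual types:

```rust
use std::collections::HashMap;

type Key = Vec<u8>;
type Value = Vec<u8>;

/// Stand-in for the builder's in-memory overlay of proposed state changes.
#[derive(Default)]
struct Mutations(HashMap<(Key, Key), Value>);

/// Stand-in for a solution carrying (contract, key, value) mutations.
struct Solution {
    mutations: Vec<(Key, Key, Value)>,
}

impl Mutations {
    /// Fold a whole block's solutions into one overlay so the node could
    /// validate them with the same code path the builder already uses.
    fn from_block(solutions: &[Solution]) -> Self {
        let mut m = Mutations::default();
        for s in solutions {
            for (contract, key, value) in &s.mutations {
                m.0.insert((contract.clone(), key.clone()), value.clone());
            }
        }
        m
    }
}
```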

> I think unless we have evidence that this approach is drastically slower, it's not worth the extra complexity vs this very simple implementation.

My concern is mostly about having two separate implementations of almost the same thing in both the builder and the node. My suggestion was mostly aimed at reducing complexity by avoiding the extra ~200 lines here and doing a refactor to share validation logic with the builder.

> The other path this leaves open is the ability to switch to a SQL DB on disk that gets cleaned up after validation. For very large blocks on resource-constrained devices this might be pretty useful.

If this is something we want to support, it does rule out using the builder's approach.

> The cascade approach lets us use any SQL queries we write in the future for free. It's equivalent to a transaction

I get the idea behind having a generalised transaction-like type around the connection pool, with another in-memory connection pool, and how it offers the freedom to add more SQL queries easily. I'm just not sure this is something we want to plan or optimise for when the only thing we currently use the DB for during validation of a block's solutions is querying state, and afaik we don't have any plans for adding more?

freesig commented 1 month ago

> Partly what I'm suggesting is that it could work for the validation stream too.

> We already read the whole block out just to iterate over the solutions for the node during validation - it'd be trivial to convert the block currently being validated into a `Mutations` instance and then re-use the same solution-validation approach between the node, dry-run and builder.

This would involve creating the `Mutations`, which is not needed for the validation stream. It would also double the amount of memory used for mutations, because they'd be held in memory both as part of the solutions and as the `Mutations`. Also, for really large blocks or resource-constrained devices there's actually no reason to pull the entire block into memory: we could just pull batches of solutions, or even parts of a solution if we have one big solution. If we take the `Mutations` approach then we could build it up incrementally in memory as we validate solutions, but we'd still need the whole thing in memory by the last solution. If we have one big solution then we'd have to have all the `Mutations` in memory to begin with.
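A minimal sketch of that batched alternative, assuming a `rusqlite` connection and a made-up `solutions(block, ix, solution)` table; none of these names come from the node's real schema:

```rust
use rusqlite::Connection;

/// Stream a block's solutions out of the DB in fixed-size pages, so only
/// one batch needs to be resident in memory at a time.
fn for_each_solution_batch(
    conn: &Connection,
    block: i64,
    batch: usize,
    mut f: impl FnMut(Vec<Vec<u8>>) -> rusqlite::Result<()>,
) -> rusqlite::Result<()> {
    let mut offset = 0usize;
    loop {
        let mut stmt = conn.prepare_cached(
            "SELECT solution FROM solutions WHERE block = ?1 \
             ORDER BY ix LIMIT ?2 OFFSET ?3",
        )?;
        let rows: Vec<Vec<u8>> = stmt
            .query_map((block, batch as i64, offset as i64), |row| row.get(0))?
            .collect::<Result<_, _>>()?;
        if rows.is_empty() {
            return Ok(()); // no more solutions in this block
        }
        offset += rows.len();
        f(rows)?; // validate this batch before pulling the next
    }
}
```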

> My concern is mostly about having two separate implementations of almost the same thing in both the builder and the node. My suggestion was mostly aimed at reducing complexity by avoiding the extra ~200 lines here and doing a refactor to share validation logic with the builder.

I don't really think this is a big problem. It's only 200 lines, and it could probably be condensed if we used traits; I can look into reducing the size of this implementation. I think it's pretty reasonable to have two different implementations, as the builder has pretty different priorities from dry runs. I think it's going to be pretty rare that devs will be dry running a large block; they are probably just validating a couple of solutions. If they really need the speed (say, for a solver) they could use the builder directly or even write a custom function that suits them better.

freesig commented 1 month ago

Just added https://github.com/essential-contributions/essential-node/pull/121 as an alternative. It uses traits instead of enums. I'm not sure which is better.

freesig commented 1 month ago

FYI, related issue: https://github.com/essential-contributions/essential-node/issues/123