Cardinal-Cryptography / AlephBFT

Rust implementation of Aleph consensus protocol
https://cardinal-cryptography.github.io/AlephBFT/index.html
Apache License 2.0
53 stars 28 forks source link

A0-4153: Factor unit validation out of runway #429

Closed timorleph closed 5 months ago

timorleph commented 5 months ago

This required quite a lot of other changes as well, so this is quite large. It also looks... not ideal and in places more convoluted than beforehand, but completely untangling all this in a single commit would require it to be even larger.

github-actions[bot] commented 5 months ago

Please make sure the following happened

woocash2 commented 5 months ago

This PR is very large. It would be very helpful if you included a thorough description of what happens and some guides for the reviewers in the PR description e.g.

Some of the above may be irrelevant, but just to give a picture of what I would be happy to see. Since the PR is so big, the review time will also be long, but a thorough description would speed it up 3x :stuck_out_tongue:

timorleph commented 5 months ago

Ah, fair enough.

As the title implies the main goal of this refactor was to get unit validation logic out of Runway. The basic validation was already contained in units::Validator, but the rest, i.e. fork detection and handling, was now moved into validation::Validator, which should be the main focus of this review (plus the changes in Runway). This required several more changes to get to work, plus I made a couple possibly strictly speaking unnecessary changes of things that kept getting in my way and it was easier to fix them rather than keep tripping over them.

Perhaps the most important additional change is the complete rewrite of units::Store. In fact its purpose is now very different, so much so, that it's probably better to just read the file rather than trying to look at the diff. In particular it no longer keeps explicit track of forks, that responsibility has been moved to validation::Validator. Since this store is completely fresh, the whole file is also an important target of review.

To make the new abstraction for the unit store work units had to be abstracted away, rather than transformed. This is perhaps the biggest difference in what is actually happening during unit processing. Before, units were saved after validation in Runway and only stripped information about them was sent on to reconstruction and later extension. Now, they are sent into reconstruction directly, without saving (or rather only saved in validation::Validator for fork detection purposes), where they end up being encapsulated with explicit parent information. Such encapsulated units are then saved in a Store in Runway, and still only stripped information is eventually sent to extender (this will likely change in future parts of this refactor though). In particular reconstruction now depends on a Unit trait, which was a struct before. This trait, and the related WrappedUnit, should also be reviewed.

This change in abstraction necessitated some further changes again, but most of them pretty trivial – most changes related to this in creation and reconstruction should at most be skimmed through. One change that was not trivial here though was moving the passing of units to extension from reconstruction to runway. This is also why extension moved from consensus to runway and why one of the Notification[In|Out] got deleted – I removed the second one for symmetry, as it was mostly confusing at this point. This is also why there are a couple more channels now, they pass through the, previously internal, variants from the Notification* types – this is what I referred to as "more convoluted" in the description above, and I'm already working on removing the complication in the next part of the refactor. In particular consensus will almost surely be merged with runway in the next PR (possibly under the former name?), since creation will also move.

A couple smaller changes, that perhaps could have been avoided:

  1. creation no longer returns explicit parents, as it became painfully obvious that these are not necessary.
  2. parent responses are passed to reconstruction with a HashMap – I think this was actually needed in some way because of the abstraction changes, but I no longer remember why, so I'm not sure; this will definitely be better when implementing the "more parents" task though.
  3. runway status reports had to be somewhat rewritten, but they kept most of their functionality
  4. tests (in testing) got quite impacted, I even removed one of them (it was kinda duplicated by others fortunately) – it should be sufficient to just skim through them though

Phew, that's quite a lot, if you still have questions I'm available for a call, just send me a DM.