ethereum / cbc-casper

GNU Affero General Public License v3.0
228 stars 44 forks source link

Restructure Codebase #175

Closed naterush closed 6 years ago

naterush commented 6 years ago

Issue

The codebase is not currently optimized for readability and extensibility. There are currently multiple classes that do the exact same work (or no work at all) like test langs, simulation runner, and protocol. In general, would be nice if we could crunch all of these into one - which would make the codebase much more understandable and comprehensible. Yay!

Proposed Implementation

Overview

For each protocol, create a XProtocol class (for example, BlockchainProtocol). Note that this is different than the current BlockchainProtocol class that exists today.

The protocol class is instantiated with a JSON object that specifies the execution of the entire protocol. Here's an example JSON object for a concurrent schedule replication:

{
    "protocol": "concurrent",
    "config": {
        "validators": [10, 11],
        "starting_outputs": [123, 1234, 12345],
        "genesis_estimate": [123],
        "select_outputs": "random",
        "create_outputs": "random"
    },
    "execution": {
                 "msg_per_round": 2,
        "execution_string": "M0-A M1-B"
    }
}

Obviously, each protocol would have a required set of fields, as well as validity conditions about the JSON object (reasonable things like len(initial_estimates) = len(validators)). See at end of post for spec of each JSON object. The goal is that the entire execution is totally deterministic from the JSON object.

This can then be run with protocol.execute(), which essentially consumes the execution_string, printing/making GIFs where appropriate.

Displayz

Each XProtocol class take three (optional) other than the setup string: interval, display, save. interval defaults to if not specified, and display and save both default to False.

The interval is measured in rounds where one round is msg_per_round number of messages being created. If display == True, interval == 3, and msg_per_round = 1, then a viewgraph will appear after every 3 messages are created. This makes it simple to display viewgraphs in cases like full message propagation: msg_per_round = len(validators).

Thus, depending on the message default being used, msg_per_round changes: (random, 1) (rrob, 1) (full, len(val_set)), etc.

Initial messages

Currently, there is some weirdness is how initial messages are created and dealt with. The new proposed method is essentially adding a method add_initial_messages to each view (or validator), and removing the instantiation of view with messages.

The reasoning for this is that a validator's view is created when the validator is, but some initial message schemes require all of the validators to be created before creating the initial messages, so we must treat these as two separate steps.

It looks like this:

validator_set = ValidatorSet(...)
initial_messages = self.get_initial_messages(validator_set)
for validator in validator_set:
   validator.set_initial_messages(initial_messages[validator])

It might make sense to push the set_initial_messages functions down to the view only, and not have the validator have access to this. Not sure what is best here.

The benefits to this are simple: each protocol can now do initial messages in its own way. Sharding has multiple genesis messages (one for each shard), and deals with this in a strange manner currently. This new approach would resolve this.

execution_string

Now, we must specify how to generate execution strings. This is a bit more complicated than it initially looks; consider how to handle different networks as we currently have.

However, I think we can essentially 'rip out' our existing message generation and networking stuff, to create some interesting ExecutionGenerator classes. Some of these could be RandomExecutionGenerator, RrobExecutionGenerator, etc (these are essentially the message generator classes that exist today.

Note: each of these classes has knowledge of how long a "round" is for itself. This is how we get the round number described above that is used when displaying (it can return this along w/ the execution string).

Furthermore, we must be able to specify networks that these classes can use to figure out when messages actually get delivered. However, this is fairly easy; because we can have the protocol string be generated round by round (some type of loop makes the most sense anyways), we can have messages, upon creation, check a delay function for how many rounds later they should be delivered.

For example (this is an example of full message propagation, but you get the idea)

execution_string = ''
deliver_on_round = dict()

for round in num_rounds:
   for creator in num_validators:
      message_name = self.get_new_message_name()
      new_message = 'M' + str(creator) + '-' + message_name + ' '
      execution_string += new_message

      for receiver in num_validators:
         delay = self.get_delay(round, creator, receiver) # include the round so delay can change over time

         if not deliver_on_round[round + delay]:
             deliver_on_round[round + delay] = []
          deliver_on_round[round + delay].append('S' + str(receiver) + '-' + message_name  + ' ')

     for send_message in deliver_on_round[round]:
        execution_string += send_message

Thus, we can carry over the same delay functions that we have currently to get the effect of the network!

Note that we can also store a max_round variable, and require that all messages are delivered before this max round.

Misc.

protocol.execute() takes an optional string. If you call protocol.execute(s) then protocol.execute(t) then you would be left at the state gotten to via json_string + s + t.

Conclusion

The benefits of this approach:

JSON objects for different protocols

Note that throughout each of these protocols, reasonable rules apply to the execution strings: you can’t send messages before they exist, and the creator of a message must be a real validator.

Binary

{
    "protocol": “binary”,
    "config": {
        "validators": [10, 10],
        "initial_estimates": [0, 1],
    },
    "execution": {
                 "msg_per_round": 2,
        "execution_string": "M0-A M1-B"
    }
}

Integer

{
    "protocol": “integer”,
    "config": {
        "validators": [10, 11]
        "initial_estimates": [100, 6],
    },
    "execution": {
                 "msg_per_round": 2,
        "execution_string": "M0-A M1-B"
    }
}

Order

{
    "protocol": “order”,
    "config": {
        "validators": [10, 11],
        "initial_estimates": [
            [“cow”, “pig”, “mouse”, “rat”],
            [“pig”, “cow”, “mouse”, “rat”]
        ],
    },
    "execution": {
                 "msg_per_round": 2,
        "execution_string": "M0-A M1-B"
    }
}

Blockchain

{
    "protocol": “blockchain”,
    "config": {
        "validators": [10, 11],
    },
    "execution": {
                 "msg_per_round": 2,
        "execution_string": "M0-A M1-B"
    }
}

Concurrent Schedule

{
    "protocol": "concurrent",
    "config": {
        "validators": [10, 11],
        "starting_outputs": [123, 1234, 12345],
        "genesis_estimate": [123],
        "select_outputs": "random",
        "create_outputs": "random"
    },
    "execution": {
                 "msg_per_round": 2,
        "execution_string": "M0-A M1-B"
    }
}

Sharded blockchain

{
    "protocol": “sharded”,
    "config": {
        "validators": [10, 11],
        “num_shards”: 2,
        "val_select_shards”: [“random”, “random”]
    },
    "execution": {
                 "msg_per_round": 2,
        "execution_string": "M0-A M1-B"
    }
}

In the future, we can add:

"validators_shards”: [
            [“”, “0”],
            [“”]
        ],

which a) must have shard ids for the number of shards, and b) has validators go through very specific state transitions. This can be used with some val_select_shards rules.

drewstone commented 6 years ago

How are you currently stepping through time? If there's a well-defined grid step, you could give each validator a validator_delay and at each step check if current_grid_step >= v.validator_delay. If so, poll its next message, etc.

Sample validator delays from some arbitrary distribution (uniform) over the [min,max] you want.

drewstone commented 6 years ago

You also (maybe?) could add the estimator.estimate logic into the each XView's class. Since it seems all you're doing in each view is importing that function defined in a separate file.

djrtwo commented 6 years ago

I think this is a good approach and agree that it will clean up the codebase and testing. Some initial thoughts. Sorry for the poor organization. Let's hop on a call sometime in the next couple of days.

Bigger design notes/questions:

Minor notes:

drewstone commented 6 years ago

@naterush also correct me if I'm wrong but it looks like there is a single protocol view that all validators keep track of. This means that all validators always have the same exact view. Do you have plans to let each validator have a potentially unique view of the chain state?

djrtwo commented 6 years ago

@drewstone This is already implemented.

Network keeps track of a global_view that we use for gathering data about all messages, but each validator stores their own view in which they can only see messages that they have received and can only make judgements from such.

Sometimes the output/viewgraphs is just showing us info from the global view so we can get a global perspective on the network.

naterush commented 6 years ago

Thanks for the great feedback. I've updated the post above to consider most of these comments.

I think that a lot of the complexity described above was a result of not having rounds (which it turns out - though they seem unnecessary, are fairly necessary). As such, I've changed the above description to have rounds again; the key here is that we only really need them when a) creating the execution string, and b) calculating when to display (if we are displaying something). Reintroducing them seems to solve all the problems, but let me know if you see any other issues.

Also, most of the minor notes seem reasonable to me. I've justified some decisions (like the separate set_initial_messages function) in the original post.

naterush commented 6 years ago

There's one more thing I want to make sure we can handle, but I'm not exactly sure how to do it. Essentially: we need to control how initial messages are propagated.

For example: currently, we send all initial messages to everyone at the start of a binary protocol, but this leads to all messages have the same estimate for the rest of time (b/c everyone has seen everything), which isn't very interesting.

So we need more control over the execution over how these initial messages are propagated. There are a couple solutions here:

  1. For some protocols, enforce that the execution strings starts with "M0-A M1-B M2-C..." as a validity condition (an error will be thrown if it doesn't start with this, for say, binary). A, B, C... are the names of the initial messages for each validator. The issue with this is that it the requires special cases in the protocol execution string generator.
  2. Automatically name each initial message from the validators, but don't include them in the execution string. Note that this still requires special cases in the execution string generator string still (we have to send those initial messages still), but we lose the clarity of actually being able to see this in the execution string.
  3. Do what we are doing currently, and send all initial messages to all validators before running the execution string. I personally don't like this solution b/c it leads to protocols like binary, list, and order all having super boring executions where everyone has the same estimate the whole time.

Personally, I think 1 is the cleanest solution. Any other thoughts/suggestions are greatly appreciated.

naterush commented 6 years ago

So I figured out a way to actually handle the special case in a general manner, without doing much extra work.

Essentially: the first time a validator 0 sends a message to validator 1, they send it in a justified manner (by including all the other blocks they sent as well).

djrtwo commented 6 years ago

Rounds in string generators with passing some info along with the execution string (length of round) is a good trade off :)

Notes:

djrtwo commented 6 years ago

As for your fix on the sending of initial messages..

Are you proposing that when Val-0 (whose initial message was A) sends its first message (B) to Val-1, that both A and B are attempted to be propagated to Val-1? Are you proposing that:

naterush commented 6 years ago

@djrtwo I am suggesting the first. Essentially, we insist that the first time Val-0 sends to Val-1 (say they are sending message B), we use SJ1-B rather than S1-B.

naterush commented 6 years ago

I think that sending in the length of the round is a nice solution - as well as leaving display and save out. I'll update the spec with this now.

Agree on defaulting to one - especially in the case of more complex strategies.

naterush commented 6 years ago

One final change: I'm proposing some changes to the testing language so it's more extensible for any possible future protocol executions we might want.

There are two base commands: M, S, where M is making a new message and S is sending an existing message. Furthermore, there is the secondary command SJ, which note, can be decomposed to the just S commands.

M-0-A means Val-0 make a new message called A. S-1-A means send Val-1 message A. SJ-1-A means send Val-1 message A, as well as all messages in A's justification.

Other validity conditions:

Furthermore, we also add the following . The base make command is M-0-A (for example), but we also have M-0-A-(), where () can contain any arbitrary information about the new message.

For example, we could have M-0-A-(J-{B,C,D}), where this command essentially says create a message called A that just has the messages B, C, and D in its justification.

The regex for matching these commands is: ([A-Za-z]*)([-]*)([0-9]*)([-]*)([A-Za-z0-9]*)([-]*)([(){A-Za-z0-9,}]*)

djrtwo commented 6 years ago

Does that mean make block F forgetting that I made block E? Or is it making block F explicitly with E as estimate?

On Tuesday, March 20, 2018, Nate Rush notifications@github.com wrote:

Last thing I'd like to add to the specification before we move forward: keeping the language as general as possible... will post on this in a day.

Side note: just figured out how to do an equivocation: M0-F(E-{A, B, C, D}), where the second part essentially says that it's an equiocation, and gives a set of messages are are included in the justification of the new message.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/ethereum/cbc-casper/issues/175#issuecomment-374465272, or mute the thread https://github.com/notifications/unsubscribe-auth/ABXf-0GWrrjx-OMQzkN3wS29pdEfd1foks5tgH2TgaJpZM4Smr4x .

naterush commented 6 years ago

@djrtwo updated with explanation of new commands.

naterush commented 6 years ago

Merged!