spacemeshos / api

Protobuf implementation of the Spacemesh API
MIT License

Redesign layer status to match new consensus mechanisms #144

Open lrettig opened 3 years ago

lrettig commented 3 years ago

Currently a layer can have one of three statuses:

https://github.com/spacemeshos/api/blob/105249951c66561cfc52195433ecae9cd5a121ff/proto/spacemesh/v1/types.proto#L112-L116

These statuses no longer map to how our consensus mechanisms actually work. Here's a better, more accurate design:

We may not want to surface all of these possible statuses via the API, and this list is not precisely MECE as there is some overlap, but it's reasonably comprehensive.

Related: https://github.com/spacemeshos/go-spacemesh/issues/2403

avive commented 3 years ago

We also need to think about transaction statuses. It is my understanding that while in self-healing, no other data is canonical until the self healing is complete. So a transaction in a block which is in a layer that is healing will also need to have a tentative state: perhaps it is healing, or perhaps it is tentative. We need to carefully consider the minimum new set of possible states that gives users a clue regarding the state of the network without having too many states, as these are very confusing even for technical people. And the states need to cover all mesh entities, not just layers.

lrettig commented 3 years ago

To be clear, transactions obviously do not have an independent status - they derive their status from the status of their block and layer.

while in self-healing, no other data is canonical until the self healing is complete

What makes self-healing complex, in this context, is that it can invalidate a previously valid block (or vice-versa). So we could have blocks (and transactions) that are "approved" and applied to state, then reverted later. That's why I suggested introducing a "final" status, but we'll have to discuss with @tal-m the threshold beyond which we could apply this.
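As a rough illustration of the two points above (a transaction's status is derived from its block and layer, and self-healing can flip a block's validity later), here is a minimal Go sketch. The `Status`, `Block`, `Layer` types and the `TxStatus` function are hypothetical names made up for this example; they are not taken from the Spacemesh API or go-spacemesh.

```go
package main

import "fmt"

// Status is a hypothetical stand-in for a layer status; names are illustrative only.
type Status int

const (
	StatusUnspecified Status = iota
	StatusPending
	StatusApproved
	StatusConfirmed
)

// Block is a hypothetical block: a validity flag plus the IDs of its transactions.
type Block struct {
	Valid bool
	TxIDs []string
}

// Layer is a hypothetical layer: one status shared by everything inside it.
type Layer struct {
	Status Status
	Blocks []Block
}

// TxStatus derives a transaction's status from its layer. The transaction only
// counts if it appears in at least one valid block; otherwise ok is false.
func TxStatus(l Layer, txID string) (status Status, ok bool) {
	for _, b := range l.Blocks {
		if !b.Valid {
			continue
		}
		for _, id := range b.TxIDs {
			if id == txID {
				return l.Status, true
			}
		}
	}
	return StatusUnspecified, false
}

func main() {
	layer := Layer{
		Status: StatusApproved,
		Blocks: []Block{{Valid: true, TxIDs: []string{"tx1"}}},
	}
	fmt.Println(TxStatus(layer, "tx1"))

	// Assumption for illustration: self-healing later invalidates the block,
	// so the same transaction no longer has a derived status.
	layer.Blocks[0].Valid = false
	fmt.Println(TxStatus(layer, "tx1"))
}
```

The second call shows why a status that was once reported for a transaction cannot be treated as permanent while its block can still be invalidated.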

avive commented 3 years ago

We need to refine this and find a minimal MECE set. For example, why do we need unspecified if we have pending? Obviously we need to find a balance between being descriptive and informative and not confusing users with too many states. I think 7 is the magic number here: above it, most people will find the states just overwhelming and overly complex. For example, if stuck is a temporary possible state then it can also be pending. One thing to consider is to treat all of the proposed states above as pending until verified by tortoise, and maybe provide more detailed hare-related status in the debugging API service.

Here's a minimalistic proposal for 3 high-level states for layer, block and tx (same states for all 3 entities):

lrettig commented 3 years ago

why do we need unspecified if we have pending?

This is a quirk of how gRPC works (and Go) - there needs to be a default value other than pending so we know whether or not that value has been initialized correctly. It doesn't need to be exposed to the user (if it is, that's a bug).
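A small Go sketch of why the zero value matters: in proto3 the first enum value must be 0 and is what an unset field reads as, and a Go integer field that is never assigned is also 0. The type and function names below are hypothetical and hand-written for this example, not generated code from the API.

```go
package main

import "fmt"

// LayerStatus is a hand-written stand-in for a generated enum type.
type LayerStatus int32

const (
	// Zero is reserved for "nobody set this field".
	LayerStatusUnspecified LayerStatus = 0
	LayerStatusPending     LayerStatus = 1
)

// Layer is a hypothetical message with a status field.
type Layer struct {
	Number uint32
	Status LayerStatus
}

// IsStatusSet reports whether the status was explicitly assigned.
// If PENDING were 0 instead, an unset field would be indistinguishable
// from a layer that really is pending.
func IsStatusSet(l Layer) bool {
	return l.Status != LayerStatusUnspecified
}

func main() {
	var l Layer // Status was never assigned, so it holds the zero value
	fmt.Println(IsStatusSet(l)) // false

	l.Status = LayerStatusPending
	fmt.Println(IsStatusSet(l)) // true
}
```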

avive commented 3 years ago

So how about:

enum LayerStatus {
    LAYER_STATUS_UNSPECIFIED = 0; // unknown
    LAYER_STATUS_PENDING = 1;     // not yet approved or confirmed
    LAYER_STATUS_APPROVED = 2;    // approved by hare
    LAYER_STATUS_VERIFIED = 3;    // approved by tortoise
    LAYER_STATUS_CONFIRMED = 4;   // confirmed by tortoise and state applied
}

So each state adds confidence compared to the one before it, and the last one is the maximum level of confirmation we have in our system. We still have the open question of whether a verified layer can move back to pending due to self-healing.
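Because the proposed values increase with confidence, a client could compare them numerically. The Go sketch below mirrors the enum values from the proposal; the `AtLeast` and `IsRegression` helpers are hypothetical and only illustrate the ordering idea and the open self-healing question, not an agreed design.

```go
package main

import "fmt"

// LayerStatus mirrors the numeric values of the proposed enum.
type LayerStatus int32

const (
	LayerStatusUnspecified LayerStatus = 0
	LayerStatusPending     LayerStatus = 1
	LayerStatusApproved    LayerStatus = 2
	LayerStatusVerified    LayerStatus = 3
	LayerStatusConfirmed   LayerStatus = 4
)

// AtLeast reports whether s carries at least as much confidence as threshold.
// UNSPECIFIED never satisfies any threshold because it means "unknown".
func (s LayerStatus) AtLeast(threshold LayerStatus) bool {
	return s != LayerStatusUnspecified && s >= threshold
}

// IsRegression reports whether moving from prev to next lowers confidence,
// which is what would happen if self-healing pushed a verified layer back to pending.
func IsRegression(prev, next LayerStatus) bool {
	return next < prev
}

func main() {
	fmt.Println(LayerStatusVerified.AtLeast(LayerStatusApproved))        // true
	fmt.Println(LayerStatusPending.AtLeast(LayerStatusApproved))         // false
	fmt.Println(IsRegression(LayerStatusVerified, LayerStatusPending))   // true
	fmt.Println(IsRegression(LayerStatusApproved, LayerStatusConfirmed)) // false
}
```

If regressions turn out to be possible, clients would need to handle a status going down as well as up.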

lrettig commented 3 years ago
avive commented 3 years ago
lrettig commented 3 years ago

Discussed this with @tal-m today: regarding "final", we have no explicit finality. Finality will be implicit, subjective, and probabilistic, as in Bitcoin. So I think we can drop this status.

avive commented 3 years ago

So after thinking more about this, maybe we go with these high-level layer (and transaction) statuses:

enum LayerStatus {
    LAYER_STATUS_UNSPECIFIED = 0; // unknown
    LAYER_STATUS_PENDING = 1;     // not yet approved or confirmed
    LAYER_STATUS_APPROVED = 2;    // approved by hare
    LAYER_STATUS_VERIFIED = 3;    // approved by tortoise
    LAYER_STATUS_CONFIRMED = 4;   // confirmed by tortoise and state applied for txs in the layer
}

and have additional sub-statuses regarding hare in a lower-level API such as debuggingServices if needed for tests.

lrettig commented 3 years ago

Add invalid to the list and I will agree with you :)

avive commented 3 years ago

Add invalid to the list and I will agree with you :)

How is it different from LAYER_STATUS_UNSPECIFIED?

lrettig commented 3 years ago

Add invalid to the list and I will agree with you :)

How is it different from LAYER_STATUS_UNSPECIFIED?

I explained here. Individual blocks can be invalidated by hare or by tortoise. An entire layer can also be invalidated, e.g., if hare fails completely for that layer, which means that all of the blocks in the layer are marked invalid. Technically we can "verify" or "confirm" an empty layer, so I guess maybe we don't need a separate INVALID status. Do we need an EMPTY status? It can be implied by the nonexistence of any block data in the layer, as long as downstream clients know how to interpret and display empty layers.
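To make the "implied" option concrete, here is a minimal Go sketch of how a downstream client might tell an empty layer apart from one whose blocks were all invalidated, without a separate EMPTY or INVALID status. The `Block` type, the `LayerContent` values, and `Classify` are hypothetical names invented for this example.

```go
package main

import "fmt"

// Block is a hypothetical block record as a client might see it.
type Block struct {
	ID    string
	Valid bool
}

// LayerContent is a client-side classification, not an API status.
type LayerContent int

const (
	ContentEmpty      LayerContent = iota // no block data at all
	ContentAllInvalid                     // blocks exist, but every one is marked invalid
	ContentHasValid                       // at least one valid block
)

// Classify infers what a layer contains purely from its block data.
func Classify(blocks []Block) LayerContent {
	if len(blocks) == 0 {
		return ContentEmpty
	}
	for _, b := range blocks {
		if b.Valid {
			return ContentHasValid
		}
	}
	return ContentAllInvalid
}

func main() {
	fmt.Println(Classify(nil) == ContentEmpty)
	fmt.Println(Classify([]Block{{ID: "a", Valid: false}}) == ContentAllInvalid)
	fmt.Println(Classify([]Block{{ID: "a", Valid: false}, {ID: "b", Valid: true}}) == ContentHasValid)
}
```

This only works if every client applies the same interpretation, which is the trade-off against adding an explicit status.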