tendermint / spec


PBTS: association between timely predicate and timeout_commit #370

Closed cason closed 2 years ago

cason commented 3 years ago

With PBTS, a value proposed for the first time can only be accepted when it is timely. Checking this requires comparing the proposal time with the time at which a validator receives the proposal. One of the rules, which rejects proposal times too far in the past, assumes a maximum propagation delay MSGDELAY for Proposal messages.

In the original PBTS specification, the MSGDELAY parameter, together with ACCURACY, was also used to compute a minimum duration for the propose round step timeout. The rationale is that a validator should not time out, and thus prevote nil, while a timely proposal could still be received.

Several internal discussions took place after this specification was compiled, and some preliminary decisions are not in line with this association between the timeout_propose and MSGDELAY parameters.

A first reason is the decision to evaluate the timely predicate upon receiving a Proposal message, not upon receiving the full proposed value, carried by BlockPart messages. While the timeout_propose duration should encompass the maximum delay for receiving the full proposed value/block, MSGDELAY only considers the maximum delay for a Proposal message.

A second reason is that MSGDELAY is a parameter added to the proposal time field, i.e., its timestamp. Whether a process enters a round before or after the proposer does, in terms of real time, is therefore not reflected by this parameter. But this lack of synchronization between processes should be taken into account when configuring timeout_propose.

So, in particular while we consider MSGDELAY a fixed parameter that is not incremented when rounds of consensus fail, I propose to remove this association between timeout_propose and MSGDELAY.

cason commented 3 years ago

As mentioned in #371, MSGDELAY should be a very conservative estimate of the maximum propagation time for a Proposal, violated only when the proposer deliberately assigns timestamps further in the past.

Using this conservative parameter as a minimum duration for timeout_propose could then have bad implications for performance.

williambanfield commented 3 years ago

@josef-widder @cason I'm not certain that this was resolved. If it was, can either of you post the outcome to this issue? As far as I can tell, we need to ensure that the validators wait at least as long as an honest proposer does, which would imply a relationship between timeout-propose and the waitingTime value. Perhaps we want to make this implicit or think that waitingTime > timeout-propose will be true so rarely that it's not worth including in the spec.

> A second reason is that MSGDELAY is a parameter added to the proposal time field, i.e., its timestamp. Whether a process enters a round before or after the proposer does, in terms of real time, is therefore not reflected by this parameter. But this lack of synchronization between processes should be taken into account when configuring timeout_propose.

From this, I'm gathering that MSGDELAY does not account for the fact that two processes may enter a round at very different times, so this difference is not included in the inequalities involving MSGDELAY. The difference between the times at which two processes enter a round is still bounded by timeout-propose. Is that roughly correct?

josef-widder commented 3 years ago

> So, in particular while we consider MSGDELAY a fixed parameter that is not incremented when rounds of consensus fail, I propose to remove this association between timeout_propose and MSGDELAY.

@cason, are you saying that if we decide on adaptive timeouts in https://github.com/tendermint/spec/issues/371 we will need to re-introduce the relation of message delay and timeout?

cason commented 3 years ago

@williambanfield, I don't consider it solved. I think that the previous specification does not solve the (possible) problem we have, as it does not take into account some of the considerations above. I also think that, in the absence of further changes, the timeout and timely mechanisms would not interfere with each other.

cason commented 3 years ago

> I'm gathering that MSGDELAY does not account for the time at which two processes may enter a round

No, it should not. It is the maximum end-to-end transmission delay for a regular consensus message. Maybe I should make this more explicit in the specification.

> The different times that two processes may enter a round is still bounded by timeout-propose. Is that roughly correct?

Yes, that's why I think they are essentially distinct. Maybe they are not independent, and this is the discussion I foresee for this issue.

cason commented 3 years ago

@josef-widder: the short answer is no, I don't plan to "reassociate" them. But if some relevant scenario that justifies their association emerges from this discussion, I am open to reconsidering it.

cason commented 2 years ago

Closing this issue as it seems established that these parameters should not be directly associated.

Just as a last comment, when rewriting the problem statement I realized that while we can assume that the delivery of a Proposal occurs within MSGDELAY time units, we cannot affirm that this will actually happen, at least before GST (Global Stabilization Time). The reason is that correct processes might not be in the same round as the Proposal: they may join the round later, or may have left it (or moved to the prevote step) before the Proposal is received. In addition, we cannot ensure that a message sent by a correct process is received by all correct processes, at least before GST.
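A deliberately simplified Go model of this last point, for illustration only. The `validatorState` type, the `step` constants, and `canProcessProposal` are hypothetical names, and the model ignores anything a real implementation might do with messages for other rounds. It only shows that arrival within MSGDELAY is not sufficient: the receiver must also be in the propose step of the proposal's round.

```go
package main

import "fmt"

// step is a hypothetical enumeration of the round steps.
type step int

const (
	stepPropose step = iota
	stepPrevote
	stepPrecommit
)

// validatorState is a hypothetical snapshot of where a validator is when a
// Proposal message arrives.
type validatorState struct {
	Round int
	Step  step
}

// canProcessProposal reports whether, in this simplified model, a validator
// in state s would evaluate a Proposal for proposalRound at all. It must be
// in that same round and still in the propose step.
func canProcessProposal(s validatorState, proposalRound int) bool {
	return s.Round == proposalRound && s.Step == stepPropose
}

func main() {
	const proposalRound = 1

	fmt.Println(canProcessProposal(validatorState{Round: 1, Step: stepPropose}, proposalRound)) // in round, waiting for the proposal
	fmt.Println(canProcessProposal(validatorState{Round: 0, Step: stepPrevote}, proposalRound)) // has not joined the round yet
	fmt.Println(canProcessProposal(validatorState{Round: 1, Step: stepPrevote}, proposalRound)) // already timed out and moved to prevote
	fmt.Println(canProcessProposal(validatorState{Round: 2, Step: stepPropose}, proposalRound)) // already left the round
}
```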