This is a good suggestion. 👍 But the computational cost becomes very high if there are many validators.
> In the above steps, a single candidate may assume multiple roles. If you want to exclude such a case, you can remove the winning candidate from the `candidates`. In this case, the `thresholds` must be recalculated because the total `S` of the population changes.
As you mentioned above, the number of selected validators can be less than V+1. In this case, is the selection process repeated?
> But the computational cost becomes very high if there are many validators.
The amount of calculation depends on `V`, but it consists only of simple addition, random number generation and conditional branching (rather than hash computation). So I roughly expect the cost to be modest, e.g., less than 100ms for `V ≤ 10^5`.
> In this case, is the selection process repeated?
Yes, it is. If we follow the case where the configuration doesn't allow a node to have multiple roles, we 1) remove the selected validator from the candidates' list, 2) recalculate `S`, and 3) find the next winner, and repeat these `V` times. Therefore, with the cost of step 2) added, the computational complexity of the above will be about O(N×V) + O(N×(V+1)) ≒ O(2N×V) in the worst case.
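To make those three steps concrete, here is a minimal Scala sketch of the without-replacement variant. `Candidate` and `nextThreshold` (which should return a deterministic uniform value in `[0, S)` from the VRF-seeded PRNG) are names I introduce only for illustration; they are not part of the proposal.

```scala
case class Candidate(id: String, stake: Long)

// Select one Proposer and V Validators when a node may not hold multiple roles:
// 1) remove each winner from the pool, 2) recalculate S, 3) draw the next winner.
def selectWithoutReplacement(candidates: Seq[Candidate], v: Int,
                             nextThreshold: Long => Long): Seq[Candidate] = {
  var pool = candidates.sortBy(-_.stake)     // deterministic (descending stake) order
  val winners = Seq.newBuilder[Candidate]
  for (_ <- 0 to v) {                        // one Proposer + V Validators
    val s = pool.map(_.stake).sum            // 2) recalculate S
    val threshold = nextThreshold(s)         // uniform value in [0, S)
    var cumulative = 0L
    val winner = pool.find { c =>            // 3) linear scan for the next winner
      val hit = cumulative <= threshold && threshold < cumulative + c.stake
      cumulative += c.stake
      hit
    }.get
    winners += winner
    pool = pool.filterNot(_.id == winner.id) // 1) remove the winner from the pool
  }
  winners.result()
}
```

The per-winner `O(N)` scan plus the `O(N)` recomputation of `S` is what gives the `O(2N×V)` bound above.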
OK
And how about using a hash function like `sha256` instead of a random function?
I think it mainly depends on its speed. And there are several considerations when using a cryptographic hash function.

`Xorshift(vrf_hash || public_key)`.

My concern is that the random function can vary from platform to platform and with the programming language and compiler version. (https://en.wikipedia.org/wiki/Random_number_generation) So I suggest using the hash function.
Random sampling is better than my proposal because it requires less calculation. I have two comments about this proposal.

As my proposal suggests, equity should be considered in the case of a duplicate election. If a participant with a lot of stake is elected twice in a round, they are at a disadvantage in terms of rewards compared with someone who splits their stake as much as possible and runs multiple nodes.

* rewards of 10000 staking on 1 node < rewards of 10 staking on each of 1000 nodes, during heights h1~hn

So, in this case, a validator elected more than once must receive some additional reward, and the additionally elected validator should receive a partial reward for the validation job. (Please refer to https://github.com/line/tendermint/issues/17)
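As a rough back-of-the-envelope model of that concern (my assumption, not something stated in this thread: each round pays a flat reward r per elected node, with no extra reward for winning several of the V+1 seats):

```latex
% One node holding a stake fraction q = s/S of the total:
E[\text{reward}] \;=\; r\,\bigl(1 - (1 - q)^{V+1}\bigr) \;\le\; r

% The same stake split across m nodes of fraction q/m each:
E[\text{reward}] \;=\; m\,r\,\bigl(1 - (1 - q/m)^{V+1}\bigr) \;\xrightarrow{\;m \to \infty\;}\; r\,q\,(V+1)
```

Under that assumption, splitting the same total stake over more nodes never decreases the expected reward, which matches the 10000-staking/1-node vs 10-staking/1000-node example above.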
I'll think more about the reward policy and then make a separate issue.
I think we don't need to calculate this at every round.
```scala
// Scan the candidates in their fixed order and pick the first one whose cumulative
// stake range [cumulativeStakes, cumulativeStakes + c.stake) contains the threshold.
var cumulativeStakes = 0L
var proposer: Option[Candidate] = None
for (c <- candidates if proposer.isEmpty) {
  if (cumulativeStakes <= threshold && threshold < cumulativeStakes + c.stake) {
    proposer = Some(c)
  }
  cumulativeStakes += c.stake
}
```
The validators' order is fixed for a while if no staking tx is executed. So if we store the following data, we can easily verify a proposer or validator when receiving a proposal or vote.
order | candidate | staking | position |
---|---|---|---|
1 | c100 | 10000 | 0, 10000 |
2 | c068 | 9000 | 10000, 19000 |
3 | c021 | 8000 | 19000, 27000 |
... | ... | ... | ... |
78 | c010 | 1 | 99999, 100000 |
The total staking is 100000. This table is updated when a staking tx is executed.

Suppose I am a validator and someone proposes a block. I know the threshold and his position, so I can verify whether he is the right proposer without recalculating and sorting everything.
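A small sketch of that check, using a hypothetical `Position(from, until)` record for the stored `position` column above (the names are mine):

```scala
// Half-open cumulative stake range [from, until) stored per candidate,
// e.g. c100 -> Position(0, 10000), c068 -> Position(10000, 19000).
case class Position(from: Long, until: Long)

// Verify a received proposal: the sender is the right proposer iff the round's
// threshold falls inside the sender's stored range. No sorting or full
// cumulative-sum recalculation is needed.
def isRightProposer(positions: Map[String, Position], sender: String, threshold: Long): Boolean =
  positions.get(sender).exists(p => p.from <= threshold && threshold < p.until)
```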
@zemyblue
> My concern is that the random function can vary from platform to platform and with the programming language and compiler version.
The PRNG mentioned here means an algorithm defined by LINK 2 Network as its specification (rather than languages or external libraries such as `libc`). Typical PRNG algorithms, such as Xorshift, LCGs, and MT, produce the same results on all platforms, languages, and compiler versions as long as the algorithm and its parameters are the same.
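As an illustration only (this is not the algorithm specified by LINK 2 Network, and the seeding rule is my assumption), a deterministic xorshift64* generator seeded from the first 8 bytes of `vrf_hash` could look like this:

```scala
// xorshift64*: the same seed and parameters produce identical output on every
// platform, language, and compiler version, and each step is very cheap.
final class XorShift64Star(seed: Long) {
  private var state: Long = if (seed != 0L) seed else 1L  // state must be non-zero
  def nextLong(): Long = {
    state ^= (state >>> 12)
    state ^= (state << 25)
    state ^= (state >>> 27)
    state * 0x2545F4914F6CDD1DL
  }
}

// Hypothetical seeding: fold the first 8 bytes of vrf_hash into a Long.
def prngFromVrfHash(vrfHash: Array[Byte]): XorShift64Star =
  new XorShift64Star(vrfHash.take(8).foldLeft(0L)((acc, b) => (acc << 8) | (b & 0xFFL)))
```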
> If a participant with a lot of stake is elected twice in a round, they are at a disadvantage in terms of rewards compared with someone who splits their stake as much as possible and runs multiple nodes.
But it is not disadvantageous, because the voting power is set by the user's staking value; one node does not mean one unit of voting power. Therefore, rewards are given in proportion to the staking value.

But I'm also thinking about giving a candidate as many votes as the number of times it is elected. That means that if a user with 10 voting power is elected 2 times, the user's voting power becomes 2. But I am not sure this is good. I need to simulate more.
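If we went that way, counting seats per winner is all that changes; a one-line sketch (the winner IDs are illustrative):

```scala
// Alternative under consideration: a winner's voting power for the round equals
// the number of seats it won in the V+1 draws, rather than its staking value.
def votingPowerBySeats(winnerIds: Seq[String]): Map[String, Int] =
  winnerIds.groupBy(identity).map { case (id, seats) => id -> seats.size }

// e.g. votingPowerBySeats(Seq("c100", "c068", "c100")) == Map("c100" -> 2, "c068" -> 1)
```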
@egonspace
Duplicate election
This proposal doesn't address the incentive scheme, so it needs to be handled separately. We'll discuss this in #17.
A way to reduce computations
That suggestion is useful. We will be able to reuse the results of the categorical distribution until the next stake transaction is issued. If the stakes don't change often, we may be able to use an algorithm such as a binary tree, R-tree, or B-tree instead of a linear search. In this case, step 3, winner extraction, will be O(V log N).
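For example, if the cumulative `position` boundaries from the table above are kept in a sorted array (my layout assumption), each winner extraction becomes a binary search:

```scala
// boundaries(i) is the end of candidate i's cumulative stake range; for the table
// above it is [10000, 19000, 27000, ..., 100000]. The winner for a threshold is
// the first index whose boundary is strictly greater than that threshold.
def findWinner(boundaries: IndexedSeq[Long], threshold: Long): Int = {
  var lo = 0
  var hi = boundaries.length - 1
  while (lo < hi) {
    val mid = (lo + hi) / 2
    if (boundaries(mid) <= threshold) lo = mid + 1 else hi = mid
  }
  lo  // O(log N) per draw, hence O(V log N) for V+1 draws
}
```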
A problem with this algorithm was raised in the last open session.
> Random election cannot prevent some candidate from becoming a validator whose voting power exceeds its stake or, in the extreme, exceeds 1/3 of the total.
Here are a few points.
Strictly speaking, one node "accidentally" obtaining more than 1/3 (f+1) of the voting power is a slightly different issue from distributed Byzantine nodes "accidentally" obtaining more than 1/3 (f+1) of the validators. However, the former has a negligible probability compared to the latter, which will occur with noticeable frequency. Therefore, I believe it is reasonable to treat both in a common manner as the "accidental (expected) BFT Violation Problem".
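For intuition (my formulation, not part of the proposal): under the with-replacement sampling model described below, if Byzantine candidates hold a fraction β of the total stake, each of the V+1 seats is Byzantine independently with probability β, so the chance that they "accidentally" take more than 1/3 of the consensus group is a binomial tail:

```latex
P\Bigl[\text{Byzantine seats} > \tfrac{V+1}{3}\Bigr]
  \;=\; \sum_{k=\lfloor (V+1)/3 \rfloor + 1}^{V+1} \binom{V+1}{k}\,\beta^{k}\,(1-\beta)^{\,V+1-k}
```

The same formula with β replaced by a single candidate's stake fraction covers the first case, which is why it is comparatively negligible when that fraction is much smaller than β.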
The scheme of selecting a Proposer and Validators based on PoS can be considered as random sampling from a group with a discrete probability distribution.

* `S`: the total amount of issued stake
* `s_i`: the stake amount held by candidate `i` (`Σ s_i = S`)

Random Sampling based on Categorical Distribution
For simplicity, here is an example in which only a Proposer is selected from the candidates, with a winning probability of `p_i = s_i / S`.

First, create a pseudo-random number generator using `vrf_hash` as a seed, and determine the threshold `threshold` for the Proposer. This random number algorithm should be deterministic and portable to other programming languages, but need not be cryptographic.

Second, to make the result deterministic, we retrieve the candidates sorted in descending stake order.

Finally, find the candidate hit by the Proposer's arrow `threshold`.

This is a common way of random sampling according to a categorical distribution using a uniform random number. It is similar to throwing a dart at a spinning dartboard on which each item's width is proportional to its probability.
Selecting a Consensus Group
By applying the above, we can select a consensus group consisting of one Proposer and `V` Validators. This is equivalent to performing `V+1` categorical trials, which is the same as a random sampling model with a multinomial distribution. It's possible to illustrate this notion using a multinomial distribution demo I created in the past: it is equivalent to a model that selects a Proposer and Validators when `K` is the number of candidates and `n = V+1`.

As an example of intuitive code, I expand the categorical sampling to a multinomial one.
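A minimal sketch of that expansion (assuming a hypothetical `nextThreshold` that returns a deterministic uniform value in `[0, S)` from the VRF-seeded PRNG; `Candidate` is also illustrative):

```scala
case class Candidate(id: String, stake: Long)

// Expand the categorical trial into V+1 trials (a multinomial sample): the first
// winner becomes the Proposer and the remaining V become Validators. Sampling is
// done with replacement, so a single candidate may assume multiple roles.
def selectConsensusGroup(candidates: Seq[Candidate], v: Int,
                         nextThreshold: Long => Long): (Candidate, Seq[Candidate]) = {
  val sorted = candidates.sortBy(-_.stake)            // deterministic order
  val s = sorted.map(_.stake).sum                     // total stake S

  // One categorical trial: the candidate whose stake interval contains the threshold.
  def selectOne(threshold: Long): Candidate = {
    var cumulative = 0L
    sorted.find { c =>
      val hit = cumulative <= threshold && threshold < cumulative + c.stake
      cumulative += c.stake
      hit
    }.get
  }

  val winners = Seq.fill(v + 1)(selectOne(nextThreshold(s)))
  (winners.head, winners.tail)
}
```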
In the above steps, a single candidate may assume multiple roles. If you want to exclude such a case, you can remove the winning candidate from the `candidates`. In this case, the `thresholds` must be recalculated because the total `S` of the population changes.

Computational Complexity
The computational complexity is mainly affected by the number of candidates `N`. There is room for improvement by remembering the list of candidates sorted by stake.