Closed karalabe closed 2 years ago
As the word censored is in italics, I'd like to point out that while this proposal proposes a new public testnet with less decentralized characteristics, anyone can still run their own PoW testnet. You then bear the infrastructure cost of doing so, and the proposal does not limit your ability to do this in any way. This has been true since Ethereum day zero, as Ethereum clients have always been very user friendly for running your own private testnet.
Just to add to that, the proposal also does not require clients to run this exclusively. The proposal can run side-by-side with the current testnet, so users would be free to choose between the PoW Ropsten or the PoA Rinkeby.
We greatly support this approach! As DApp developers, we urgently need a public, safe and reliable testnet, which obviously cannot be secured by PoW. DApps are beginning to interact heavily - to mention only status.im, metamask, uport, or other wallets - and only on a broadly accepted public testnet will all projects be present and able to test dependencies on others. For similar reasons, the new testnet should be as similar as possible to the mainnet - only then can it serve as a valid reference for development. I'd prefer:
@christoph2806 Definitely, added to the proposal's clarification section.
With time, some signers can go offline. Couldn't it happen that at some block all of the (N-K) signers who could mint the next block are stale, and the network gets stuck?
For my proposal, the network operators should ensure that stale signers are removed/replaced in a timely fashion. For testnet purposes this would probably be only a handful of signers whose uptime we can guarantee.
How will the ether be distributed? It is important since a spammer can try to get as much ether as possible from various sources and then use it to spam the network.
@hrishikeshio The issue with Ropsten was that the attacker minted tens of thousands of blocks, producing huge reorgs and pushing the gas limit up to 9B. These two scenarios could be avoided since only signers can mint blocks, so they could also retain some sanity limits.
The proposal does not specify any means for spam filtering for individual transactions as that is a new can of worms. I'll have to think a bit how best to solve that issue (around miner strategies), but limiting ether availability on a testnet is imho a bad idea. We want to be as inclusive as possible.
One possible solution would be to have a faucet that grants X ether / Y time (e.g. 10 / day) but is bound to some OAuth protocol that has proper protection against mass account creation (e.g. GitHub account, email address, etc).
Snippet to claim ownership of a GitHub user for an Ethereum address:
```solidity
contract GitHubOracle is usingOraclize {
    // constant for oraclize commits callbacks
    uint8 constant CLAIM_USER = 0;
    // temporary storage enumerating oraclize calls
    mapping (bytes32 => uint8) claimType;
    // temporary storage for oraclize user register queries
    mapping (bytes32 => UserClaim) userClaim;
    // permanent storage of sha3(login) of github users
    mapping (bytes32 => address) users;
    // events
    event UserSet(string githubLogin, address account);

    // stores temporary data for oraclize user register request
    struct UserClaim {
        address sender;
        bytes32 githubid;
        string login;
    }

    // register or change a github user's ethereum address
    function register(string _github_user, string _gistid)
        payable {
        bytes32 ocid = oraclize_query("URL", strConcat("https://gist.githubusercontent.com/", _github_user, "/", _gistid, "/raw/"));
        claimType[ocid] = CLAIM_USER;
        userClaim[ocid] = UserClaim({sender: msg.sender, githubid: sha3(_github_user), login: _github_user});
    }

    // oraclize response callback
    function __callback(bytes32 _ocid, string _result) {
        if (msg.sender != oraclize_cbAddress()) throw;
        uint8 callback_type = claimType[_ocid];
        if (callback_type == CLAIM_USER) {
            if (strCompare(_result, "404: Not Found") != 0) {
                address githubowner = parseAddr(_result);
                if (userClaim[_ocid].sender == githubowner) {
                    _register(userClaim[_ocid].githubid, userClaim[_ocid].login, githubowner);
                }
            }
            delete userClaim[_ocid]; // should always be deleted
        }
        delete claimType[_ocid]; // should always be deleted
    }

    function _register(bytes32 githubid, string login, address githubowner)
        internal {
        users[githubid] = githubowner;
        UserSet(login, githubowner);
    }
}
```
The user creates a gist containing their public address, then calls register passing _github_user + _gistid.
From https://github.com/ethereans/github-token/blob/master/contracts/GitHubToken.sol
There could be a lightweight proof-of-stake system where (like the GitHub oraclize above) people need 5 ETH locked to a mainnet contract address, which then allows them to be on the testnet. Misbehave, and the Ethereum Foundation (or whoever runs it) confiscates your ETH.
Yeah, side chains are an interesting idea but those are a whole new can of worms :)
Two thoughts:
Last week, INFURA launched a (private but publicly available) chain called INFURAnet (with INFURA running all the authorities) to provide a usable test network in the face of the Ropsten issues. It was obviously based on Parity but we would feel better if PoA was a standard and compatible feature across all clients. Therefore, we support this EIP.
Additionally, if Ropsten is replaced with a PoA network, we would be happy to run one of the authorities.
What about still using PoW on the testnet, but with slightly modified parameters:
1) Block reward = 0
2) Gas price is fixed to a certain value
3) There is a hard cap on the gas limit in a block
4) The faucet gives testnet ether only to accounts that have ether in the same account on the main net, and that ether is at least 24 hours old. Each account only receives test ether once. Or some other limitation of this sort, which will allow the faucet to be automatic but will limit sybil attacks.
Hopefully, the implementation could be much easier than proof-of-authority.
EDIT: Another idea - can the block reward be negative? Meaning that mining actually costs test ether. That allows implementing a sort of "proof-of-authority" trivially, by simply distributing large amounts of test ether. It also means that if test ether is dished out periodically, the maintainers of the testnet can disallow abusive miners by not giving them the next tranche of test ether.
The issue with your modified PoW scheme is that it still permits creating huge reorgs by mining lots of blocks, even if without reward.
The second proposal doesn't solve this issue either, as a malicious user might accumulate a lot of ether first, then create many parallel chains. All will be valid since he does have the funds, and there's no way to take it away. Arguably more stable than the first proposal, but negative rewards might break clients unexpectedly, as I don't think most codebases catered for this possibility.
Btw, the zero block reward is a nice idea for PoA too, as it prevents a rogue signer / leaked key from ruining the chain with accumulated funds.
@karalabe Thanks! What I meant with the negative rewards - the maintainer of the network gives out enough test ETH to current miner authorities to mine, let's say, for a week. After the week, the maintainer looks at who needs a top-up, and only gives a top-up to miners who behaved well. For those who did not behave well, the payouts simply stop.
@karalabe Ah, I got your point about the parallel chains now. In that case, there needs to be some kind of regular expiration of Test Eth :)
Here's GoEthereum on Tendermint.
https://github.com/tendermint/ethermint
The goal is to make it as compatible with GoEthereum as possible.
Come to #ethermint on the Tendermint slack for discussions.
We have some upstream patches that would make Ethermint much cleaner. See the bottom of https://github.com/tendermint/ethermint/pull/42/files
We're pushing GoEthereum to high tx limits and uncovering some issues.
Just to mention a proposal by @frozeman and @fjl of adding the set of signers to the extra-data field of every Xth block to act as a checkpoint. This wouldn't be useful now, but it would permit anyone to trivially add logic to "sync from H(X)", where H(X) is the hash of a checkpoint block.
The added benefit is that this would allow the genesis block to store the initial set of signers and we wouldn't need extra chain configuration parameters.
Here's a suggested protocol change: https://gist.github.com/holiman/5e021b24a7bfec95c8cc84b97e44e45a
It was a bit too long for fitting in a comment.
@holiman To react a bit to the proposal here too, I see one problem that's easy-ish to solve, another that's hard:
Your scheme must also ensure that blocks cannot be minted like crazy, otherwise the difficulty becomes irrelevant. This can be done with the same "min 15 seconds apart" guarantee that the original proposal had.
The harder part is that with no guarantee on signer ordering/frequency (only relying on the difficulty for chain quality/validation), malicious signers can mine very long chains that aren't difficult enough to beat the canonical, however the nodes cannot know this before processing them. And since creating these chains is mostly free in a PoA world, malicious signers can keep spamming with little effort.
The original proposal had a guarantee that the majority of the signers agreed at some point that a chain is valid (even if it was reorged afterwards), so minority malicious miners can only feed made up chains of N/2 blocks.
The difficulty idea is elegant btw, just not sure how yet to make use of it :)
If you do not mind somewhat relying on UNIX time and longer block times when validators are down, then Aura (in Parity) uses something like that:
- The current step is t / step_duration, where t is UNIX time.
- The primary (the validator allowed to sign at a given step) is validators[step % length(validators)].
- Each block header carries the step and a signature (the step is redundant and can be removed in a future version).
- Chain score is U128_max * height - step.
- Validation: a block at a given step can only be signed by the primary; only the first block for a given step is accepted (if a second is received, a vote to remove the authority should be issued); a block can arrive at most 1 step ahead.
- The validator set can be altered in the way @karalabe proposed.
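To make the arithmetic above concrete, here is an illustrative Go sketch of the described scheme; the function names are invented, this is not Parity's actual Aura implementation, and a 32-bit constant stands in for U128_max so the example fits in uint64:

```go
package main

import "fmt"

// auraStep returns the current step for UNIX time t (in seconds),
// given the step duration in seconds: step = t / step_duration.
func auraStep(t, stepDuration uint64) uint64 {
	return t / stepDuration
}

// auraPrimary returns the index of the validator allowed to sign
// at the given step: step % length(validators).
func auraPrimary(step uint64, validatorCount int) int {
	return int(step % uint64(validatorCount))
}

// auraScore computes the chain score described above:
// U128_max * height - step. A 32-bit max stands in for U128_max
// here purely to keep the illustration within uint64 range.
func auraScore(height, step uint64) uint64 {
	const u32max = uint64(1)<<32 - 1 // stand-in for U128_max
	return u32max*height - step
}

func main() {
	step := auraStep(1620000000, 5)
	fmt.Println("step:", step)
	fmt.Println("primary:", auraPrimary(step, 4))
	fmt.Println("score:", auraScore(10, step))
}
```

Note how the score makes height dominate: a longer chain always beats a shorter one, and among equal-height chains the one whose head was signed at an earlier step wins.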
Either way we will attempt to implement whichever solution is elected.
I'm not too fond of relying on time. Using @holiman 's proposal of calculating "your turn" based only on block height seems a bit better in this respect, as nodes don't have to be time-synced.
Any particular reason for having the chain difficulty calculated like that instead of just the height of the chain for example? What does this more complex formula gain you?
The issue I see with Aura's turn based scheme is that if a few signers drop off (which can be only natural in an internet scale system), then the chain dynamics would become quite irregular, with "gaps" in the minting time; versus my proposal where multiple signers can fill in for those that dropped.
If I understand correctly, the idea in the difficulty algorithm is to score those chains higher that have the most signers signing at the correct turn. So chains that skip blocks are scored less vs. those that include all signers.
What happens in scenarios where blocks are minted in step, but propagated later after the step ends? Or if some signers receive the next block in time, while some signers receive it a bit later after the step ended?
I've updated the proposal with a tech spec section describing the proposed PoA protocol itself. It's still missing a few details around signing (notably the 1-out-of-K block constraint), and I've yet to figure out the difficulty calculation.
Also I split off the PoA protocol from the testnet itself naming wise as I'd like to keep the two concepts separated to avoid confusion. Using metro station names for the testnets is fine, but for a reusable PoA scheme I wanted something a bit more "mundane" and/or obvious.
The names are still up for finalization. The Clique name for the PoA scheme (the best so far) was suggested by @holiman .
I'd recommend using the Ethermint or Eris DB permissioning native contract, or both. They've both been tested extensively and neither would require reinventing the wheel. Furthermore, we're all friends here and have done the heavy leg work, so... why not?
It's hard to evaluate such a proposal without any details. I personally am not familiar with how either of them works, so I cannot comment on their feasibility.
My main design goals here are to be easy to add to any client and support current techs (fast, light, warp sync) without invasive changes.
Can those consensus engines be plugged into all clients? Can they run on mobile and embedded devices? Are they fully self contained without external dependencies? Can they achieve consensus header only? Are they compatible licensing wise with all clients? These all are essential requirements I've tried to meet.
I'm happy to consider them, but you need to provide a lot more detail to evaluate based upon.
Absolutely.
So both use a tendermint consensus Proof of Stake, that is detailed here:
https://github.com/tendermint/tendermint/wiki/Byzantine-Consensus-Algorithm
As for the pluggability of the algorithm, it's been proven to be quite doable, in fact, Parity has already done it:
And ethermint already implements this through geth in a way (I wouldn't be the one to give the details, that would be something for @jaekwon or @ebuchman to explain)
https://github.com/tendermint/ethermint
As for Eris-DB and our approach to permissioning by way of proof-of-authority, we simply utilize the above BFT consensus algorithm and, on top of that, a native contract (not dissimilar to the current precompiled contracts at fixed addresses, such as SHA256, RIPEMD-160, etc.) to implement a permissioning scheme amongst the validators.
While we have our own version of the EVM that is much more stripped down than Geth, I don't think it would be something difficult to make a modular go package for ease of implementation (CC @silasdavis ):
https://github.com/eris-ltd/eris-db/blob/master/manager/eris-mint/evm/snative.go#L73
The above could be implemented through geth via some tinkering with this function:
https://github.com/ethereum/go-ethereum/blob/master/core/vm/contracts.go#L33
Both solutions are written in Go, so there is surely a way to make them somewhat compatible. Again, trying to find a way to work together so y'all can keep your focus ;)
Maybe instead of all these fancy ideas, just ask Bitcoin how it manages to have a functional PoW testnet? Hint: the block size (i.e., gas limit) is bounded.
But of course we cannot allow the testnet to have different behavior than mainnet. So let's use PoA instead. Exactly as in mainnet.
We could have a bounded-limit PoW network as well. Let's have several options.
Could the PoA testnet be started from a state snapshot taken from the PoW testnet (perhaps from the Ropsten bounded-gas-limit soft-fork block)? And if the PoA configuration uses the same EIP155 CHAIN_ID=3 as Ropsten, then transactions can be replayed on both the PoA chain and the PoW chain. Replaying transactions on both testnets might be convenient for deploying contracts etc.
I'm not convinced that's a good idea.
Imho it's nicer to start with a clean slate.
cdetrio does not suggest a snapshot feature (as far as I understand), just using the same network id and replaying all Ropsten txs until the attack.
I don't understand why everyone keeps claiming the amount of ether the attacker had was the problem. IMO it was his (relatively) huge mining power. If the gas limit had stayed at 4.7M he couldn't have spammed as much.
PoA doesn't have mining rewards and the block miners would be different, so the transactions couldn't be replayed as is, since the accounts wouldn't have the funds.
No one claimed the ether was the problem. We highlighted that with infinite ether, you can reproduce the same problem in a PoA network too, without much mining power, if blocks are not limited.
Technically you can make a fresh account with a lot of ether, and after every Ropsten block is mined, add a tx that gives the miner the block reward and fees. I am not suggesting doing it, just wondering if this is what @cdetrio had in mind.
If the ether amount is not a problem (given the block size is bounded), why do you insist on verifying an identity before giving away ether?
I personally don't want to place a limit on the block size. Looking at bitcoin, they have huge problems because of that limit. Even though this is a testnet, I'd like to retain the core concepts of Ethereum (yes, I know PoA isn't mainnet, but Ethereum never wanted to settle on PoW anyway, so I see no issue with pushing towards dropping PoW).
Do you think it wise not to have PoW testnet at all, while mainnet is still PoW?
Personally, I have an agenda here. I am part of the smartpool.io team. And it will be hard to deploy it on mainnet before we can show people it works on testnet (we have our own private network but it is not the same).
I don't know how many more people need a PoW testnet. I think Metropolis has some changes in the uncle mechanism. How can those be tested without a PoW testnet?
It's fine to have a PoW testnet too beside a PoA one to test out forks. We can go down the block limiting route on that.
Just wanted to ping the thread that I've finished writing up the proposal. We also have a prototype implementation in go-ethereum https://github.com/ethereum/go-ethereum/pull/3753, in the consensus/clique package (I didn't link the commit because occasionally I force push the PR during development).
I'll spend the next few days trying to put together a small beta-test network and also to write up some tests to validate that everything works correctly (mostly around voting and dynamic signer updates).
@VoR0220 I'm still uncertain whether I understand your two proposals, but I did notice a few things that made me uncertain whether they would be appropriate.
Tendermint seems to rely on a complex cross node interaction to reach consensus on the final block, which inherently means added network complexity. Eris DB seems to be based on a slimmed down EVM, which inherently means that stateless syncs (fast, light) cannot verify the chain. Did I misunderstand something?
All in all though, to support my proposal or any of your proposals, clients need to have support for some baseline pluggable consensus engines, so either approach requires work from core devs. I'm not sure about the other proposals, but at least after implementing mine I can guarantee that supporting both PoA and the previous PoW can be done without too invasive rewrites, although it's non-trivial, truth be told.
Here's an alternative idea: Keep the list in the contract for flexibility. The contract emits events when the list changes. Light/fast sync can examine event blooms and transaction receipts and downloads proofs for the changes. The proofs are also added to the warp snapshot.
The idea is not bad per se, but it blows up the complexity of the proposal significantly:
Light clients don't have access to receipts during sync, so every time the event bloom looks like there's something there, the light client needs to retrieve it. This means that sync code and consensus code all of a sudden get tied together, since sync needs to occasionally pull in extra data. This is quite a large can of worms to open up, especially since there might be much stricter resource constraints on light clients for network traffic, as well as serving nodes may throttle them on large downloads.
The scheme is susceptible to attacks that fake "consensus updates" in the log bloom. E.g. I as an attacker can issue a transaction per block that emits some logs which map to the same bloom bits as the consensus contract events. This means that light clients will end up needing to pull in all receipts and a ton of state just to figure out it's a false alarm.
But perhaps most importantly, one of the core requirements of the original proposal was that it should be trivial to embed into other clients. Of course they do need to support some consensus engine pluggability, but based on the code in geth, the entire Clique consensus engine can be done (extensively commented) in 500-750 lines of code, fully self-contained in two files. (My PR contains a lot of general cleanup and also reworks ethash in the meantime). The entire proposal depends on implementing a "header check", a "header preparer" and a "sign block" method, which are analogous to those needed by ethash. All else works just as is. Imho this is a very strong benefit that should not be discarded lightly.
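As a rough illustration of how small such a pluggable surface can be, here is a hypothetical Go sketch of the three hooks mentioned above; the names and types are invented for illustration and do not match go-ethereum's actual consensus.Engine interface:

```go
package main

import (
	"errors"
	"fmt"
)

// Header is a trimmed stand-in for a block header.
type Header struct {
	Number uint64
	Extra  []byte
}

// Engine is a hypothetical minimal pluggable consensus interface,
// mirroring the three hooks mentioned above. The real go-ethereum
// interface is larger; these names are illustrative only.
type Engine interface {
	VerifyHeader(h *Header) error // "header check"
	Prepare(h *Header) error      // "header preparer"
	Seal(h *Header) error         // "sign block"
}

// nullEngine accepts everything; a real PoA engine would verify the
// signer's signature in Extra during VerifyHeader and sign in Seal.
type nullEngine struct{}

func (nullEngine) VerifyHeader(h *Header) error {
	if h == nil {
		return errors.New("nil header")
	}
	return nil
}
func (nullEngine) Prepare(h *Header) error { return nil }
func (nullEngine) Seal(h *Header) error    { return nil }

func main() {
	var e Engine = nullEngine{}
	h := &Header{Number: 1}
	fmt.Println(e.VerifyHeader(h) == nil)
}
```

The point is that sync code only ever calls through such an interface, so swapping ethash for a PoA engine needs no changes to the sync machinery itself.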
Light clients don't have access to receipts during sync
The LES protocol supports the GetReceipts message.
The scheme is susceptible to attacks that fake "consensus updates" in the log bloom. E.g. I as an attacker can issue a transaction per block that emits some logs which map to the same bloom bits as the consensus contract events. This means that light clients will end up needing to pull in all receipts and a ton of state just to figure out it's a false alarm.
This would require reversing a Keccak hash, wouldn't it?
As for traffic increase, list modifications are expected to be rare enough for it to be negligible.
Does not look much harder to implement to me. Trivial for clients that don't support fast sync or light client protocol. And it does not impose a hard-coded governance scheme.
LES protocol supports GetReceipts message.
That's a significant overhead to call during syncing.
This would require reversing a Keccak hash, wouldn't it?
The blooms don't use the full hash, only a few bytes from it, so it should be significantly easier to brute force. Given that the consensus contract's address wouldn't change, it shouldn't be too much of an effort to try and break it.
As for traffic increase, list modifications are expected to be rare enough for it to be negligible.
Not if I can attack it.
Trivial for clients that don't support fast sync or light client protocol.
Given that CPP is just working on adding fast sync and I assume light is next for many client implementations, that's just taking a shortcut now that will bite us hard in the long run.
And it does not impose a hard-coded governance scheme.
That hard coded governance is PoA by majority consensus. I don't see a reason to make it more flexible than this.
I can see both sides for this:
The annoying part of scalable-ish PoA is managing authorised signers. Doing it with a contract is easier because the logic can be shared solidity code and arbitrary new signer management policies can be implemented later.
But implementing it as a contract also adds non-trivial development overhead now because blockchain syncing gets more complicated. @arkpar, I guess you could answer these:
Contract-based PoA is already implemented in Parity, just without conveniences for light clients. Clique/Rinkeby probably wouldn't be a whole lot to implement, but @keorn can answer better.
I would favor a middle ground. A generic validators contract has one method: getValidators() -> [Address]. We can include the signed sha3(getValidators()) as part of the seal for any given block. Light clients can simply fetch fraud proofs when this changes. In the event that some malicious validators don't update the field even when getValidators() would be different, a mandate to follow the longest chain and an honest majority assumption is enough to ensure that the correct chain is synchronized to.
This will work most efficiently with infrequent changes in the validator contract. If they are epoch-based at around once per day, the overhead imposed on light clients synchronizing would not be very high, although there is a stronger availability requirement on the network to continue to store getValidators() state proofs for ancient transitions.
Hi, I've some questions on this:
Thanks
A signed block isn't automatically valid - all the yellow paper rules still apply; the signature is just one more requirement. Empty blocks are not invalid; a signer is free not to include any transactions.
Broadcasting and mining is the same as for all consensus engines. Transactions propagate all over the network, signers aggregate them and include them in blocks when it's their turn (or possibly out of turn too for less difficulty).
Changelog:
Clique proof-of-authority consensus protocol
Note, for the background and rationale behind the proposed proof-of-authority consensus protocol, please read the sections after this technical specification. I've placed this on top to have an easy to find reference for implementers without having to dig through the discussions.
We define the following constants:

- `EPOCH_LENGTH`: Number of blocks after which to checkpoint and reset the pending votes. Suggested `30000` for the testnet to remain analogous to the mainnet `ethash` epoch.
- `BLOCK_PERIOD`: Minimum difference between two consecutive blocks' timestamps. Suggested `15s` for the testnet to remain analogous to the mainnet `ethash` target.
- `EXTRA_VANITY`: Fixed number of extra-data prefix bytes reserved for signer vanity. Suggested `32 bytes` to retain the current extra-data allowance and/or use.
- `EXTRA_SEAL`: Fixed number of extra-data suffix bytes reserved for signer seal. `65 bytes` fixed, as signatures are based on the standard `secp256k1` curve.
- `NONCE_AUTH`: Magic nonce number `0xffffffffffffffff` to vote on adding a new signer.
- `NONCE_DROP`: Magic nonce number `0x0000000000000000` to vote on removing a signer.
- `UNCLE_HASH`: Always `Keccak256(RLP([]))` as uncles are meaningless outside of PoW.
- `DIFF_NOTURN`: Block score (difficulty) for blocks containing out-of-turn signatures. Suggested `1` since it just needs to be an arbitrary baseline constant.
- `DIFF_INTURN`: Block score (difficulty) for blocks containing in-turn signatures. Suggested `2` to show a slight preference over out-of-turn signatures.

We also define the following per-block constants:

- `BLOCK_NUMBER`: Block height in the chain, where the height of the genesis block is `0`.
- `SIGNER_COUNT`: Number of authorized signers valid at a particular instance in the chain.
- `SIGNER_INDEX`: Index of the block signer in the sorted list of current authorized signers.
- `SIGNER_LIMIT`: Number of consecutive blocks out of which a signer may only sign one. Must be `floor(SIGNER_COUNT / 2) + 1` to enforce majority consensus on a chain.

We repurpose the `ethash` header fields as follows:

- `beneficiary`: Address to propose modifying the list of authorized signers with.
- `nonce`: Signer proposal regarding the account defined by the `beneficiary` field: `NONCE_DROP` to propose deauthorizing `beneficiary` as an existing signer, `NONCE_AUTH` to propose authorizing `beneficiary` as a new signer.
- `extraData`: Combined field for signer vanity, checkpointing and signer signatures. The first `EXTRA_VANITY` bytes (fixed) may contain arbitrary signer vanity data. The last `EXTRA_SEAL` bytes (fixed) is the signer's signature sealing the header. Checkpoint blocks must contain the list of current signers (`N*20 bytes`) in between, omitted otherwise.
- `mixHash`: Reserved for fork protection logic, similar to the extra-data during the DAO.
- `ommersHash`: Must be `UNCLE_HASH` as uncles are meaningless outside of PoW.
- `timestamp`: Must be at least the parent timestamp + `BLOCK_PERIOD`.
- `difficulty`: Contains the standalone score of the block to derive the quality of a chain. Must be `DIFF_NOTURN` if `BLOCK_NUMBER % SIGNER_COUNT != SIGNER_INDEX`, or `DIFF_INTURN` if `BLOCK_NUMBER % SIGNER_COUNT == SIGNER_INDEX`.
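The two derived values above (SIGNER_LIMIT and the in-turn/out-of-turn difficulty) can be sketched in Go as follows; this is an illustration assuming signers are identified by their sorted hex addresses, not the reference implementation:

```go
package main

import (
	"fmt"
	"sort"
)

const (
	diffNoTurn = uint64(1) // DIFF_NOTURN
	diffInTurn = uint64(2) // DIFF_INTURN
)

// signerLimit returns SIGNER_LIMIT = floor(SIGNER_COUNT / 2) + 1,
// the window of consecutive blocks out of which a signer may sign one.
func signerLimit(signerCount int) int {
	return signerCount/2 + 1
}

// blockDifficulty scores a block: DIFF_INTURN if the signer's index
// in the sorted signer list matches BLOCK_NUMBER % SIGNER_COUNT,
// DIFF_NOTURN otherwise. The signer is assumed to be authorized.
func blockDifficulty(blockNumber uint64, signers []string, signer string) uint64 {
	sorted := append([]string(nil), signers...)
	sort.Strings(sorted)
	index := sort.SearchStrings(sorted, signer)
	if uint64(index) == blockNumber%uint64(len(sorted)) {
		return diffInTurn
	}
	return diffNoTurn
}

func main() {
	signers := []string{"0xaa", "0xbb", "0xcc"}
	fmt.Println(signerLimit(len(signers)))
	fmt.Println(blockDifficulty(3, signers, "0xaa")) // 3 % 3 == 0, so in-turn
}
```

Because SIGNER_LIMIT is a strict majority, any chain segment of SIGNER_LIMIT consecutive blocks necessarily carries signatures from a majority of the signer set.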
Authorizing a block
To authorize a block for the network, the signer needs to sign the block's hash containing everything except the signature itself. This means that the hash contains every field of the header (`nonce` and `mixDigest` included), and also the `extraData` with the exception of the 65 byte signature suffix. The fields are hashed in the order of their definition in the yellow paper.

This hash is signed using the standard `secp256k1` curve, and the resulting 65 byte signature (`R`, `S`, `V`, where `V` is `0` or `1`) is embedded into the `extraData` as the trailing 65 byte suffix.

To ensure malicious signers (loss of signing key) cannot wreak havoc in the network, each signer is allowed to sign at most one out of `SIGNER_LIMIT` consecutive blocks. The order is not fixed, but in-turn signing weighs more (`DIFF_INTURN`) than out-of-turn signing (`DIFF_NOTURN`).

Authorization strategies
As long as signers conform to the above specs, they can authorize and distribute blocks as they see fit. The following strategy will however reduce network traffic and small forks, so it's a suggested feature:

- If a signer is allowed to sign a block (is on the authorized list and didn't sign recently), calculate the optimal signing time of the next block (parent block time + `BLOCK_PERIOD`).
- If the signer is in-turn, wait for the exact time to arrive, then sign and broadcast immediately.
- If the signer is out-of-turn, delay signing by `rand(SIGNER_COUNT * 500ms)`.

This small strategy ensures that the in-turn signer (whose block weighs more) has a slight advantage to sign and propagate versus the out-of-turn signers. The scheme also scales a bit with the increase in the number of signers.
Voting on signers
Every epoch transition (genesis block included) acts as a stateless checkpoint, from which capable clients should be able to sync without requiring any previous state. This means epoch headers must not contain votes, all non-settled votes are discarded, and tallying starts from scratch.
For all non-epoch-transition blocks:

- Signers may cast one vote per own block to propose a change to the authorization list.
- Only the latest proposal per target beneficiary is kept from a single signer.
- Votes are tallied live as the chain progresses (concurrent proposals allowed).
- Proposals reaching majority consensus (`SIGNER_LIMIT` votes) come into effect immediately.

A proposal coming into effect entails discarding all pending votes for that proposal (both for and against) and starting with a clean slate.
Cascading votes
A complex corner case may arise during signer deauthorization. When a previously authorized signer is dropped, the number of signers required to approve a proposal might decrease by one. This might cause one or more pending proposals to reach majority consensus, the execution of which might further cascade into new proposals passing.
Handling this scenario is non-obvious when multiple conflicting proposals pass simultaneously (e.g. add a new signer vs. drop an existing one), where the evaluation order might drastically change the outcome of the final authorization list. Since signers may invert their own votes in every block they mint, it's not obvious which proposal would be "first".
To avoid the pitfalls cascading executions would entail, the Clique proposal explicitly forbids cascading effects. In other words: only the `beneficiary` of the current header/vote may be added to or dropped from the authorization list. If that causes other proposals to reach consensus, those will be executed when their respective beneficiaries are "touched" again (given that majority consensus still holds at that point).

Voting strategies
Since the blockchain can have small reorgs, a naive voting mechanism of "cast-and-forget" may not be optimal, since a block containing a singleton vote may not end up on the final chain.
A simplistic but working strategy is to allow users to configure "proposals" on the signers (e.g. "add 0x...", "drop 0x..."). The signing code can then pick a random proposal for every block it signs and inject it. This ensures that multiple concurrent proposals as well as reorgs get eventually noted on the chain.
This proposal list may be expired after a certain number of blocks/epochs, but it's important to realize that "seeing" a proposal pass doesn't mean it won't get reorged, so it should not be immediately dropped when the proposal passes.
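A toy sketch of the vote bookkeeping described above, assuming a single beneficiary and purely in-memory state; a real implementation must also handle reorgs, epoch resets, and vote inversion per signer:

```go
package main

import "fmt"

// tally tracks pending authorization votes for one beneficiary.
// Each signer holds at most one live vote, and a proposal passes
// when it gathers SIGNER_LIMIT matching votes, at which point all
// pending votes for that beneficiary are discarded (clean slate).
type tally struct {
	votes map[string]bool // signer -> authorize (true) / drop (false)
}

func newTally() *tally { return &tally{votes: make(map[string]bool)} }

// cast records a vote, overwriting the signer's previous vote, and
// reports whether the proposal just reached majority consensus.
func (t *tally) cast(signer string, authorize bool, signerCount int) bool {
	t.votes[signer] = authorize
	limit := signerCount/2 + 1 // SIGNER_LIMIT
	count := 0
	for _, a := range t.votes {
		if a == authorize {
			count++
		}
	}
	if count >= limit {
		t.votes = make(map[string]bool) // discard pending votes
		return true
	}
	return false
}

func main() {
	tl := newTally()
	fmt.Println(tl.cast("A", true, 3)) // 1 of 2 required votes
	fmt.Println(tl.cast("B", true, 3)) // majority of 3 signers reached
}
```

Because a signer's newer vote overwrites the older one, randomly re-injecting pending proposals (as suggested above) is idempotent: repeated votes don't inflate the tally, they just survive reorgs.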
Background
Ethereum's first official testnet was Morden. It ran from July 2015 to about November 2016, when due to the accumulated junk and some testnet consensus issues between Geth and Parity, it was finally laid to rest in favor of a testnet reboot.
Ropsten was thus born, clearing out all the junk and starting with a clean slate. This ran well until the end of February 2017, when malicious actors decided to abuse the low PoW and gradually inflate the block gas limits to 9 billion (from the normal 4.7 million), at which point they sent in gigantic transactions, crippling the entire network. Even before that, attackers attempted multiple extremely long reorgs, causing network splits between different clients, and even different versions.
The root cause of these attacks is that a PoW network is only as secure as the computing capacity placed behind it. Restarting a new testnet from zero wouldn't solve anything, since the attacker can mount the same attack over and over again. The Parity team decided to go with an emergency solution of rolling back a significant number of blocks, and enacting a soft-fork rule that disallows gas limits above a certain threshold.
While this solution may work in the short term, Parity's fix, although not perfect, is nonetheless workable. I'd like to propose a longer term alternative solution, which is more involved, yet should be simple enough to allow rolling out in a reasonable amount of time.
Standardized proof-of-authority
As reasoned above, proof-of-work cannot work securely in a network with no value. Ethereum has its long term goal of proof-of-stake based on Casper, but that is heavy research so we cannot rely on that any time soon to fix today's problems. One solution however is easy enough to implement, yet effective enough to fix the testnet properly, namely a proof-of-authority scheme.
Note, Parity does have an implementation of PoA, though it seems more complex than needed and without much documentation on the protocol, it's hard to see how it could play along with other clients. I welcome feedback from them on this proposal from their experience.
The main design goals of the PoA protocol described here are that it should be very simple to implement and embed into any existing Ethereum client, while at the same time allowing the use of existing sync technologies (fast, light, warp) without needing client developers to add custom logic to critical software.
Proof-of-authority 101
For those not aware of how PoA works, it's a very simplistic protocol, where instead of miners racing to find a solution to a difficult problem, authorized signers can at any time at their own discretion create new blocks.
The challenges revolve around how to control minting frequency, how to distribute minting load (and opportunity) between the various signers and how to dynamically adapt the list of signers. The next section defines a proposed protocol to handle all these scenarios.
Rinkeby proof-of-authority
There are two approaches to syncing a blockchain in general: the classical approach of replaying every transaction from the genesis block onward, and the lighter approaches (fast, light, warp) that validate only the chain of block headers before pulling in recent state.
A PoA scheme is based on the idea that blocks may only be minted by trusted signers. As such, every block (or header) that a client sees can be matched against the list of trusted signers. The challenge here is how to maintain a list of authorized signers that can change over time. The obvious answer (store it in an Ethereum contract) is also the wrong answer: fast, light and warp sync don't have access to the state during syncing.
The protocol of maintaining the list of authorized signers must be fully contained in the block headers.
The next obvious idea would be to change the structure of the block headers so it drops the notions of PoW, and introduces new fields to cater for voting mechanisms. This is also the wrong answer: changing such a core data structure in multiple implementations would be a nightmare development, maintenance and security wise.
The protocol of maintaining the list of authorized signers must fit fully into the current data models.
So, according to the above, we can't use the EVM for voting, rather have to resort to headers. And we can't change header fields, rather have to resort to the currently available ones. Not much wiggle room.
Repurposing header fields for signing and voting
The most obvious field that is currently used solely as fun metadata is the 32 byte extra-data section in block headers. Miners usually place their client and version in there, but some fill it with alternative "messages". The protocol would extend this field with 65 bytes for a secp256k1 miner signature. This would allow anyone obtaining a block to verify it against a list of authorized signers. It also makes the miner section in block headers obsolete (since the address can be derived from the signature).
Note, changing the length of a header field is a non invasive operation as all code (such as RLP encoding, hashing) is agnostic to that, so clients wouldn't need custom logic.
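The extra-data layout can be illustrated with a short Go sketch. The signature recovery itself (ecrecover over the sealing hash) is omitted; `extraSeal` and `splitExtra` are hypothetical names for this illustration:

```go
package main

import (
	"errors"
	"fmt"
)

const extraSeal = 65 // bytes reserved at the end of extra-data for the secp256k1 signature

// splitExtra separates a PoA header's extra-data into its free-form vanity
// prefix and the trailing signer signature. A real client would then run
// ecrecover on the sealing hash and this signature to derive the signer
// address, replacing the now-obsolete miner field.
func splitExtra(extra []byte) (vanity, sig []byte, err error) {
	if len(extra) < extraSeal {
		return nil, nil, errors.New("extra-data too short for signature")
	}
	return extra[:len(extra)-extraSeal], extra[len(extra)-extraSeal:], nil
}

func main() {
	extra := append([]byte("geth/v1.6"), make([]byte, extraSeal)...)
	vanity, sig, err := splitExtra(extra)
	fmt.Println(string(vanity), len(sig), err)
}
```

Since RLP encoding and hashing treat extra-data as an opaque byte string, this split is purely a convention on top of the existing header structure.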
The above is enough to validate a chain, but how can we update a dynamic list of signers? The answer is that we can repurpose the newly obsoleted miner field and the PoA obsoleted nonce field to create a voting protocol: a signer wishing to modify the authorized list sets the miner field to the address being voted on, and sets the nonce field to 0 or 0xff...f to vote in favor of adding or kicking out that address, respectively.
Any clients syncing the chain can "tally" up the votes during block processing, and maintain a dynamically changing list of authorized signers by popular vote.
The initial set of signers can be given as genesis chain parameters (to avoid the complexity of deploying an "initial voters list" contract in the genesis state).
To avoid having an infinite window to tally up votes in, and also to allow periodically flushing stale proposals, we can reuse the concept of an epoch from ethash, where every epoch transition flushes all pending votes. Furthermore, these epoch transitions can also act as stateless checkpoints containing the list of current authorized signers within the header extra-data. This permits clients to sync up based only on a checkpoint hash without having to replay all the voting that was done on the chain up to that point. It also allows the genesis header to fully define the chain, containing the list of initial signers.
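The epoch-transition check itself is trivial; here is a sketch assuming the ethash epoch length of 30000 blocks is reused (the proposal leaves the exact value open):

```go
package main

import "fmt"

const epochLength = 30000 // blocks per epoch; the ethash epoch size, reused as an assumption

// isCheckpoint reports whether a block number is an epoch transition, at
// which point all pending votes are flushed and the header extra-data must
// embed the full list of currently authorized signers as a checkpoint.
func isCheckpoint(number uint64) bool {
	return number%epochLength == 0
}

func main() {
	fmt.Println(isCheckpoint(0), isCheckpoint(30000), isCheckpoint(30001))
}
```

Treating block 0 as a checkpoint is what lets the genesis header fully define the chain, carrying the initial signer list itself.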
Attack vector: Malicious signer
It may happen that a malicious user gets added to the list of signers, or that a signer key/machine is compromised. In such a scenario the protocol needs to be able to defend itself against reorganizations and spamming. The proposed solution is that given a list of N authorized signers, any signer may only mint 1 block out of every K. This ensures that damage is limited, and the remaining signers can vote out the malicious user.
Attack vector: Censoring signer
Another interesting attack vector is if a signer (or group of signers) attempts to censor out blocks that vote on removing them from the authorization list. To work around this, we restrict the allowed minting frequency of signers to 1 out of N/2. This ensures that malicious signers need to control at least 51% of signing accounts, at which point it's game over anyway.
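Both the malicious-signer and censoring-signer limits boil down to one check: a signer may not appear among the sealers of the most recent floor(N/2) blocks. A minimal Go sketch, using string addresses and the hypothetical name `mayMint`:

```go
package main

import "fmt"

// mayMint reports whether a signer is allowed to seal the next block: with
// signerCount authorized signers, a signer must not have sealed any of the
// most recent floor(signerCount/2) blocks, so no minority of signers can
// monopolize (or censor) the chain.
func mayMint(signer string, recent []string, signerCount int) bool {
	limit := signerCount / 2
	if limit > len(recent) {
		limit = len(recent) // chain is younger than the protected window
	}
	for _, s := range recent[len(recent)-limit:] {
		if s == signer {
			return false
		}
	}
	return true
}

func main() {
	recent := []string{"A", "B", "C"} // sealers of the last three blocks, newest last
	fmt.Println(mayMint("C", recent, 4)) // sealed too recently
	fmt.Println(mayMint("A", recent, 4)) // outside the protected window again
}
```

Any block sealed in violation of this rule is simply rejected by honest nodes, the same way an invalid PoW seal would be.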
Attack vector: Spamming signer
A final small attack vector is that of malicious signers injecting new vote proposals into every block they mint. Since nodes need to tally up all votes to create the actual list of authorized signers, they need to track all votes through time. Without placing a limit on the vote window, this could grow slowly, yet unbounded. The solution is to place a moving window of W blocks after which votes are considered stale. A sane window might be 1-2 epochs.
Attack vector: Concurrent blocks
If the number of authorized signers is N, and we allow each signer to mint 1 block out of K, then at any point in time N-K+1 signers are allowed to mint. To avoid these racing for blocks, every signer would add a small random "offset" to the time it releases a new block. This ensures that small forks are rare, but occasionally still happen (as on the main net). If a signer is caught abusing its authority and causing chaos, it can be voted out.
Notes
Does this suggest we use a censored testnet?
Yes and no. The proposal suggests that given the malicious nature of certain actors and given the weakness of the PoW scheme in a "monopoly money" network, it is better to have a network with a bit of spam filtering enabled that developers can rely on to test their programs, than to have a wild wild west chain that dies due to its uselessness.
Why standardize proof-of-authority?
Different clients are better at different scenarios. Go may be awesome in capable server side environments, but CPP may be better suited to run on an RPI Zero. Being able to mix clients in private environments would be a net win for the ecosystem, and being able to participate in a single spam-free testnet would be a win for everyone at large.
Doesn't manual voting get messy?
This is an implementation detail, but signers may implement a contract-based voting strategy leveraging the full capabilities of the EVM, only pushing the results into the headers for ordinary nodes to verify.
Clarifications and feedback