ethereum / EIPs

The Ethereum Improvement Proposal repository
https://eips.ethereum.org/
Creative Commons Zero v1.0 Universal
12.87k stars 5.27k forks source link

Long-term gas cost changes for IO-heavy operations to mitigate transaction spam attacks #150

Closed vbuterin closed 7 years ago

vbuterin commented 8 years ago

EDITOR UPDATE (2017-08-15): This EIP is now located at https://github.com/ethereum/EIPs/blob/master/EIPS/eip-150.md. Please go there for the correct specification. The text below may be incorrect or outdated, and is not maintained.

UPDATE: version 1c of this spec has been implemented and is active on the mainnet as of block 2463000. Spec is kept unmodified for archival purposes.

Specification (version 1)

If block.number >= FORK_BLKNUM, then:

That is, substitute:

        extra_gas = (not ext.account_exists(to)) * opcodes.GCALLNEWACCOUNT + \
            (value > 0) * opcodes.GCALLVALUETRANSFER
        submsg_gas = gas + opcodes.GSTIPEND * (value > 0)
        if compustate.gas < gas + extra_gas:
            return vm_exception('OUT OF GAS', needed=gas+extra_gas)

With:

        extra_gas = (not ext.account_exists(to)) * opcodes.GCALLNEWACCOUNT + \
            (value > 0) * opcodes.GCALLVALUETRANSFER
        if compustate.gas < extra_gas:
            return vm_exception('OUT OF GAS', needed=extra_gas)
        if compustate.gas < gas + extra_gas:
            gas = compustate.gas - extra_gas
        submsg_gas = gas + opcodes.GSTIPEND * (value > 0)

Specification (version 1b)

All of the above, but:

All of the above, and:

If SUICIDE hits a newly created account, it triggers an additional gas cost of 25000 (similar to CALLs)

Specification (version 2)

If block.number >= METROPOLIS_FORK_BLKNUM, then:

When executing EXTCODESIZE, EXTCODECOPY, CALL, CALLDELEGATE or CALLCODE (but NOT BALANCE), let CODELOADING_GAS be int(400 + len(code) / 6). At the end of the call, refund an additional 4000 - CODELOADING_GAS (if CODELOADING < 0, refund nothing). CREATE only provides 63/64 of the parent gas to the child call.

Rationale

Recent denial-of-service attacks have shown that opcodes that read the state tree are under-priced relative to other opcodes. There are software changes that have been made, are being made and can be made in order to mitigate the situation; however, the fact will remain that such opcodes will be by a substantial margin the easiest known mechanism to degrade network performance via transaction spam. The concern arises because it takes a long time to read from disk, and is additionally a risk to future sharding proposals as the "attack transactions" that have so far been most successful in degrading network performance would also require tens of megabytes to provide Merkle proofs for. This EIP increases the cost of storage reading opcodes to address this concern. The costs have been derived from an updated version of the calculation table used to generate the 1.0 gas costs: https://docs.google.com/spreadsheets/d/15wghZr-Z6sRSMdmRmhls9dVXTOpxKy8Y64oy9MvDZEQ/edit#gid=0 ; the rules attempt to target a limit of 8 MB of data that needs to be read in order to process a block, and include an estimate of 500 bytes for a Merkle proof for SLOAD and 1000 for an account.

The first version of the EIP aims to be simple, and adds a flat penalty of 300 gas on top of the costs calculated in this table to account for the cost of loading the code (~17-21 kb in the worst case). The second version of the EIP instead adds an explicit parameter to take into account the size of code loading. Note that this parameter is weighted ~60% below the computed weight; this is to account for the fact that loading a large contiguous of code is, on a per-byte basis, much more efficient in terms of disk seeks (which often have a minimum size of 4kb regardless of the size of the actual object) than making several Merkle queries. A worst-case contract would take 3000 gas to load; a contract 2kb in size would take ~720 gas. This also has the benefit that it prevents a potential DoS vector against precomputation-heavy JIT VMs, where an attacker calls a large contract, requires the VM to just-in-time compile its entire code, only executes for perhaps ten cycles and then leaves. Version 2b fixes a problem where EXTCODESIZE is called with a small amount of gas, so that execution tries to fetch the contract code but runs out-of-gas upon seeing its size, thereby adding a large number of bytes to the merkle proof (and required DB load unless proper caching is added) but still only costing a relatively small amount of gas.

BALANCE is not affected by the per-byte changes because checking an account's balance does not require loading its code.

The EIP 90 gas mechanic is introduced because without it, all current contracts that make calls would stop working as they use an expression like msg.gas - 40 to determine how much gas to make a call with, relying on the gas cost of calls being 40. In the more complex version, EIP 114 is introduced because, given that we are making the cost of a call higher and less predictable, we have an opportunity to do it at no extra cost to currently available guarantees, and so we also achieve the benefit of replacing the call stack depth limit with a "softer" gas-based restriction, thereby eliminating call stack depth attacks as a class of attack that contract developers have to worry about and hence increasing contract programming safety. Note that with the given parameters, the de-facto maximum call stack depth is limited to ~340 (down from ~1024), mitigating the harm caused by any further potential quadratic-complexity DoS attacks that rely on calls.

The gas limit increase is recommended so as to preserve the de-facto transactions-per-second processing capability of the system for average contracts.

nmushegian commented 8 years ago

Very surprising that the fix is more guessing at constants, this is practically equivalent to any other solution that makes the right costs quadratic. Could there not be a market mechanism for gas costs of different opcodes?

vbuterin commented 8 years ago

IMO that would add much more complexity than either of my own proposals. I have thought about market mechanisms for opcode prices, as has Gav, but so far I have not seen anything that I would not consider exploitable. That said, pricing by category is something that I have looked into within narrow ranges, particularly in the context of state size control vs block processing time control.

Gustav-Simonsson commented 8 years ago

version 2 seems like a good first step. With CALL, CALLDELEGATE, CALLCODE at 400 it may be OK to set SLOAD a bit lower, perhaps to 100. I'm guessing it's cost is more important to complex dapps that require more storage loads, and in particular, SLOAD is harder to abuse DoS wise because it already requires the cost of calling a contract - clients can easier optimize storage reads for an already-loaded contract compared to e.g. EXTCODESIZE calls to a ton of (random) contracts.

We should do some benchmarks in both geth and parity to confirm this.

Though wouldn't aim for perfection here - rather encourage multiple gas tuning HFs to continuously approach optimal values. HFs that change gas costs are significantly easier than other type of consensus protocol changes, and also much easier to get community support around.

Right now the most important thing is to tune these codes approx 1 order of magnitude to avoid severe DoS.

Tuning of gas costs reminds me of the tuning Blizzard has done over the years of fighting units (e.g. speed, damage) in Starcraft I and II. Impossible to get 100% right - every tuning causes pro gamers to find so-far-unknown strategies where some units are a bit off balance. But over 20 or so turnings in each game, the end result is extremely balanced.

chfast commented 8 years ago

There is also an EIP https://github.com/ethereum/EIPs/issues/142 to lower SSTORE cost. Worth considering here?

cdetrio commented 8 years ago

The costs have been derived from an updated version of the calculation table used to generate the 1.0 gas costs: https://docs.google.com/spreadsheets/d/15wghZr-Z6sRSMdmRmhls9dVXTOpxKy8Y64oy9MvDZEQ/edit#gid=0 ;

For reference, this appears to be the original 1.0 gas costs table: https://docs.google.com/spreadsheets/d/1m89CVujrQe5LAFJ8-YAUCcNK950dUzMQPMJBxRtGCqs/edit#gid=0

We should do some benchmarks in both geth and parity to confirm this.

The tables above were presumably generated by benchmarks. Outlining the benchmarking methodology would help with measuring costs across different implementations.

vbuterin commented 8 years ago

Only one column of that table was generated by benchmarks: the computation table. The other columns are either estimates (as in the case of Merkle proofs) or common sense (1 byte = 1 byte, etc). In this EIP, it's only the other columns that are an issue.

axic commented 8 years ago

@chfast:

There is also an EIP #142 to lower SSTORE cost. Worth considering here?

It is not about lowering, but semantically separating the two costs it has: processing (CPU) and storage.

chfast commented 8 years ago

@axic, fine. But as I understand it, the side effect of the will be lowered gas cost of storage modification (non-zero value to non-zero value).

wanderer commented 8 years ago

how will backwards compatibility work for old contracts?

CJentzsch commented 8 years ago

Market for contract gasPrices:

I agree that a market for opcodes gas prices doesn't seem to be feasible. But there could be a market for gasPrices of certain contracts. Each miner would benchmark his performance on certain contracts and derives an individual gasPrice for a certain codehash. The benchmarking itself is a overhead cost which should be covered by a relatively high gasPrice for CREATE (opcode and transactions which create a contract).

The good thing with this solution is, that no protocol change would be needed, it's only an optimization for the miners. Although I still think an improvement on pricing of opcodes should be done asap.

A problem is the use of CALL's, where the code executed can depend on input parameters or logic in the contract. This can be mitigated by adding relatively high fixed number to the gasPrice for dynamic CALLs where the recipient can only be determined by actually running the code and use real benchmarking for static CALLs, where the recipient can be known by static analysis of the bytecode. This can be weighted by the amount of gas transferred in the CALL.

vbuterin commented 8 years ago

how will backwards compatibility work for old contracts?

EIP 90 ensures calls won't get broken; otherwise, this is indeed a compatibility-breaking change, and that's how it must be: if old contracts were somehow grandfathered in, then they could be used for ongoing DoS attacks.

obscuren commented 8 years ago

:+1: for a gas-price-only HF.

alexvandesande commented 8 years ago

@CJentzsch couldn't price per codehash be used for miners as a way to censor some contract categories? I fear it could open up some DDoS possibilities like the soft fork attempt did.

CJentzsch commented 8 years ago

@alexvandesande Miners could censor anything they want currently in there blocks, but yes, this would make it even easier for them. This suggestion is not a protocol update, and the DDoS possibilities for the soft fork were different. Since there miners did run code, without including the transaction and therefore they would not get any payment. My suggestion would let miners run a benchmark on contracts after they got created in a valid transaction. For transactions which create a contract they could set a higher gasPrice in order to pay for the extra computation which comes from benchmarking them. The benchmarking would be very limited, since dynamic calls, some execution branches and other scenarios are hard or impossible to benchmark in advance. For all those contracts one would use the standard (max) gasPrice, but for others (such as transfer of tokens and other highly used code executions) one could exactly know how long it takes to process them and charge accordingly. This would lead to a fairer price model and most likely reduce the costs for often used code executions. But having a correct pricing of opcodes is of course much better, but could be done in parallel.

vbuterin commented 8 years ago

Market-based gas cost schemes are cool, but I personally consider them too complex for a hard fork that should be scheduled to happen quickly so as to resolve present issues.

Arachnid commented 8 years ago

Version 2, I think, overestimates the cost of contiguous reads, and underestimates the cost of seeks. Version 1b seems like a good idea to me, and I very much like the manner in which it can limit call depth without resorting to a fixed recursion limit.

Regarding backwards compatibility, it's important to note that Solidity does not currently cache storage reads in contract memory or on the stack - presumably because they're so cheap. There will be a lot of contracts out there that are suddenly a lot more expensive because of this.

CJentzsch commented 8 years ago

@vbuterin I agree, my market-based gas costs suggestion does not need a protocol change and therefore does not need to be implemented with this hard fork.

But I would suggest to increase all gas costs by a multiplier between 100-10000. This EIP is breaking backwards-compatibility since it increases gas costs. There will most likely be future price optimizations. By increasing the gas costs for opcodes by several magnitudes, the next price optimization only need to reduce prices instead of increasing them and therefore would not destroy backwards compatibility. And since only relative differences matter and not absolute numbers, I think it should not be a problem.

koeppelmann commented 8 years ago

If we can't figure out a market based solution why not use the same voting mechanism we use for the block gas limit. Every miner can (optionally) vote gas costs for each opcode up or down in the range of +/- (1/1024) (rounded up the full integer obv.).

I guess we all agree that hard coding the costs is always a guessing game. There is no way to know today wether in x years storage, bandwidth or computation will be the bottleneck.

Arachnid commented 8 years ago

Every miner can (optionally) vote gas costs for each opcode up or down in the range of +/- (1/1024) (rounded up the full integer obv.).

I'm concerned that that would facilitate new types of attack, particularly if most miners aren't voting. And it would certainly make optimising compilers a lot more complicated.

I guess we all agree that hard coding the costs is always a guessing game. There is no way to know today wether in x years storage, bandwidth or computation will be the bottleneck.

That doesn't make it a guessing game; it just means that it needs periodic revision.

chfast commented 8 years ago

For 1b I would use the formula g - (g // 64) instead of (g * 63) // 64 to avoid overflows in implementations. More on that in #114.

jwest411 commented 8 years ago

rough idea here but have y'all thought about a market based idea where you set each value to vitaliks values as starting values then have gas prices change relative to how much opcode has been called vs historical avr?

Way this could work is prices change based on how often an op code is called per ____ duration / a "normal" threshold. so if the network calls something 100x suicides a minute, over the past yr before the dosses we see 60x per minute, price of suicide = vitaliks value * 100/60 * some constant we set. So that way If codes are used less they go down in price too.

We can tweak this however you want. You can build a curve of rate of change use of opcodes, and have the "expected" amount be something you derive based on the rate of change use until now.

jbaylina commented 8 years ago

It would be good to penalty only the first time that you access an external contract but then low the gas costs in the subsequent calls. This would be more adapted to the cache behaviour of clients, and programs that use intensively a library or a related contract will not be affected so much.

etscrivner commented 8 years ago

Have we considered a mitigation strategy for possible replay attacks against Exchanges?

Smithgift commented 8 years ago

@etscrivner: It is highly unlikely that this fork would cause such issues.

A replay attack is only meaningful if there is a user who is on one chain who does not wish to have his transaction copied on another chain. If the other chain is dead, then there's no need to worry about replay attacks.

This hardfork would create two chains, one of which (the old one) will be DoSable, will likely have zero first-party client support, and will very likely die immediately. Unless there is an organized effort to sustain as a separate blockchain, exchanges can simply upgrade their clients and move on.

etscrivner commented 8 years ago

@Smithgift: Unfortunately, I don't think armchair reasoning about incentives has really been an accurate predictor of reality up to this point. So you'll have to excuse me if I don't immediately buy that. As a simple alternative, I'd propose that rather than continue relying on a failing strategy, we instead make transactions in this HF backwards incompatible to avoid replays? I'm open to alternatives, but this is one relatively simple way to fix the issue.

Refusing to entertain the possibility of the old chain surviving is just negligent and puts exchanges and individual funds at risk.

aztriker commented 8 years ago

food for thought -- Survival of intent. The intent of an attack may never be truly known. If the network can be defined as to permitted component behaviour - based on underlying component command level, individual network component level responsibilities - then one might develop a offense and defense with code appropriate to the need. All network levels cascade against attacks that are not properly dealt with by the assigned code level. irrespective of intent

pdaian commented 8 years ago

I actually agree with @etscrivner. This is clearly a motivated attacker seeking widespread system damage. If they can convince any two exchanges to run different fork versions (1b vs 1c, or both on the same exchange, or similar) then fork each to their own chains through a suicide, the result will be messy.

Imagine someone like Coinbase adopting support for 1b, then getting 1c tokens stolen from them which then plummet to 0. The attacker is clearly well resourced and savvy; if they manage to prop up 1c they can then claim Coinbase's liability for 1c losses.

This is either going to require perfect communication with large stakeholders or should include replay protection.

I wouldn't lean on the change being non-controversial because the attacker has a clear incentive and quite possibly a significant ability to create controversy.

Arachnid commented 8 years ago

@pdaian Nobody will be releasing clients that support different versions of the hard fork. The choice for users will be between running an updated client or not.

pdaian commented 8 years ago

@Arachnid that was not my understanding from reading https://github.com/ethereum/go-ethereum/pull/3111 and https://github.com/ethcore/parity/pull/2591. If my understanding is out of date I apologize. Edit just read the code again, it seems like both implement 1c. My bad, plz ignore :).

Although I do think for the future the best approach is to offer all plausibly optimal forks with well-versioned replay protection and let the market decide, offering only one option at a time seems fine until such the supporting infrastructure exists to do that smoothly.

Smithgift commented 8 years ago

I admit I did believe that the Classic chain would be irrelevant, and it was not. I continue to find it unlikely that there will be a surviving alternate chain from this fork.

Suppose we consider backwards incompatibility anyway.

Speaking in complexity terms, setting gas costs to different numbers is trivial. Changing gas call semantics is less trivial. Making transactions backward-incompatible is non-trivial. If you head over to the Classic community, where they are actively working on replay prevention, you'll see that there have been multiple proposals discussed. As far as I am aware, they have yet to decide upon and implement one. And this is for a good reason: it's a very large change and shouldn't be rushed.

This is, ultimately, a quick fix. If we are going to wait however many weeks for a mutually acceptable replay prevention system--a thing I am indeed in favor of--we are going to be waiting those weeks with the DoS vulnerabilities. I don't think this is the time for that kind of change, when the fix is needed now.

pdaian commented 8 years ago

@Smithgift worth noting that if the ETC project had to implement a change like the one on this PR, it would also take them weeks. Replay protection is generally simpler than gas mechanics, and this is a much bigger and more efficient dev team than ETC has.

Also, there is no need to make transactions not backwards-compatible; simply preventing transactions issued by the new clients from being cross-compatible would be enough if multiple forks are offered. If only one fork is offered I agree that it is unnecessary due to low probability of old chain survival.

Should be on the agenda anyway for future fork preparedness.

Smithgift commented 8 years ago

My apologies for any edginess in my previous post. This is a stressful time, and I'm stressed for other reasons as well.

@pdaian: I'm not sure how replay protection is simpler. Gas cost changing is just changing a number. The simplest replay prevention (Morden vs. live) involves using a different set of nonces. To do that here would involve changing the nonce of every account--a rather blunt method. Alternately, you could change part of the signing algorithm, which IIRC is one of the Classic proposals. This EIP has a long discussion of a particularly elegant method. @vbuterin suggests another one here.

I agree that this is a very desirable feature--and critical for the next massive hardfork.

pdaian commented 8 years ago

@Smithgift it depends on the proposal but I'm talking about simpler from an implementation perspective, once the design is fixed. For example @vbuterin's proposal would (in my estimation) not require an 1800 LoC sized diff touching 24 files or a 450 LoC diff touching 22 for its minimal consensus implementation (you could do cooler things like fork detection that certainly would though) (and yes, I understand this is not a minimal change; my point was that implementing a minimal anti-replay change on top of an 1800 LoC diff doesn't double the work required or more).

I do like EIP134 a lot. Unclear to me how well it would work across transitions and it's less explicit, but it is change and forget so that's nice. I wasn't aware of so many disparate proposals so perhaps the complexity comes with choosing one.

Smithgift commented 8 years ago

@pdaian: Fair enough. Complexity in choosing was what I referred to. Gas prices can be Starcraft-balanced to a better number if need be, but choosing a replay protection system is more permanent. (Though, if the Classic community chooses one, I would strongly recommend choosing the same system.)

We must also consider that any tool that deals with transaction signing, even if it never touches consensus, will potentially be affected by a replay protection change. A backwards-compatible scheme being a partial exception (the tools will still need to eventually switch to the new format.)

axic commented 8 years ago

@chfast

But as I understand it, the side effect of the will be lowered gas cost of storage modification (non-zero value to non-zero value

That was only an oversight in the description, there's no intention to make it cheaper (and I think this discussion doesn't belong here :smiley:)

rebroad commented 8 years ago

All these "magic" numbers seems a bit "trial and error"... Is there not a way to let the numbers be determined organically in some way?

gcolvin commented 8 years ago

@rebroad https://github.com/ethereum/EIPs/issues/157

chfast commented 7 years ago

Can someone make a properly formated EIP out of it and put it in the repo?

axic commented 7 years ago

Can someone shed the light what was the final revision used in the HF?

Is it 1c above (including everything form 1b and 1) at block 2463000?

cdetrio commented 7 years ago

This EIP is now located at https://github.com/ethereum/EIPs/blob/master/EIPS/eip-150.md. Please go there for the correct specification. The text in this issue may be incorrect or outdated, and is not maintained.