zawy12 / difficulty-algorithms

See the Issues for difficulty algorithms
MIT License
107 stars 25 forks source link

Comparing ETH's simple DA to Digishield and EMA #46

Open zawy12 opened 4 years ago

zawy12 commented 4 years ago

Takeaway: ETH's new DA for Homestead works exactly like a good EMA despite only adjusting in tranches, enabling its use in situations where difficulty (timestamp) grinding may be a problem. It allows 75% more blocks than normal if there is a selfish mining attack. A safer version of it can prevent this. After inverting the ETH algorithm to prevent a problem described below, the similarity between EMA, ETH, and Digishield can be seen as follows. Where Digishield's N is a lot smaller because avgST is a lot more stable than ST.

// Similarity between algorithms
// Floating point math except for int().
// ST = solvetime for previous block
// T = target solvetime
// N = factor to slow down speed of response.

EMA_Target = prevTarget * ( 1 + ST/T/N - 1/N )
ETH_Target = prevTarget * ( 1 + int(ST/T/ln(2) )/N - 1/N )  //  ETH's Homestead after fixing it. 
Digishield_Target = avgTarget * ( 1 + avgST/T/N - 1/N )

ETH's actual code and explanation are here and here. I use floating point here for clarity.

ETH's original DA in the Frontier release used the simplest possible DA that says "if solvetime was fast, increase difficulty by a small percent, else if it was slow decrease it by same amount."

// ETH's Frontier DA
if (solvetime < 13) { D = priorD*(1.0005); }
else { priorD*(0.9995); }

This simulates more complicated algorithms such as SMA's (including DGW) and Digishield but the small correction factor means it's a lot slower than most algorithms. The newer ETH algorithm is a "discrete" simplified EMA, which they seem to have come across by intuition and experiment. BTW, ETH's and EMA can't be used in coins like Cryptonote clones which simply assign the completion time of the previous block as the timestamp of the current block. It results in substantial oscillations.

ETH's Frontier DA does not result in a 13-second avg solvetime because half of a Poisson distribution's solvetimes are < ln(2) = 0.693 of the target solvetime. So the above results in an average 13/0.693 = 18 second solvetime.

A problem occurred with miners assigning only +1 second timestamps so that if there was competition with a simultaneous solve (15-second solvetimes will have a lot of "collisions" in solves due to propagation delays) the miner wants his block to win by having a very slightly higher chain work by claiming his solvetime was a lot faster than it was. This did not change the median solvetime, but the mean solvetime could increase a lot especially if >51% of miners did it. To quote devs for the new DA:

The difficulty adjustment change conclusively solves a problem that the Ethereum protocol saw two months ago where an excessive number of miners were mining blocks that contain a timestamp equal to parent_timestamp + 1; this skewed the block time distribution, and so the current block time algorithm, which targets a median of 13 seconds, continued to target the same median but the mean started increasing. If 51% of miners had started mining blocks in this way, the mean would have increased to infinity. The proposed new formula is roughly based on targeting the mean; one can prove that with the formula in use, an average block time longer than 24 seconds is mathematically impossible in the long term.

The new DA does not stop them from assigning +1 timestamps, and doing so still has the effect the miners desire, but now they can assign anything <10 seconds to have the same effect so they do not need to skew their solve time as much. It only fixes the mean solvetime. However, their fix allows the opposite potential harm of faster average solvetimes as the bottom "attack" section shows. The new DA is

// ETH's Homestead DA
// ST = solvetime as reported by miner
D = priorD * ( 1 + 0.0005*(1-int(ST/10) )

// General form:  
// T= target solvetime
// N = "window size" (aka slowness aka  stability factor)
D = priorD * ( 1 + 1/N*(1-int(ST/T/ln(2)) )

// to see this more clearly:

if (ST < 10) { factor = 1.0005 }  // ST too fast, increase D.
else if (ST < 20) { factor = 1 }  //  ST about right, do nothing.
else { factor = 1-0.0005*int(ST/10) }  // ST too slow, decrease in proportion to slowness.
D = priorD*factor

This method with constant hashrate has an theoretical (experimental with constant hashrate) average solvetime just below 15 seconds.

Making ETH's Frontier as fast as Digishield

Digishield's math "dilutes" the effect of its solvetimes by 4, so its window of 17 blocks acts like an improved SMA with N = 17*4 = 68. Digishield responds quickly due to the 17 but the response is moderated by the dilution factor of 4. For a given speed of response to typical hashrate changes < 10x, it is smoother than SMA, which makes it overall a lot better, but for sustained changes in hashrate it is a lot slower, taking 500 blocks to reach the correct difficulty after genesis. Since ETH changes per block, 1/68=1.5% indicates ETH's simple method can simulate Digishield with 1.5% per block changes. Experimentally I see 2% looks very much like Digishield.

Digishield is this:

// Digishield
// Ignoring the fact that the solvetimes are shifted MTP (~6) blocks in the past.
// T = target solvetime
// ST = solvetime

nextTarget = avg(17 prior Targets) * ( 0.75*T + 0.25*avg(17 prior STs) ) / T

// In terms of difficulty the following can be used, but to be exact 
// use the harmonic mean of the prior Ds instead of average.

D = avg(17 prior D) * T / ( 0.75*T + 0.25*avg(17 prior STs) )

// Notice that rearranging & using the approximation 1/(1+x) =~ 1-x for small x
// can show a similarity between Digishield and ETH.  x isn't very small for Digishield here,
// but it's not going to be > 0.5 very often. In testing under various conditions, I could not
// determine which of these forms of Digishield was better.  Note that avgST is a lot more
// stable than ST, so the 25% adjustment factor can be a lot bigger.

D = avgD / (1 + [-0.25 + 0.25*avgST/T] )
// apply 1/(1+x) =~ 1-x approximation
D = avgD * (1 + 0.25*[1 - avgST/T])    // "inverted Digishield"

ETHs Frontier algorithm can do a good job of simulating Digishield by changing 2% per block instead of 0.05%

// ETH's Frontier with 2% instead of 0.05% jumps per block

D = priorD * (1.02 if ST< T/ln(2) else 0.98)

// or in terms of target: 
nextTarget = priorTarget * ( 0.98 if ST< T/ln(2) else 1.02)

Comparing ETH's Frontier to Digishield

The above two difficulty algorithms look so identical when hashrate is constant, I'm not even including the plots. Both have about 0.11 Std Dev (trying to get the same value is how I chose 2%). The charts below compare them with 1.6x hashrate attacks that start when difficulty falls below average and stop when it is 10% above average. The "delay"and "blocks stolen" metrics appear worse for this modified ETH Frontier DA (aka 2% change per block algorithm), but some of it is because it is being subjected to more attacks as a result of getting back to the correct difficulty faster. Also, the average solvetime is 3% higher than target under this miner-motivation scenario while Digishield will remain accurate under all conditions. All algorithms can appear better to the degree the solvetime is allowed to be compromised. In short, Digishield is definitely better, but a casual view of the charts can't see much of a difference. These charts are 1200 blocks.

2% jump refers to ETH's Frontier algorithm with a much faster "response speed constant".

image

Comparing ETH's Homestead to Digishield

The above 2% Frontier algorithm can be modified to give a fast version of ETH's Homestead. I'm not including the max(x, -99) rule because it will not occur in normal circumstances. Homestead gives higher weight to longer blocks, enabling it to drop faster after long delays. This would seem to make the average solvetime faster, but the average solvetime stays accurate (like Frontier) because solvetimes (STs) from 0.693T to 20.693*T do not decrease difficulty as they do in Frontier. It's surprising this simple method still gives a pretty accurate target solvetime T.

// modified 2% jump DA 
// This is a fast version of ETH's Homestead DA.

D = priorD * ( 1 + 0.02*(1-int(ST/T/ln(2)) )

It looks more unstable than Digishield with constant hashrate as shown below, but the Std Dev in difficulty is the same. The small error in average solvetime is an offset error that can be easily fixed by a slightly higher T.

mod 2% jump refers to ETH's Homestead algorithm with a much faster "response speed constant".

image

It has the same response speed to increases in hashrate as the previous 2% jump algorithm, but it falls a lot quicker after attacks. In one sense this might make dedicated miners happier, but it can result in more oscillations if the attacker keeps coming back ASAP as the chart below shows. Homestead looks worse than Frontier, but this is all because of the method of testing. A new attack begins a lot more quickly because the difficulty dropped. Notice that despite more frequent attacks the metrics are the same. And the avg solvetime is low, which means it can be modified to make the attacks less frequent. So the Homestead algorithm is better than Digishield. My next example will make it more apparent.

image

Comparing ETH's Homestead to EMA

I have long been a fan of the EMA difficulty algorithm which I explore in issue 17

It turns out that ETH's new DA is exactly a "discretized" version of the simple EMA. The EMA is a modified version of the exponential moving average used in stocks. The simple EMA uses the approximation e^x =~ 1+x for small x (as the link to issue 17 above details).

// Simplified EMA  (JE's EMA)
// A continuous version of ETH's Homestead DA (this is floating point math)
D = priorD * (1 + 1/N*(1 - ST/T)

To get DAs on equal terms when comparing their speed of response to hashrate increases, I first adjust their main parameter ("window size" in most DAs) to have the same Std Dev stability to a constant hashrate. Doing this to make the simple EMA like the 2% method above, I needed N=35 for the EMA. This results in:

// Simplified EMA to compete with Digishield and mod 2% jump (continuous ETH)
// N = 35
D = priorD * (1 + 0.0285*(1 - ST/T)

To my surprise, the simple EMA is indistinguishable from ETH's homestead. The following is a more typical attack scenario seen in Zcash clones using Digishield than the example above, this time using a 3x attack starting when D is a little below average and stopping when it has risen 30%.

image

Timestamp Attacks on ETH

The ETH and mod 2% DA shown above should not be used. They allow for the possibility of negative difficulties (e.g. if ST is a very long 36xT in my 2% version of Homestead). If that is patched as in ETH with a max(-99), a private mining attack can still get an excess of blocks, above and beyond a normal selfish mining attack. ETH's is strongly protected with the small 1/2048 =~ 0.05% change per block and max(-99), but an attacker with > 50% can still do a selfish mine and get 75% more blocks than the public chain. He can do this by assigning timestamps 1000 seconds apart for 60 blocks then assigning +1 timestamps for the next 7720 blocks and then send the chain to the public network. Attacker gets 7780 blocks while the public chain gets 4450 if attacker has only 51% (a hashrate equal to the public chain). If his profit margin is normally 50%, he makes 7780/(4450*0.50*0.51) = 7x more profit from this attack.

For the same reason, if 66% of ETH's public miners assign +1 timestamps, solvetime will reduce to 11.5 seconds, causing too many blocks per time to be issued.

Most coins need a faster response to hashrate (2% change per block instead of 0.05%) which can be more easily attacked with greater damage than the above because max(-99) can't be used. The smaller the 99/2048 ratio, the less the attack can profit. But with N=35 instead of 2048, that would block the 1.75xT solvetimes.

The best way to stop this attack is to use the simplified TH-EMA, or the accurate JE-EMA. I've shown how they can be discretized.

// Simplified TH-ema
// floating point for math clarity
D = priorD * N / (N+t/T-1) 

// Inverted form of ETHs Homestead to protect against attacks & negatives
// Same as the Simplified TH-EMA above, but "discretized"
D = priorD * N / (N+ int(t/T) - 1)

//  *********
// Same as above with integer math. This is the algorithm people 
// should use if they want ETH's Homestead without the problems.
// The more complicated version at bottom might be more accurate.
D = (priorD*N*1000/( N+ int(t/T) - 1) ) /1000
// **********

// Accurate JE-ema (floating point for clarity)
k = t/T/N
D = priorD * (1+1/N+k*(k/2-1-1/2/N))

// ETH's Homestead non-inverted & more precise to prevent
// negatives and attacks. This is floating math for clarity.
k = int(t/T)/N 
D = (same as above)

// Same as above, but in integer math:
// N<100 for <1% error
// D might overflow in some coins
k = 1E4*int(t/T)/N;
D = (priorD * (1E8+1E8/N+k*(k/2-1E4-1E4/2/N)) ) / 1E8