atan in difficulty algorithm is potentially non-deterministic and has undesirable difficulty effects

jagerman commented 6 years ago

The v9 fork at block 68000 added an atan adjustment atan to try to bring the blockchain difficulty down faster, to try to solve what ended up being a completely separate issue stemming from Monero code (#116 / #118 - though the precise issue and fix were not known at the time of the v9 fork).

The added atan adjustment, however, has some major issues and should be removed.

The first issue is that IEEE floating point operations only requires precisely defined results for +, -, *, /, and sqrt. (Although C++ doesn't even technically require IEEE math, it's probably safe enough to assume to be present anywhere a graft node will run). atan, however, isn't required the have the same value under all libraries/compilers/compiler versions/compilation flags. This is a potential bomb waiting to go off. It's very rare, but not impossible: and potentially devastating if it ever occurs.

The main problem is that it is possible for different c++ runtimes to produce slightly different values for the the result of an atan. Suppose, for example, that MSVC's C++ runtime and Linux's stdlibc++ libraries produced slightly different values in the atan call for some particular input value, and that those values just exactly put the RHS of the following calculation on either side of an integer boundary on the two systems (so that the integer target variable here ends up being some value x on one system and x+1 on the other):

    target = adjust * (((length + 1) / 2) * target_seconds);

If this ever happens the result is a permanent chain split that is inherently unresolvable: neither side of such a split would ever accept the chain accepted by the other side because the other side is offering blocks with an invalid difficulty. Even if, say, all Windows users deleted the block and resynced they would never accept the chain being offered by linux systems if such a deviation ever manifests, and vice versa. (The only potential ways I could see to clean it up would be to issue a new release that hard codes the difficulty for the specific block height where the deviation occurred).

This code is dangerous and needs to be eliminated or, at the very least, replaced with a value carefully constructed with primitive IEEE safe operations.

Before worrying about that, however, there is a deeper issue here (first brought up by @zawy12 in #119): the atan is serving little practical purpose while inducing considerable noise into the network difficulty. Let me explain by starting with the implementation:

    double derivative = 0;
    if (length >= 4 && timestamps[length - 1] - timestamps[length - 3] > 0) {
      double d_last = 1.0 * (cumulative_difficulties[length - 1] - cumulative_difficulties[length - 2]);
      double d_prev = 1.0 * (cumulative_difficulties[length - 3] - cumulative_difficulties[length - 4]);
      double h = 1.0 * (timestamps[length - 1] - timestamps[0]) / timestamps.size();
      if (h > 0) {
        derivative = (d_last - d_prev) / h;
      }
    }
    // adjust = 0.99 for N=60, leaving the + 1 for now as it's not affecting N
    double adjust = 0.9909;
    if (derivative < 0) {
      adjust *= 1 + std::atan(derivative) / (10 * M_PI);
    }

For notation, let me call B the most recent block, B-1 the block before that, etc. So this code calculates derivative as the difficulty of block B minus the difficulty of block B-2 (yes, really: this looks like a bug—the blog post description of the algorithm refers to the change in difficulty from one block to the next), divided by the mean block time over the last DIFFICULTY_BLOCKS_COUNT_V8 blocks.

To put that value into context, let's suppose that the network is perfectly stable with just some ordinary random perturbations that induce a tiny 0.01% difficulty decrease between this block and B+2 (which should really be B+1; see above). That tiny shift means difficulty decreased by 290618.

So we end up with derivative = (2906174509 - 2905883891) / h, with h = 5993.0/60 = 99.883. Thus derivative = 2909.5745.

This is a fairly pointless number to stick into an atan: it gives a value equal to 99.98% of the maximum value that atan can ever return (i.e. π/2). And this was just for a miniscule downward fluctuation in the hashrate. In fact, with 120 second average blocks, just to get this value down below 90% of its π/2 asymptotic value would require diff to drop by less than 1000 (which is for all intents and purposes is nothing since network diff is around 3 billion). So really this whole line:

    adjust *= 1 + std::atan(derivative) / (10 * M_PI);

is, for all but the most miniscule drops in difficulty, essentially the same as the far simpler expression:

    adjust *= 1.05;

except that the latter doesn't allow for a potential chain split.

But a bigger issue: as zawy pointed in in #119, adjust *= 1.05; just seems wrong: the difficulty is going to be adjusted downward by nearly 5% whenever it drops between two blocks by .00001%. That adds a substantial amount of noise to the network hashrate. Tiny downward fluctuations (which even in a perfectly flat actual hashrate will happen 50% of the time) drive difficulty down by 5%, which introduces substantial noise into the difficulty (see @zawy12's charts here: http://wordsgalore.com/diff/index2.html — graft's difficulty is consistently more noisy than other coins with a comparable network hashrate).

But there's still another issue here: not only are we getting extra noise, that noise is asymmetric (in stats terminology, it does not have a mean of 0): the adjustment is only ever applied to diff drops by never to diff increases. What this means is that it actually makes graft mining emissions consistently around 2% too fast. Here are the elapsed times for 720 block emissions over the last 21600 (= 30 * 720) blocks:

112728 -- 113448: elapsed time = 84834s = 98.19% of 86400 target
112008 -- 112728: elapsed time = 84196s = 97.45% of 86400 target
111288 -- 112008: elapsed time = 85310s = 98.74% of 86400 target
110568 -- 111288: elapsed time = 83912s = 97.12% of 86400 target
109848 -- 110568: elapsed time = 85956s = 99.49% of 86400 target
109128 -- 109848: elapsed time = 85095s = 98.49% of 86400 target
108408 -- 109128: elapsed time = 83844s = 97.04% of 86400 target
107688 -- 108408: elapsed time = 84467s = 97.76% of 86400 target
106968 -- 107688: elapsed time = 85635s = 99.11% of 86400 target
106248 -- 106968: elapsed time = 84299s = 97.57% of 86400 target
105528 -- 106248: elapsed time = 83698s = 96.87% of 86400 target
104808 -- 105528: elapsed time = 84547s = 97.86% of 86400 target
104088 -- 104808: elapsed time = 84570s = 97.88% of 86400 target
103368 -- 104088: elapsed time = 85514s = 98.97% of 86400 target
102648 -- 103368: elapsed time = 83921s = 97.13% of 86400 target
101928 -- 102648: elapsed time = 83186s = 96.28% of 86400 target
101208 -- 101928: elapsed time = 87332s = 101.08% of 86400 target
100488 -- 101208: elapsed time = 85953s = 99.48% of 86400 target
99768 -- 100488: elapsed time = 83647s = 96.81% of 86400 target
99048 -- 99768: elapsed time = 85028s = 98.41% of 86400 target
98328 -- 99048: elapsed time = 84785s = 98.13% of 86400 target
97608 -- 98328: elapsed time = 82901s = 95.95% of 86400 target
96888 -- 97608: elapsed time = 85375s = 98.81% of 86400 target
96168 -- 96888: elapsed time = 85975s = 99.51% of 86400 target
95448 -- 96168: elapsed time = 83778s = 96.97% of 86400 target
94728 -- 95448: elapsed time = 84923s = 98.29% of 86400 target
94008 -- 94728: elapsed time = 85096s = 98.49% of 86400 target
93288 -- 94008: elapsed time = 84059s = 97.29% of 86400 target
92568 -- 93288: elapsed time = 85821s = 99.33% of 86400 target
91848 -- 92568: elapsed time = 84251s = 97.51% of 86400 target

and so on (full list at https://jagerman.com/graft-emission-rate.txt; source code to generate it at https://jagerman.com/graft-emission-rate.py). From block 68000 (i.e. the v9 fork) to 113448 (the current top block as of this writing) the mean block time is 118.02 seconds -- i.e. 1.7% faster than the target 120. This isn't caused by a few outliers, but is (as you can see above) consistent day after day.

(Edit: this has gotten a bit worse since I originally wrote it; the average is now more like 2.5% too fast; the emission link above has been updated with results up to block 167788).

This is happening due to the asymmetric nature of the atan adjustment: it is only applied on diff falling, not diff rising. From a theoretical point of view, the bias here is not unexpected: imagine actual network hashrate being a constant value with fluctuations in the difficulty calculation due to the randomness of finding blocks. Every time the fluctuation happens to be negative (because too few blocks were found recently due to bad luck), diff drops by roughly 5%. The diff algorithm then typically will find blocks being generated too fast and so, over a few blocks, adjusts difficulty back up to the true network hashrate. But now if you calculate the mean "network hashrate" (i.e. computed as difficulty * 120) you would find that it is too low: there is a downward effect applied to some blocks without any corresponding symmetric upward effect applied to others. That is exactly what we see happening, and so we get a little under 2% too many blocks found per day.

Now if you take that blockchain behaviour and add NiceHash, the fluctuations become a little worse: NiceHash bots jump in when diff is low, mine at artifically high profitability for a few blocks, then the diff is much too low and has to jump up to compensate. But when it comes back down again, the 5% adjustment is going to bring it down too fast, thus inducing the same cycle again. While NiceHash bots would do this sort of thing anyway, this 5% makes that drop faster than it needs to be and opens up larger opportunistic NiceHash windows than would occur without it.

TL;DR: the atan term added in the v9 diff adjustment makes difficulty fluctuate too much, makes the mining emission rate about 2% too high, and induces more opportunistic NiceHashing.

Since the v9 change, at least as currently implemented, isn't buying the network anything except for encouraging more NiceHashing and less stable network difficulty, coupled with the fact that there is a theoretical (if very low probability) catastrophic failure, I'd like to suggest that graft ditches the atan (and the 1.05 idea) and instead, at the next fork, goes back to the v8 difficulty algorithm with just the FTL fix applied.

Or, if you really, really want to keep it for some reason, at least replace it with a deterministic atan approximation, correct the domain of values passed into the atan, and make it symmetric.

sebseb7 commented 6 years ago

" has some major issues and should be removed"

maybe explain what the implications for the coin are. Fluctuating hashrate is a "problem" not visible at the market.

jagerman commented 6 years ago

I've created a histogram of block solve times since block 100000 here, also showing the ideal times (i.e. what we'd expect under a perfectly stable actual network hashrate and ideal difficulty):

You can see the full interactive histogram here:

https://jagerman.com/graft-solvetime-hist-100000_167788.html

(the source code to generate it is at: https://jagerman.com/graft-dist.py)

Of particular note is the blue area (i.e. actual solve times) being above the ideal up until 60s and beyond about 400s:

The take-away message: we have too many short blocks (under a minute) and long blocks (over 7 minutes), and far too many very long blocks (over 15 minutes).

In particular:

Blocks <= 0.2min: 8263 (12.2%); expected: 6451.0 (9.52%) Blocks <= 0.5min: 19604 (28.9%); expected: 14994.9 (22.1%) Blocks <= 1min: 32403 (47.8%); expected: 26672.9 (39.3%) Blocks <= 2min: 46306 (68.3%); expected: 42850.8 (63.2%) Blocks >= 3min: 14073 (20.8%); expected: 15125.8 (22.3%) Blocks >= 4min: 9446 (13.9%); expected: 9174.2 (13.5%) Blocks >= 5min: 6415 (9.46%); expected: 5564.5 (8.21%) Blocks >= 6min: 4506 (6.65%); expected: 3375.0 (4.98%) Blocks >= 8min: 2200 (3.25%); expected: 1241.6 (1.83%) Blocks >= 10min: 1119 (1.65%); expected: 456.8 (0.674%) Blocks >= 15min: 212 (0.313%); expected: 37.5 (0.0553%)

i.e. about twice as many blocks as we should see over 8 minutes and more than 5 times as many as we should see over 15 minutes.

I should point out that not all of this is due to the v9 difficulty adjustment: we'll still have opportunistic miners (e.g. NiceHash or MoneroOcean) even without it, but the adjustment is making the problem worse.

zawy12 commented 6 years ago

Are you sure the atan is not the sole cause of the deviations? (in reference to a comment you made in the other issue, indicating removing the atan would not completely fix it) I'm going to send you the same number of blocks from Masari, a smaller coin that should be less stable, and let's see how it's distribution fairs.

zawy12 commented 6 years ago

Is there an equation for your smooth line? This might enable a good metric for measuring how bad hash attacks are, or general algorithm performance. Maybe do a root-mean-square of the difference in the bins.

Below shows Masari's results with LWMA on the last 50,000 blocks (since their last fork) which shows much better results. Before this fork, their results were similar to the Graft's. The difference is that they implemented a new POW to stop bad attacks.

jagerman commented 6 years ago

Are you sure the atan is not the sole cause of the deviations?

I doubt it is: there's still the possibility of doing the coin hopping without it. But it requires being able to bring a considerable amount of hash power (i.e. NiceHash) online and offline quickly. Without the atan the size of drops should be reduced, reducing the incentive to carry it out, which in turn will feed back into reducing the size of the fluctuations.

Masari has been on CN-fast since their last fork which currently isn't nicehashable, but I have been hearing rumours that CN-fast is coming to NiceHash soon, so that may destabilize Masari.

Is there an equation for your smooth line?

It's the probability density function of an exponential distribution, i.e. λ exp(−λx), scaled by 5*N to match the histogram (where 5 is the bin width of the histogram and N is the number of blocks). λ equals 1/T, where T is the target time (i.e. 120 for graft).

zawy12 commented 6 years ago

OK, thanks

jagerman commented 6 years ago

I've put together some DA simulation results comparing a standard LWMA to an LWMA with the atan adjustment. These 1000-block simulations includes an attack where a larger miner (with 3x the usual network hashrate) jump on any time diff drops to 80% of the regular network hashrate, and then mines until difficulty hits 120% of the network hashrate, then leaves. The blue line corresponds to the actual hashrate, the orange line traces the difficulty value generated by the difficulty algorithm:

https://jagerman.com/graft-da/diff-plot-lwma-N60_A-dip20_seed-5-log.html https://jagerman.com/graft-da/diff-plot-graftv9-N60_A-dip20_seed-5-log.html

There are 9 more simulations (using different random seeds) here: https://jagerman.com/graft-da

The difference is particularly apparent if you load two simulations with the same seed in two tabs and switch back and forth between them:

attacks are more frequent; this is because the adjustment is more likely to push diff down enough to trigger an attack.
ignoring attacks, difficulty is considerably less stable from block to block — this shows up in the graftv9 difficulty being considerably more "jagged" as it moves along. (Basically this happens because it kicks down the difficulty whenever it drops just a little, then in the next block LWMA has to correct it by pulling it back up).
again ignoring the attack portions, there is small but apparent downward shift of the difficulty line. The plain LWMA version is basically at or above the LWMA+atan version almost everywhere, and almost never below it. This is what is responsible for missing the block target times: LWMA is an unbiased network hashrate estimator, so anything (like LWMA+atan) that consistent comes in below it on average will be biased downward.

graft-project / GraftNetwork

atan in difficulty algorithm is potentially non-deterministic and has undesirable difficulty effects #133