tradecraftio / tradecraft

Tradecraft integration/staging tree https://tradecraft.io/download

Investigate replacing difficulty adjustment algorithm #3

Closed maaku closed 4 years ago

maaku commented 5 years ago

Bitcoin's difficulty adjustment algorithm uses a simple average of solve times over a two-week window to estimate network hash rate. It has been demonstrated that this only works well for the dominant coin of a proof-of-work function, and leads to instability when there is fierce competition with another similarly configured but not merge-mine-compatible system (see: Bitcoin and Bitcoin Cash, initially; Bitcoin and Freicoin since 2013).

Freicoin hard-forked to a different difficulty adjustment algorithm in March 2013 in order to partially resolve this issue. Instead of a two-week average, Freicoin switched to a 144-tap Parks–McClellan finite impulse response linear filter to extract a hash rate estimate from recent block solve times. The Freicoin algorithm worked in the sense of getting past the existential crisis of a hash crash, but there is still substantial solve time variance due to the effects of majority hash power following mining profitability with respect to other coins.

With the adoption of forward blocks, we have a one-time opportunity to change the difficulty adjustment algorithm again, but this time as a soft-fork. As we are not responding to a crisis as we were in 2013, we should take the time to investigate the available options and see whether a better difficulty adjustment algorithm can be designed in order to further minimize the negative effects of multi-chain pool hoppers on inter-block solve times / quality of service.

freeskaro commented 5 years ago

Hi Mark. So, moving the discussion here. I will read up a little on the subject.

freeskaro commented 5 years ago

Perhaps you could show me where the difficulty adjustment algorithm is in the code?

freeskaro commented 5 years ago

I suppose one would want to produce some historical data to study this problem: hash rates and difficulty levels.

1. Difficulty levels are recorded in the blockchain. I already have that parsed into a DB. We also have block times.
2. Hash rates would have to come from mining pools? For a long time, there was only one. This can also be estimated from the difficulty level and block times.
3. After that, one can see how the current algo performs and try new ones.
4. Price data may also be enlightening, and mining machine efficiency too!

The idea would be to try different sampling algos and adjustment rules. The goal is to keep the block creation rate as close to 10 minutes as possible.

Something like that?

freeskaro commented 5 years ago

[image: newplot (2)]

That's a start

maaku commented 5 years ago

I'll post a proper response soon. Sorry, it's the middle of a holiday weekend here.

The code resides in pow.cpp, but what needs to be done isn't really the code. A new difficulty adjustment filter would be nearly a drop-in replacement for some of the constants in that file. Actual deployment would be more complicated since both would have to be supported in different contexts, kinda like how that file supports both the old-style bitcoin difficulty adjustments and Freicoin's more frequent difficulty adjustment algorithm.

What actually made me think of this issue when you mentioned your numerical analysis experience was the process that goes into designing a new filter. I go over the current filter and its design in this talk at Scaling Bitcoin Milan:

https://scalingbitcoin.org/transcript/milan2016/fast-difficulty-adjustment

For Freicoin we were working on an accelerated, emergency-fix timeline to deal with the hash crash in early 2013. Working from galambo's (a fellow Freicoin developer from the early days) suggestion as to the right filter to use, I picked a few parameters based on intuition and ran an optimization algorithm to select the gain, limiter, and adjustment frequency. For Scaling Bitcoin I was also under time pressure, since I only had a few weeks to put it together, so I hand-tweaked a few of the pre-set parameters and then included the window size in the optimization process as well.

What we really ought to do, I feel, is (1) write an actual, reproducible search/optimization program to find optimal parameters; and (2) expand the search space to minimize built-in assumptions, even ones we have strong or moderate reasons for thinking correct. So instead of a Parks–McClellan filter, I'm thinking of an evolutionary search over all possible filters... to see if we end up with a Parks–McClellan filter. A first step, though, might be to keep Parks–McClellan but put its parameters (cut-off frequency, band gap) under optimization as well.

I'll write up more when I can, but maybe not until Monday.

maaku commented 5 years ago

A little bit more background:

Block solving is a stochastic process. Every single hash performed by a miner has a random chance of being successful. Therefore if miners honestly report the time in block headers (more on that later), the time between blocks at constant hash rate follows an exponential distribution (block arrivals form a Poisson process). Since hash rate is not constant, the base probability of finding a block (the "difficulty") needs to be adjusted periodically to maintain the same expected solve time. The algorithm for making that adjustment has as input the reported block times, and needs to estimate the underlying base block solving rate that generated the actual distribution of block times. In other words, how frequently blocks were actually being mined vs the desired target rate.
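
To make the randomness concrete, here is a minimal sketch (my own illustration, not code from this thread) that simulates one retarget window of solve times at constant hash rate:

```python
# Minimal sketch: simulate one two-week window of inter-block solve times at a
# constant hash rate. Because block arrivals form a Poisson process, the gaps
# between blocks are exponentially distributed around the 600-second target.
import numpy as np

rng = np.random.default_rng(0)
target = 600.0                                    # desired mean solve time, seconds
solve_times = rng.exponential(target, size=2016)  # one simulated retarget window

print(solve_times.mean())  # close to 600 on average...
print(solve_times.std())   # ...but individual gaps vary wildly (std ~ mean)
```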

Bitcoin solves this by taking the straight average time between blocks over a very long period (2016 blocks, which is ideally 14 days). The larger sample size smooths out random deviations, and also prevents a minority miner from having much influence by lying in their timestamps, since timestamps are required to be greater than the median of a smaller window of prior block times, and within a few hours of wall-clock time. One way of implementing this is to manipulate a vector of block times, in python/matlab pseudocode notation:

times = [block[-N].nTime, block[-(N-1)].nTime, ..., block[-1].nTime]  # block[-N] is the Nth block from the end
durations = times[1:N] - times[0:N-1]  # vector of the N-1 inter-block durations
expected = 10 min
actual = sum(durations) / (N-1)

(A faster way of doing this calculation is to simply subtract the first block time from the last, since the intermediate times cancel out in the sum, and this is what bitcoin does. But pedagogically the vector sum is correct, and I'm going somewhere with this. Also the last line actually would have N instead of N-1 in the denominator for bitcoin, an off-by-one error by Satoshi, but freicoin fixed this. Bitcoin actually targets ~600.6 second block intervals as a result, whereas Freicoin targets exactly 600 second blocks.)

Now the division by N-1 in the last line is renormalization, turning the sum into an average. It might make more sense to think of it as:

actual = sum(1/(N-1) * durations)

In other words, each duration has weight 1/(N-1), and since there are N-1 durations the sum of the weights is 1. But what if the weights are not uniform? Well, there are methods for coming up with sets of weights that represent the 2nd, 3rd, 4th, etc. derivatives. You can write a bunch of feature-detection vectors and do a linear combination of them to emphasize whatever features you think are important. As long as the sum of the final weight vector is unity, it'll work (but some choices are better than others).
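
A tiny sketch of that idea (the numbers are purely illustrative, not from the chain):

```python
# Sketch: the plain average is a dot product with uniform weights; any other
# weight vector that sums to one is an equally valid linear estimator, just
# with a different frequency response.
import numpy as np

durations = np.array([590.0, 610.0, 605.0, 580.0, 615.0])  # hypothetical solve times

uniform = np.full(len(durations), 1.0 / len(durations))    # the straight average
tilted = np.array([0.10, 0.15, 0.20, 0.25, 0.30])          # weights recent blocks more

assert abs(uniform.sum() - 1.0) < 1e-12 and abs(tilted.sum() - 1.0) < 1e-12
print(durations @ uniform)  # ordinary mean
print(durations @ tilted)   # weighted estimate emphasizing recent durations
```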

This approach of selecting a vector of weights to achieve some goal generates what is called a finite impulse response filter. One way of studying filters is by what frequency bands they pass through into the final result. By selecting a low-pass filter we are able to remove high-frequency noise--the random block-to-block variation in time--while preserving the general trend over the duration of the window. This allows us to use a smaller window, adjusted more frequently, without being hypersensitive to the inherently random block finding process. Since we adjust quite frequently, in order to prevent rapid changes due to random variance or malicious miners we also apply a gain (reduce the adjustment by a fixed fraction per interval) and limiter (no more than +/- x% per adjustment).

There are also standard algorithms for generating low-pass filters. We used the Parks–McClellan algorithm to generate a low-pass filter, which takes as configuration parameters a cutoff frequency and band gap. Those parameters were somewhat arbitrarily chosen, to be honest, and part of what I'd like to do is see how far from optimal they are. But other algorithms entirely could also be used to generate a filter, including straight up evolutionary or simulated annealing search over the space of all possible filters. The error function for such a search would be the root-mean-square of the difference between what the difficulty should be with respect to the actual hash rate (known to the simulator), and the filter-estimated difficulty.
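
As a hedged sketch of what that filter-generation step looks like in scipy (using the 36-tap remez call quoted later in this thread, not the consensus code):

```python
# Hedged sketch: generate a low-pass FIR filter with the Parks–McClellan
# (remez) algorithm using the 36-tap band edges quoted later in this thread.
# Band edges are in cycles/block: pass band up to 0.05, stop band from 0.1 to
# the Nyquist frequency 0.5. Illustration only.
import numpy as np
from scipy import signal

taps = signal.remez(36, [0, 0.05, 0.1, 0.5], [1, 0])
taps /= taps.sum()  # renormalize so the weights sum to exactly one

# Applying the filter is just the weighted sum described above:
rng = np.random.default_rng(1)
durations = rng.exponential(600.0, size=36)  # hypothetical inter-block times
print(durations @ taps)                      # filtered solve-time estimate
```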

The goal of this issue then is to perform a much more thorough search over the space of possible filters (weights) and filter parameters (window size / number of weights, adjustment frequency, gain, limiter) and select the best filter. This new filter will then be deployed as part of the forward-block upgrade.

freeskaro commented 5 years ago

Okay. Sounds good. I haven't done too much with signal processing, but it shouldn't be a problem. I am just finishing reading Kraft's paper on Bitcoin difficulty adjustments (https://www.domob.eu/research/DifficultyControl.pdf).

The graph I produced above was in PHP. Clearly, that's not the best language to use. This weekend I'll move the code into MATLAB and start playing. As for objectives, trying to optimize the parameters of the Parks–McClellan filter would be a good goal for understanding everything. After that we can start exploring other methods.

As for these evolutionary searches over the space of all possible filters, are they a kind of convolution function? Well, I'll see.

I had an idea of maybe trying to find parameters using risk-analysis methods: i.e., an 80% probability that the block generation rate is < x, for example. But first I have to get to know the problem.

maaku commented 5 years ago

Python with scipy is what I usually reach for with these things, rather than Matlab, if that helps. There's an easy batteries-included installer here: https://www.anaconda.com

freeskaro commented 5 years ago

Okay, I can use Python. It's already installed. Do I specifically need Anaconda?

maaku commented 5 years ago

No, that’s just a distribution that includes a bunch of scientific packages too. You will need numpy and scipy though.

freeskaro commented 5 years ago

[image: Freicoin_arrival_diff]

Well, I got that up in Python. Next step I gotta read up on Parks–McClellan.

Looking at the above graph for the last 1000 blocks (6.9 days), we've really got our work cut out for us. But here, I'm estimating the network hash rate per block based on the difficulty and arrival time of the block. So, even if the hash rate were constant, there would be some randomness. Nevertheless, the difficulty peaks are hurting the block arrival rate.

freeskaro commented 5 years ago

hashrate = difficulty * 2^32 / (blocktime) / 10^12 / 600, is that correct? Kraft seems to describe it differently.

I got that from here: https://en.bitcoinwiki.org/wiki/Difficulty_in_Mining. (Update: I see there is an error in dividing by 600.)
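
For reference, a hedged sketch of the single-block estimate without the extra division (the difficulty value below is made up for illustration):

```python
# Hedged sketch: at difficulty D, finding a block takes about D * 2**32 hashes
# in expectation, so dividing by the observed solve time gives a (very noisy)
# hash rate estimate. The difficulty value here is illustrative only.
def estimated_hashrate_ths(difficulty: float, solve_time_s: float) -> float:
    """Single-block network hash rate estimate in TH/s."""
    return difficulty * 2**32 / solve_time_s / 1e12

print(estimated_hashrate_ths(difficulty=2_000_000, solve_time_s=600.0))  # ~14.3 TH/s
```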

freeskaro commented 5 years ago

[image: Figure_1]

Okay, here the hash rate is calculated based on the previous 6 block arrivals: hashrate = difficulty * 2^32 / (blocktime for 6 blocks) / 10^12 / 6

freeskaro commented 5 years ago

Okay, I have a question. It seems to me what you are doing is really extracting the zero-frequency signal from hash-rates: the constant value on the time-history graph.

Using a Parks–McClellan filter, essentially the time-series data is converted to the frequency domain and the pass and stop bands choose which frequencies to keep. But a random process centered on 0 just produces another squiggly line in the amplitude-vs-frequency domain. A random process not centered on 0 produces a spike at frequency 0 containing about half the observations.

So when you use this: scipy.signal.remez(36, [0, 0.05, 0.1, 0.5], [1, 0])

You are keeping all the signal components with frequency less than 0.05 and discarding the others, which leaves essentially the zero-frequency signal, or the constant value in the time-series plot.

Is that correct?

freeskaro commented 5 years ago

[image: Figure_PM]

Well, getting close. So far I think a moving 6-block average works better for estimating the network hash rate.

From observing the pools, hash rates of 10 to 70 TH/s seem correct. When one uses a hash rate estimated from a single block arrival, the range of values is incredible. I have to somehow check the results of my filter, but I'm not sure how such random peaks get interpreted in a Fourier transform and the eventual filter.

maaku commented 5 years ago

Is your code on GitHub somewhere?

freeskaro commented 5 years ago

https://github.com/freeskaro/difficuly_algo

freeskaro commented 5 years ago

[image: Screenshot (278)]

I found an error in my code where I was calculating hash rates averaged over many blocks incorrectly. You will find on my GitHub a test file where hash rates are generated from an exponential distribution (i.e. first arrival time).

Using getnetworkhashps 252683 in the wallet I get some large numbers too. In the screenshot above, it can even produce negative hash rates.

freeskaro commented 5 years ago

Okay, work continues this weekend.

Here we study an FIR filter using 'remez' on generated arrival times drawn from an exponential distribution with a 600 s mean. I use the arrival rate in the filter, not the hash rate; this was an error I made above. Using the estimated hash rate in the filter is a horrible idea: calculating the hash rate involves multiplying a big number and dividing by a sometimes-small number, which can create massive swings, too much for a filter to handle.

RMSE is observed (hash rate minus filtered hash rate). The expectation is to reproduce a constant mean of 600 s. The smaller the pass band, the better. With 36 points in the filter, one can go down to lower values than with 144. The error doesn't change a lot as a function of c1, c2; there are just fewer fluctuations. Larger c1, c2 is maybe more stable.

[image: RMSE36gen]

The arrival rate fluctuates around 600 s. [image: generated_arrival_PM-36-005-05]

Using another set of parameters:

[image: generated_arrival_constant144]

Clearly, the hash rate spikes are insane and must be calculated from filtered results. One also has to consider whether this is an acceptable solution for what really should be a constant value.
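
A rough reconstruction of the kind of experiment described in this comment (my sketch under the parameters quoted earlier, not the actual analysis script):

```python
# Rough sketch: filter simulated exponential arrival times with a 36-tap remez
# low-pass filter and score the deviation from the 600 s target. Not the
# original script; parameters are taken from the remez call quoted earlier.
import numpy as np
from scipy import signal

rng = np.random.default_rng(42)
arrivals = rng.exponential(600.0, size=5000)  # simulated inter-block times

taps = signal.remez(36, [0, 0.05, 0.1, 0.5], [1, 0])
taps /= taps.sum()

filtered = np.convolve(arrivals, taps, mode='valid')  # smoothed arrival-time estimate
rmse = np.sqrt(np.mean((filtered - 600.0) ** 2))
print(rmse)
```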

freeskaro commented 5 years ago

So the last thing for today is to make a plot using real Freicoin data. I'm not certain about this part. I take 2 weeks of block arrivals (2016 blocks), put them through the filter, and calculate hash rates. I plot the last 200 blocks:

[image: Freicoin_144]

The predicted block arrival times appear skewed by 144 blocks (n in remez). The predicted hash rates appear to far exceed the data from pools, where the last 7 days didn't exceed 15 TH/s.

freeskaro commented 5 years ago

And this is the same thing using a simple 20-block average. I don't think the mining pools are reporting hash rate well; there is apparently a private pool. The results are quite similar to the graph above using the PM filter. The 20-block average has less phase shift.

[image: Freicoin_arrival_2016_20a]

freeskaro commented 5 years ago

And iterating over the parameters for remez, the RMSE results are similar to those for the generated arrival rates. Smaller coefficients have less error, but for 144 taps there are convergence problems at lower coefficients c1 and c2. Still, the error improvements are not enormous.

[image: Freicoin-RMSE36gen]

Still, for higher values of c1 and c2, the results can appear bizarre:

[image: Freicoin_arrival_2016_144_05]

freeskaro commented 5 years ago

Notice here: 72 taps and f = [0.0, 0.001, 0.01, 0.5]. But the only significant advantage is that it's less out of phase. [image: Freicoin_arrival_2016_72]

Okay. The next step will be to try to incorporate the Kraft method for estimating the mean of nonlinear binomial distributions.

maaku commented 5 years ago

I'm happy that it's coming along. Sorry that I haven't been available to provide feedback. Hopefully that will change soon--I'm trying to get caught up on some other projects, and arranging travel. I haven't been able to work on tradecraft at all for a few weeks.

freeskaro commented 5 years ago

Playing with the Bessel filter a little. It seems to work better than PM. So far, I don't quite understand what the parameters mean; I've just been playing with numbers. So for the generated constant rate and the Freicoin data:

[image: generated_arrival_bessel]

[image: Freicoin_arrival_2016_bessel]
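
For reference, a hedged sketch of how a digital Bessel filter can be applied to simulated arrival times with scipy, using the 6th-order / 0.05-cutoff parameters tried in the following comments (not the actual analysis script):

```python
# Hedged sketch of the Bessel (IIR) approach: design a digital Bessel filter
# and run it over simulated exponential arrival times. Illustration only.
import numpy as np
from scipy import signal

rng = np.random.default_rng(7)
arrivals = rng.exponential(600.0, size=5000)  # simulated inter-block times

b, a = signal.bessel(6, 0.05, 'low', analog=False)
smoothed = signal.lfilter(b, a, arrivals)     # IIR: output depends on all past inputs

print(smoothed[-5:])  # should hover near the 600 s mean after the initial transient
```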

freeskaro commented 5 years ago

Here are the RMSE errors for a Bessel filter on generated arrival times: [image: RMSE-bessel-gen]

The minimum error at order = 1 and pass band at 0.5 is meaningless; it's just reproducing the noise. The best results seem to depend on different needs. For a constant hash rate, one wants to arrive at a constant arrival rate, so 6th order and a low-pass band of 0.005: [image: generated_arrival_bessel-6-005]

However, these parameters have too long a delay and don't react quickly enough for the real data: [image: Freicoin_arrival_2016_bessel-6-005]

Still, for what is supposed to be a constant hash rate, the estimated hash rate will vary by 2X!

freeskaro commented 5 years ago

Perhaps the overall best result using Bessel (in this unsophisticated parametric study) is bessel(6, 0.05, 'low'): [image: generated_arrival_bessel_6_05]

[image: Freicoin_arrival_2016_bessel-6-05]

But a Bessel filter looks better than a PM filter: over the range of parameters, the behavior seems more consistent and the predicted hash rates are lower and less variable.

freeskaro commented 5 years ago

[image: Screenshot (279)]

Just checking the numbers at the current block height. The filter is estimating hash rates ranging from 270 to 750 TH/s. The median rate is estimated at 438 TH/s, and freicoinpool.com shows 439.8 TH/s.

maaku commented 5 years ago

Thanks for running the simulation again. I'd be interested to see the results in a day or two when the remaining transients have disappeared, assuming we don't see the multipool hopper again.

Sorry for being AFK for a while on other work. I'm back in the game again working on some critical path code for the v13 release.

freeskaro commented 5 years ago

I'm not really sure what the best thing to do next is.

maaku commented 5 years ago

Ok I’ll bump a review of this up in my priorities. I think there is more we can do, but I need to better understand what you’ve implemented first.

freeskaro commented 5 years ago

Updated: [image: Freicoin_arrival_2019-08-07]

freeskaro commented 5 years ago

[image: Freicoin_arrival_2019-08-07-B] Same thing, just calculating the hash rate further back.

freeskaro commented 5 years ago

A summary of all this (it's been a while since I did the analysis, so the details are a little fuzzy; I'm writing from memory):

1. A moving average or median based on a few blocks (20 or 144) is alright, but estimates quite a variable hash rate when it should be constant. They never calculate negative values.
2. The PM filter doesn't really produce much better hash rate estimates than the mean or median. It can produce weird negative results. As far as error goes, there is a plateau where the error is about the same for permissible values of the parameters. The error can be reduced, but it also nears the point where the solution gets unstable.
3. The Bessel filter is the best. It is the only method that estimated an almost constant hash rate for generated arrival times. The best parameters are bessel(6, 0.05, 'low', analog=False).
4. It seems someone is injecting some serious hash power into Freicoin intermittently.

maaku commented 4 years ago

Issue #64 proposes switching to merge-mining even prior to the deployment of Forward Blocks. That gives this new difficulty adjustment algorithm a higher priority.

@freeskaro I'm sorry this has stagnated. It must be frustrating to have put in a bunch of effort and not seen any action. The work you have done is relevant, however, and I hope to pick this up soon. It has just taken way, way longer than I expected to deploy block-final transactions and update the segwit code for Freicoin/Tradecraft. (I'm still not finished with that, in fact.) I had thought these were prerequisites to changing the proof of work & difficulty adjustment algorithm, but with #64 this might not be the case.

maaku commented 4 years ago

Update: Block-final transactions are deployed and activated. There are now no technical dependencies which block deployment of an updated difficulty adjustment algorithm, using a merge mining approach as discussed in #64. Nevertheless the time I have to devote to Freicoin development is currently being allocated to segwit. Hopefully that will wrap up soon...

maaku commented 4 years ago

And now segwit is released. My next priority for Tradecraft is knocking together a merge mining proposal, as discussed in #64, which includes a new difficulty adjustment algorithm. So in the next week I hope to get up to speed on the work @freeskaro did about a year ago (sorry it took so long!), and try to replicate his results in a simulator.

maaku commented 4 years ago

@freeskaro could you re-add me to your private repo? I'm afraid the invitation has expired.

freeskaro commented 4 years ago

I sent you an invite. I'm not really used to Github and developing with others and hadn't really uploaded my code. But I have added a few files now.

Also, I haven't looked at this stuff for a while, so I have to get back up to speed. After that, it should be fun!

maaku commented 4 years ago

I finally got my simulator up and running and am getting results similar to yours.

Confession time: this whole time I thought Bessel functions could be used to generate a finite impulse response filter. I'm very hesitant to use an infinite impulse response filter because the current parameters cannot be verified without access to the entire history of block headers. Now upon reflection, the downsides of that could be mitigated by adding a commitment to the filter state in the block, so I suppose it is possible to use an IIR filter like Bessel, but it'd have to really be worth the added complexity.

I haven't yet rejigged my simulator to support IIR filters. I don't think any of your charts compare PM to Bessel head-to-head. Do you remember seeing a massive improvement, or was it just incrementally better?

freeskaro commented 4 years ago

I didn’t compare them head to head. I made separate plots of generated arrival times or block arrival time history against PM or Bessel.

I concluded that PM had a fatal flaw: it produced negative values for the hash rate. It seemed analogous to the 'overfitting' problem in regression. PM also had a delay. The Bessel really seemed better in every way.

maaku commented 4 years ago

Any filter is capable of generating negative values if you feed it enough garbage, and unfortunately 'garbage' reasonably describes the freicoin hash rate in the recent past. With a FIR low-pass filter this'll happen if there is an adjustment right after a block is found after days or weeks of zero blocks. With an IIR filter it might take a couple of these in a row to drive predictions negative, but the flip side is that predictions would stay negative for longer.

The intuitive understanding of this is that a low-pass filter is designed to heavily weight against sudden hash rate changes, with frequency defined per-block rather than per-unit-time. If you're cruising along at a good speed and then suddenly the hash rate drops to zero for an extended period (no blocks), then the very next block will indicate a sudden change ('sudden' because time only ticks forward when we find a block) and be given a negative weight. If the duration between blocks is long enough, that negative weight will overpower all the other contributions.

That said, the filter freicoin uses right now is very poorly chosen and is unusually susceptible to this. It's not just one block at the front but more like the first 33 blocks which have negative weights, so it is super easy to fall into this negative hash rate prediction trap:

[image: Freicoin-144-tap-PM]

This is also why, for example, if there is a sudden hash rate change the filter reacts in the opposite direction for the first half dozen or so adjustments. These are really poorly chosen parameters, and I apologize. We can do better.

Here are the weights for the best candidate filter that I've found. Like the one I presented in Milan, it is 36 taps instead of 144. My current simulations show best results with adjustment every 7 blocks instead of 9 (this also protects against a variant of the time warping attack):

[image: remez,n=36]

Notice that there are a lot more ripples, the very first coefficients are positive, and there isn't a big opposite response in the beginning. It's much harder to drive predicted hash rate negative with this filter, and the delay is greatly minimized compared with the 144-tap filter that is currently used.

Lastly I'll note that a temporary negative hash rate prediction in response to a transient block duration anomaly isn't really an issue. That's why there is a limiter set on the total adjustment. A negative prediction will just cause the difficulty to adjust downwards by the max amount, which will probably be undone by the next adjustment 7 blocks later. It is annoying that the estimated network hash rate is negative, but we can probably fix that directly by sorting block times before feeding them into the predictor for that RPC, to give more reliable results (though that means it won't match up with the actual adjustment).

maaku commented 4 years ago

While I'm at it, here are some more charts about that candidate filter. This is the frequency response curve, showing the sharp cutoff at the critical frequency, and minimal ripples in the low-pass region (even less noise is possible by opening up the band gap a bit, but those didn't perform as well as this one, which aims for a sharper cutoff):

[image: response,n=36]

Here is its response to a square wave impulse (blocks being found exactly on expectation, with hash rate increasing by 10x at once):

[image: impulse,n=36]

Note that it is near-perfectly damped. Here is a more realistic simulation of the same 10x increase, then decrease in hash rate, but this time with random block finding:

[image: simulate,n=36]

maaku commented 4 years ago

I’m reconfiguring my simulator to support bessel filters so we can have a head-to-head comparison.

maaku commented 4 years ago

Preliminary results are that Bessel performs about 20% better by the RMSE metric in a head-to-head comparison, and has the added benefit of adjusting every block without adding noise, which solves some adversarial concerns I have about FIR filters.

I have some ideas for how we can commit to the filter results as part of the merge mining data structure, which would remove my main technical concern about using Bessel filters.

The current best candidate filter is bessel(11, 0.5, 'low') with a gain of 0.03125 and a limiter of 1.0546875, adjusting every block. The parameters were selected as follows (a sketch of how the gain and limiter might be applied follows the list):

  1. n=11 because this uses the results of the last 12 blocks (11 differences), which are needed anyway for the MedianPastTime check;
  2. cutoff frequency of 0.5 is the same as 0.25 in scipy.remez, which we already found to be an optimal setting both from theoretical considerations and simulation;
  3. L=1.0546875 is 135/128, which is the largest limiter that is relatively easy to calculate with bitshift addition but which prevents an adjustment of more than 2x over 12 blocks, which is the largest adjustment we want to allow for safety reasons, and a constraint added by the protocol cleanup code;
  4. G=0.03125 achieves critical damping with the above selections.
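
To make the role of the gain and limiter concrete, here is a hedged sketch of how they might wrap the filtered estimate; the exact consensus update rule isn't spelled out in this thread, so the function name and update formula below are my own illustration:

```python
# Hedged sketch only: my reading of "gain" (take a fraction of the indicated
# correction) and "limiter" (clamp the per-block adjustment), not the consensus
# code. next_difficulty() and its formula are illustrative.
GAIN = 0.03125       # apply 1/32 of the indicated correction per block
LIMITER = 1.0546875  # never adjust by more than 135/128 (up or down) at once

def next_difficulty(current_difficulty, filtered_solve_time, target=600.0):
    raw = target / filtered_solve_time             # >1 if blocks are coming too fast
    step = 1.0 + GAIN * (raw - 1.0)                # gain: take only a small step
    step = min(max(step, 1.0 / LIMITER), LIMITER)  # limiter: clamp the step
    return current_difficulty * step
```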

Here's a simulated response to a 10x square wave impulse of added hash power:

[image: optimalbessel] The smooth orange curve is the response to perfect input, where blocks are found on time, exactly in expectation. It very closely reflects the square-wave input (which starts at t=7 and ends at t=14). The jagged blue curve is a realistic simulation where blocks are found at random according to an exponential distribution.

maaku commented 4 years ago

n=6, which @freeskaro identified, is also a good candidate. It performs marginally better in the benchmark scores, but the scoring doesn't include adversarial considerations. My intuition is that a longer filter will require an adversarial miner to manipulate the timestamps of more blocks in order to drive the difficulty off its true value. I've analyzed how this works for FIR filters, but an IIR filter like Bessel is a bit different... I need to think on this one to see if a shorter filter of order 6 is okay.

maaku commented 4 years ago

Having looked at the filter coefficients for each, I think my concern over smaller filter windows may not apply for the Bessel filter. Both n=11 and n=5 end up being about the same in what an adversary can accomplish. I’m leaning towards the smaller filter now since it scores better.

I’m going to delay making a decision until the merge mining code is ready, which should be a few more weeks. That gives us some more time to weigh the trade offs.

maaku commented 4 years ago

@freeskaro I just want to let you know that I've created a pull-request to add merge mining. It does NOT have the updated difficulty adjustment algorithm, but that is the next thing to be implemented.

maaku commented 4 years ago

So having done some more simulations and looked at the filter parameters themselves, the new candidate filter is bessel(2, 0.5, 'low') with a gain of G=0.015625 and a limiter of L=1.0625. This new filter selection is the result of micro-optimizations which affect the implementation, and my increasing comfort with shorter filters in an adversarial situation. Here's a simulation of its effectiveness against a 100x step-function impulse of added hash power:

[image: Bessel,n=2]

The gain (= 1/64) and limiter (= 17/16) are chosen because they are easy to calculate in bignum arithmetic, are simple fractions so as to yield maximum bits of precision to the quantized filter instead, and perform the best in simulation in terms of both RMSE under the curve and expected block times.

bessel(2, 0.5, 'low') has some interesting properties. It adjusts based on the prior 3 inter-block intervals, which means only 4 block headers are required to SPV-prove the correctness of a block's difficulty adjustment. With merge mining, block headers are considerably larger, so reducing the number of headers required is critically important for making compact block connectivity proofs possible. Attempting N=1 results in a degenerate filter that merely averages the past two block intervals, and doesn't work very well.

bessel(2, 0.5, 'low') also only uses the filter value from two adjustments back for its feed-forward value. It technically accepts two feed-forward values, but the weight for the first is effectively zero (and becomes zero when the filter is quantized). This makes it easier to calculate, and maximizes the delay in benefit to an adversarial miner who manipulates timestamps.

I was originally hesitant to consider an N=2 Bessel filter because an adversarial miner would only have to mine / withhold a couple of blocks to gain control of the adjustment algorithm, which is quite doable. However, what protects this from being an issue is something we need to do anyway: set a low limiter value so the difficulty can only adjust so much at once. With L=1.0625, it takes 12 blocks of continuous maximum adjustment for the difficulty to halve or double. This is already our design requirement for the protocol-cleanup hard-fork removal of the difficulty adjustment rules, since it would require the equivalent of about 9 blocks of work, done in secret, to get a meaningful adjustment of difficulty. In short, we don't have to have a large filter to reduce the effectiveness of selfish mining, just a small limiter.
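
A quick sanity check of the 12-block figure:

```python
# With a per-block limiter of 17/16, twelve consecutive maximum adjustments are
# needed to double (or halve) the difficulty: 1.0625**11 ~ 1.95, 1.0625**12 ~ 2.07.
LIMITER = 17 / 16
for n in (11, 12):
    print(n, LIMITER ** n)
```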

And, critically, N=2 Bessel filters perform better than higher-order alternatives. Surprisingly this holds true in both the artificial step-function simulation, and with simulations based on Freicoin and Bitcoin historical hash rates. Its advantage over larger filters is small, a few percentage points at best, but the smaller filter is also easier to calculate and requires fewer block headers to prove.

Unless anyone can point out a security issue in using a smaller filter, I'm going to set about implementing this one.

freeskaro commented 4 years ago

Hi Mark,

Great news. Nice work you've done.

Regarding the merge mining, if I want to spread the word around the telegram channel and elsewhere, what should I say?