zawy12 / difficulty-algorithms

See the Issues for difficulty algorithms

EMA for BCH (part 2) #62

Open zawy12 opened 4 years ago

zawy12 commented 4 years ago

In a previous issue I discussed this in a different way, received feedback from Mark L, and discussed the possibility of RTT for BCH. This issue discusses everything that came up in a live stream and everything I know of surrounding changing BCH's difficulty algorithm (DA).

My process to find the "best algo" is this: find a window size parameter "N" for each algo that gives them the same standard deviation of difficulty under constant HR. This indicates they accidentally attract on-off mining at the same rate. Then simulate on-off mining with a step response that turns on and off based on a low and high difficulty value. I have "% delayed" and "% blocks stolen" metrics. Delay: add up a delay quantity of (st - 4xT) for each block whose solvetime exceeds 4xT, then divide by total blocks to get a % delayed. For "% stolen" I divide the time-weighted target that the on-off miner sees by that of dedicated miners; it's how much easier his target was while he was mining.

I just now re-tested 5 DAs. Except for SMA-144, the other 4 below are pretty much the same. WT-190 has slightly lower % stolen but slightly higher delays. (WT-144 is what I kept mistakenly calling wtema in the video.) Delays and % stolen are usually a tradeoff. I had been penalizing WT-144 in the past by giving it the same N as LWMA when it needs a larger value.

- SMA-144
- WT-190 (as opposed to WT-144)
- LWMA-144
- ASERT / EMA-72 [update: see my 2nd comment below. In Jonathan's code, this value would be called 72 * ln(2)]

I'll advocate the following for BCH in order of importance.

There are also sequential timestamps, which might be the hardest to do in reality but are important on a theoretical level. It might be possible to enforce sequential timestamps by making MTP=1 instead of 11. Maybe this could prevent bricking of equipment. Sequential stamps (+1 second) must then be able to override local time & FTL. Overriding local time to keep "messages" (blocks) ordered is a requirement for all distributed consensus mechanisms. MTP does this indirectly.

Antony Zegers did some EMA code, using bit-shift for division.

EMA is a close enough approximation to ASERT, but here's Mark's tweet on how to do accurate integer-based e^x. https://twitter.com/MarkLundeberg/status/1191831127306031104?s=20

wtema simplifies to:

    target = prev_target * (1 + st/T/N - 1/N)

where N is called the "mean lifetime" in blocks. Expressed as ASERT it would be:

    target = prev_target * e^(st/T/N - 1/N)

The other form of ASERT should give the same results, but the other form of EMA could return a negative difficulty with a large st. [wtema-144 and asert-144 refer to N = 144 / ln(2) ≈ 208.]
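For concreteness, here is a minimal sketch (my own, not BCH code; the function and variable names are assumptions) of the two per-block updates written above, with T = block time, N = mean lifetime in blocks, and st = the last solvetime:

    import math

    def wtema_next_target(prev_target, st, T=600, N=208):
        # linear EMA form: target = prev_target * (1 + st/T/N - 1/N)
        return prev_target * (1 + st / T / N - 1 / N)

    def asert_next_target(prev_target, st, T=600, N=208):
        # exponential (relative ASERT) form: target = prev_target * e^(st/T/N - 1/N)
        return prev_target * math.exp(st / T / N - 1 / N)

For st near T the two agree to first order; they only diverge noticeably when |st - T| is large relative to N*T.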

If wtema is used: negative solvetimes (st) are a potential problem for wtema if N is small, delays are long, and an attacker sends an MTP timestamp. See the wt-144 code for how to block them. To be sure to close the exploit, you have to go back 11 blocks. Do not use anything like if (st < 0) { st = 0 } in the DA, which would allow a small attacker to send difficulty very low for everyone within a few blocks by simply sending timestamps at the MTP.

Std Dev of D per N blocks under constant HR for EMA/ASERT is ~1/SQRT(2*N). This is important for deciding a value for N. If HR increases 10x based on a 10% drop in D, you definitely want N > 50 or the DA is motivating on-off mining. On the other hand, being at the edge of motivating on-off mining can reduce the expected variation in D. BTG's Std Dev is about 0.10 when it would have been 0.15 if HR were constant (N=45 LWMA, N=22.5 in ASERT terms). So it's paying a little to on-off miners to keep D more constant. ASERT with N=72 is the same as SMA-144 in terms of D variation when HR is constant.

On the other side is wanting a sufficiently fast response to price changes. As a starting point, for ASERT N=100 it takes 6 blocks to rise 5% in response to a 2x increase in HR. This seems sufficiently fast. Std Dev is 1/SQRT(2*100) = 7% per 100 blocks, which seems high. 5% accidental changes in D seem to motivate substantial HR changes, so maybe N=200 is better, which will have 5% variation per 200 blocks. But this is like LWMA N=400, which would be incredibly smooth and nice if it works, but this is 3x larger than my experience. BTG is very happy with LWMA N=45 with Std Dev 10%. Notice that there's a bigger bang for the buck with lower N due to the 1/SQRT(N) versus N relationship: you get more speed than you lose stability as you go to lower N. So instead of 200, I would go with 100 or 150.

None of the 60 or so LWMA coins got permanently stuck. About 5% could not get rid of bothersome oscillations. When sending the same random numbers to generate solvetimes, LWMA / EMA / ASERT are almost indistinguishable, so I do not expect them to get stuck either. Most LWMA's are N=60 which has the stability of EMA with N=30. N=100 would have probably been a lot nicer for them, if not N=200 (ASERT N=100). They are mostly T = 120 coins, so they have very fast response, reducing the chances of getting stuck.

Mark was concerned about forwarded timestamps sending difficulty very low, which could allow a small attacker to do a spam attack (he would begin by setting a far-forwarded stamp for EMA/ASERT for submission later). To make this a lot harder to do, use st = min(st, 7*600). Then long delays will not drop the difficulty as much as "it wants to". Most LWMAs do this as an attempt to reduce oscillations but it may not have helped any.

Combining upper and lower limits on difficulty, timespan, or solvetimes (like the 4x and 1/4 in BTC, and the 2x and 1/2 in BCH) is what allows unlimited blocks in < 3x the difficulty window. I'm the only one I know of who has described it. A couple of coins following me were the first to see it, losing 5,000 blocks in a couple of hours, because I had the hole in LWMA but did not realize it. Some LWMAs had symmetrical limits on st, not just the 7xT upper limit. Removing the 1/2 limit on timespan in BCH will prevent it.

If FTL is dropped, the 70 minute revert from peer time to local time rule should be removed (best) or set to FTL/2 or less. It should be coded as a function of FTL. There are a couple of exploits due to Sybil or eclipse attacks on peer time.

A prerequisite for BFT is that all nodes have a reliable clock that has higher security than the BFT. The clock in POW is each node operator knowing what time it is without needing to consult the network. Any node can unilaterally reject a chain if its time is too far in the future, so it's impervious to 99.99999% attacks. (Although current mining on the honest chain needs to be > 50% of the HR on the forward-dated cheating chain for the node to continue rejecting the cheating chain, because time moving forward can make the cheating chain become valid.)

jacob-eliosoff commented 4 years ago

@jtoomim, I agree in terms of strict work (aside perhaps from any little bugs in how work is calculated, like the (N-1)/N issue @zawy12 alluded to). But there is a complicating factor, which is the block rewards. If I can mine 1,000 blocks with the same total amount of work expenditure as the main chain takes to mine 100 blocks, I have 900 extra block rewards. So there may be attacks like:

  1. I own an amount of hashrate equal to 90% * BCH hashrate
  2. I spend 700 block rewards to rent enough extra hashrate to push me up to 101% * BCH hashrate
  3. I mine my 1,000-block secret chain, outworking the main chain

Now:

So my attack's expected net profit is 300 - 90 = 210 block rewards. (Of course this leaves out impact of the attack on coin price, which is a significant defense against many attacks, but we should make our system robust even under fixed price when we can. It also leaves out the benefit of making my competitors waste hashrate on their 100 orphaned main-chain blocks.)

(EDIT: to clarify, my "There are no shortcuts" argument above was meant to show that asert prevents attacks like this, because its time-based difficulty ensures you can't actually mine 1,000 secret-chain blocks for the same work as 100 main-chain blocks. But this does depend on the DAA (and/or the timestamp-constraint rules): eg, if miners/nodes accepted blocks with bogus timestamps weeks in the future, you could mine thousands of tiny-difficulty blocks adding up to 1% more work, but many more block rewards, than the main chain. In short, I think asert's robustness against attacks like these is yet another argument in its favor.)

jacob-eliosoff commented 4 years ago

https://gitlab.com/jtoomim/bitcoin-cash-node/-/commit/fd92035c2e8d16360fb3e314b626bf52f2a2be67

Couple of super minor things:

  1. Not sure how cute you want to get, but you can reduce six multiplications here:
    uint64_t factor = (195766423245049*exponent + 
                       971821376*exponent*exponent + 
                       5127*exponent*exponent*exponent + (1ll<<47))>>(rbits*3);

to three here:

    uint64_t factor = (((5127 * exponent + 971821376) * exponent + 195766423245049) * exponent + (1ll<<47)) >> (rbits*3);
  2. Might be a little cleaner to write nextTarget = nextTarget >> -shifts as nextTarget >>= -shifts. (Etc)

Otherwise, I haven't checked it in detail, but I did check that your approximation as coded here stays within <0.01% of your exact formula as coded here, at least for a bunch of typical cases. (Fwiw if nTimeDiff is around -5000000 the results get pretty wonky but that seems quite extreme.)

jtoomim commented 4 years ago

Not sure how cute you want to get, but you can reduce six multiplications here:

I consider readability to be more important than execution time.

Might be a little cleaner to write ... nextTarget >>= -shifts

Yes, a bit.

zawy12 commented 4 years ago

it's impossible for your branch to be both at a greater height (ie, for you to have mined more blocks), and lower difficulty, than the main branch.

To be clear, another condition is that this assumes a specific timestamp at that height.

(EDIT: to clarify, my "There are no shortcuts" argument above was meant to show that asert prevents attacks like this, because its time-based difficulty ensures you can't actually mine 1,000 secret-chain blocks for the same work as 100 main-chain blocks. But this does depend on the DAA (and/or the timestamp-constraint rules): eg, if miners/nodes accepted blocks with bogus timestamps weeks in the future, you could mine thousands of tiny-difficulty blocks adding up to 1% more work, but many more block rewards, than the main chain. In short, I think asert's robustness against attacks like these is yet another argument in its favor.)

We only know what D will be at a certain height + time combination. The path of individual solvetimes leading up to that point can represent different amounts of work and release more or fewer blocks even if the timestamp restrictions are ideal. In expanding the recursive math for ASERT, I can only see that symmetrical st/D changes around an average st/D (constant HR) will cancel. The Imperial College paper's section 4.5 implies "no alternate paths" is true if HR is constant, but it is qualified by the phrase "Assuming the DA is working correctly..."

Experimenting today using ASERT-288 with ideal timestamp restrictions in a 51% selfish mining attack, there seems to be a problem. First st: 500*T − 686 seconds; the next 686 st's: 1 second each. The average D of these 687 blocks is 0.729 of the public chain. It takes 500*T time and the total chain work is 500.1 blocks at the avg public difficulty, so it wins chain work. So the attacker gets 687 blocks in 500*T time (37.5% excess blocks). This is 138/(50-25) = 5.5x profit for selfish mining if his cost basis is 50% of his hashrate. In brief testing, 37.5% may be the max for all ASERTs, possibly having something to do with 1/e ≈ 36.8%.
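A rough simulation sketch (my own, using the e^x form with mean lifetime N = 288 and difficulty expressed relative to the public chain; it roughly reproduces the ~0.73 average difficulty and the chain-work tie described above):

    import math

    T = 600          # target block time (seconds)
    N = 288          # ASERT mean lifetime in blocks (e^x form)
    NUM = 687        # attacker's blocks
    first_st = 500 * T - (NUM - 1)   # long first solvetime; the remaining 686 blocks take 1 s each

    t = 0            # time since the fork point
    total_d = 0.0    # sum of per-block difficulty, relative to the public chain
    for height in range(1, NUM + 1):
        t += first_st if height == 1 else 1
        # absolute ASERT: D = D_public * e^((height*T - t) / (N*T))
        total_d += math.exp((height * T - t) / (N * T))

    print(total_d / NUM)   # average relative difficulty of the 687 blocks (~0.73)
    print(total_d / 500)   # attacker's work vs the ~500 public blocks mined in the same 500*T (~1.0)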

The other DAs that require more than 1 previous target do not seem to have this problem; they all seem to increase D too fast for the attack to pay. Except for timestamp restriction errors, I've never seen anything in LWMA, Digishield, SMA, or BTC that can help a miner get an excess of blocks. (Setting aside minor differences that may be in the recent selfish mining paper.)

Clipping how much D can drop in 1 block is the only way I see to prevent this. RTT may prevent it. Clipping the drops in D is the safest "good intention" I know of because the potential problems are known: decreased emission rate and a stuck coin.

This is the only timestamp attack where good timestamp handling does not prevent it. I mentioned above that RTT is theoretically required for distributed consensus in order to properly estimate current HR (the voting population). Maybe this exploit is a result of not following that requirement. (But I do not want to push RTT)

If HR is always increasing faster than the DA can respond, all the algos release too many blocks. Again, RTT may prevent that.

jtoomim commented 4 years ago

Note: Most miners and exchanges on BCH follow a 10 block finalization rule. Once a block is buried 10 blocks deep into a chain, a node will refuse to perform any reorg that removes it from the active chain unless the operator manually intervenes with a bitcoin-cli reconsiderblock [hash] command. This finalization rule is present in Bitcoin ABC and BCHN, but AFAIK not in Bitcoin Unlimited, Bitcoin Verde, or Flowee. This finalization rule was added in response to the BSV reorg threats, and should be enough to prevent this attack.

jacob-eliosoff commented 4 years ago

True. That rule is sad though (I do understand why it was introduced) and opens up some other attacks, and opportunities for chain splits/consensus failures. I hope someday it can be safely removed. But yes until then these long-secret-chain attacks are theoretical for BCH.

jtoomim commented 4 years ago

@zawy12 37.5% extra profitability for a 3.47-day 101% attack isn't a very good ROI.

(The attack you described is a 101% attack. When the attacker leaves the main chain to do their attack, the difficulty will fall slightly, and attract more hashrate from BTC.)

A 51% attack (true 51%) is far more profitable. It can maintain 99% extra profitability indefinitely, as long as opponent miners are rationally programmed to take their orphan rates into account when deciding which chain to mine. By simply orphaning any blocks mined by opponent miners, a 51% selfish miner can get 100% of the block rewards while maintaining difficulty at 51% of equilibrium.

If opponent miners do not behave rationally, then the selfish miner can force them to change their behavior by maintaining difficulty at 99% of equilibrium while orphaning some or all opponent blocks until those miners change their code.

In a scenario in which BCH is the majority coin, this question of rational behavior is irrelevant, as there's no risk of additional hashrate appearing in response to higher perceived (but not actual, due to orphans) profitability.

jtoomim commented 4 years ago

If you have enough hashrate to perform a 3.47-day 101% attack reorg, you're better off attacking exchanges than attacking mining. If you attack exchanges, you could deposit $10 million worth of BCH, convert it into fiat or another cryptocurrency, and move it into your account within 3 hours. You can then publish your reorg and get your $10 million of BCH back. In the worst case scenario, the BCH/USD ratio is down to 0, and your original BCH now has a value of $0, and the blocks you mined have a value of $0, which means you had a net loss equal to the amount you spent on hashrate (~$25k at current prices). But if BCH retains even a small fraction of its original value (e.g. ≥0.25% for the stated numbers), there's a net profit to be had. And if the BCH price doesn't fall at all, that profit will be $10 million, or 40000% as much as you spent on hashrate, or 99.75% as much as you anted up with both hashrate and the $10 million deposit.

You can potentially do even better than this attack if you can find an entity that is willing to exchange UTXOs with you in an independent manner. If you send them UTXO A in exchange for equal-valued B, you could perform a reorg and revert the spend of A but include B. You can also chain and repeat these spends: you could then take the change from the A spend tx, and spend B+change(A) in exchange for C. Then spend C+change(B), etc. If your counterparty never includes a transaction that depends on the spend of A, then by reorging the chain and blacklisting the original A spend, you can ultimately be left with A, B, C, D, E, and every other coin that was briefly in your possession.

Tl;dr: "But it can be profitably 51% or 101% attacked!" is not a good argument against a DAA. Bitcoin can be profitably 51% attacked. Performing 51% attacks for mining profit is simply outside the security model.

zawy12 commented 4 years ago

I am assuming the attacker was not previously mining BCH and came on with slightly more than 100% (101%) of the prior HR, so he now makes up 51% of the total HR (but my calculation of his profit is wrong like you said). I believe you're saying a true 51% makes it unprofitable for the other 49% by orphaning, so that difficulty drops to about 1/2 and the attacker gets 100% instead of 50%. But if difficulty drops just 5%, there will be > 3x more HR coming online in the case of BCH, so he will lose everything but a portion of the 5%. If my 3.5 day attack were possible on BCH (without the 10-block rule) and repeated more than once, it should be the same as I think you're describing, theoretically driving away the other 49% to get a true 100% like I calculated, except too much HR will come online to prevent that. In my scenario, the ending difficulty is high, so it's not the same, but it would cause oscillations.

I agree 37% like this is hardly a problem, even if the 10-block rule did not prevent it. A similar problem exists in Zcash and ETH, not to mention the unlimited blocks problem in most coins. But I wanted to show there actually is a shortcut for selfish miners in ASERT that doesn't seem to exist in the other algorithms.

jtoomim commented 4 years ago

Yes, it's worth being aware of.

I don't think it merits the term "selfish mining" though. That term should be reserved for <51% hashrate techniques.

zawy12 commented 4 years ago

Is that because Emin et al. introduced the term with their first paper? What do we call > 51% attacks? "Selfish" always seemed like a bad word for any of it because of the connotation. Private mining, block withholding, or majority mining would have been better. Are there examples of where < 51% attacks have occurred? There's a kind of logical inconsistency in the theory: if miners see a 25% or 33% attack, then they should form a slightly larger group to get the extra profit themselves. The back and forth incentives should lead to > 51% attacks. I've only seen > 51% attacks.

jtoomim commented 4 years ago

I hope someday it can be safely removed.

If we want to protect against the future-stamped parallel chain attack, there are some easy low-risk ways to do so that can replace the 10-block finalization rule. For example, when evaluating whether to reorg to an alternate chain, we could use chainwork*(chainwork/miner_income)^n instead of raw chainwork as the decision metric, with e.g. a 0.5 block hysteresis threshold (don't switch unless the newer chain has a > 0.5 block advantage). This should broadly discourage selfish mining strategies without affecting normal mining, since (a) normal mining doesn't face reorg rules except during rare orphan races whereas selfish mining strategies do, and (b) selfish mining strategies' raison d'être is decreasing the chainwork/income ratio.
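A minimal sketch (my own illustration of the metric described above; the function and variable names are assumptions, not proposed consensus code):

    def penalized_work(chainwork, miner_income, n=1):
        # chainwork * (chainwork / miner_income)^n : a chain that earned more income
        # per unit of work (i.e. mined at lower difficulty) gets discounted.
        return chainwork * (chainwork / miner_income) ** n

    def should_reorg(current, candidate, block_work, n=1, hysteresis=0.5):
        # current and candidate are (chainwork, miner_income) pairs measured since the
        # fork point; only switch if the candidate wins by more than `hysteresis` blocks of work.
        return penalized_work(*candidate, n=n) > penalized_work(*current, n=n) + hysteresis * block_work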

In the case of the 37.5% version of the future-stamped parallel chain attack @zawy12 described, this would mean that in order to gain 37.5% extra profitability, the attacker would need to do 37.6% more chainwork by the same clock time as the honest chain, which they should not be able to do: adding more blocks within the same timestamp limit would increase the diff, and reduce the profit. Requiring chainwork*(chainwork/miner_income)^1 would limit the profit to something like 37.5%/2 while requiring 100% + 37.6%/2 as much hashrate. (Those numbers are just a guess; I haven't done the simulations to verify.)

Alternately, rather than proportional penalization, we could block this attack by forbidding a reorg to a chain that (a) exceeds 24 hours of timestamp length since the most recent common ancestor, and (b) that skipped forward more than e.g. 50% of that interval in the first e.g. 10% of blocks.

Or we could do what I prefer, which is to do nothing about this attack as long as other more serious 51% attack types exist.

jtoomim commented 4 years ago

Are there examples of where < 51% attacks have occurred?

Yes, but AFAIK only by accident. Whenever orphan rates get high naturally, selfish mining "attacks" are automatically performed. This was significant on p2pool for a long time. Because p2pool used a "share" blockchain that was parallel to the main blockchain and contained all transactions, but with a shorter share/block interval than the main chain, it used to have much higher orphan rates. Large mining nodes on p2pool would consistently get 3% or better profitability advantages over smaller nodes. We eventually fixed that by improving performance and reducing share times.

We've also seen similar things on BTC in 2014-2015, when block sizes were getting above 500 kB but before we had widely deployed fast block propagation techniques.

"Selfish" always seemed like a bad word for any of it because of the connotation. Private mining, block withholding, or majority mining would have been better.

It's selfish because it modestly improves the revenue for the attacker at the expense of bystanders. But it's not an outright rewriting of history the way that 51% attacks are. Block withholding is also a decent term. Private mining is less clear, as it can also be interpreted as non-pooled mining rather than non-published mining. Majority mining is inaccurate, as it's specifically less than a majority (<51%, typically 30-40%), and not even necessarily a plurality. A 35% miner can effectively mine selfishly even in the presence of an honest unsophisticated 40% miner.

zawy12 commented 4 years ago

The best 10-block attack I could do with the current timestamp restrictions with ASERT-288 was 5% lower difficulty, beginning with a +(12+9)*T stamp followed by 9 blocks with +0 timestamps. It's block withholding, so the (12+9)*T stamp is allowed at the end. An attacker with 101% HR finishes his chain 5% earlier and submits it. Since he got 10 of the last 11 blocks at timestamp (12+9)*T, the public chain can't send an honest timestamp all the way back to where it should be (about -12*T at the 11th block) but is stuck at +1 second solvetimes (his last 4 or 5 are actually +1 second solves). He can get 5 of the next 10 blocks for avg 2% less D. Then he goes back to BTC as difficulty begins to rise from his HR.

we could use chainwork*(chainwork/miner_income)^n instead of raw chainwork

I assume we're discussing these things "to leave no stone unturned", as I think this is only needed for the 37.5%, which I don't think is a problem. The 5% attack needs to be reduced at least by lowering the FTL from 7200.

This chain work fix might allow a double-spending attack where the attacker pays more to mine to rewrite the chain. Devs needing to choose "n" without a direct requirement from theory is a red flag. But the idea is interesting (let the least profitable chain win). It could revive my idea of basing difficulty on time instead of blocks. But it seems wrong in this instance. The DA's job is to keep emission (blocks) on schedule. This fix seems to be doing the same but is outside the DA. It might be handled better inside the DA. A clip on drops would be an easier fix for both the 5% and 37%.

I have had a similar idea for securing alts by letting the tip with the highest chainwork/miner_income win. Honest miners use their fees to buy electricity and txs on BTC to notarize (timestamp) their block headers. The next miner on the alt chain includes the BTC header and Merkle tree data for the prior block's notarization to import BTC's work directly onto the alt chain. Nodes only need SHA256 to check the BTC work directly (only miners need access to BTC nodes). The alt's chain work is literally BTC's chain work without BTC, in exchange for the BTC tx fee. Chain work is the only rule but it indirectly enforces highest chainwork/miner_net_income wins. Big holders would want to mine simply to secure their holdings. The difficulty to get the reward (new coin generation as opposed to fees) would come from "self-hashing txs" that are set to BTC's difficulty/reward so that the alt is automatically pegged to BTC value. Unfortunately, BTC fees are not in the header.

jtoomim commented 4 years ago

Also worth noting: The 37.5% and 5% profitability scenarios that @zawy12 mentioned both get higher profitability via an externality: they end with higher difficulty for the attack chain than for the original chain. This gives a strong incentive for honest rational miners to not reorg to the attack chain despite it having slightly higher total work done, and encourages miners to discount the attack chain somehow. Perhaps chainwork_delta / end_difficulty could be a good heuristic for evaluating whether to reorg? That would make miners prefer chains that had most of their chainwork early on and at lower profitability, and which end with high profitability for anyone who wishes to switch to it.

zawy12 commented 4 years ago

both get higher profitability via an externality:

I'm glad you put it into words. It indicates ASERT may not be making a mistake. The 5% resorts to a different externality: it ends on a time at the FTL. The difficulty is slightly lower. The best way to reduce the 5% would be to tighten the FTL. With FTL at 600 seconds there is still a 1.9% gain possible (most of it is not due to FTL > 0, but the 37% effect). The 37% attack is exploiting a Poisson assumption in ASERT. In testing clipping just now, even if the clip were a tight 6*T, the attacker only needs to spread out the long initial solvetime, costing him only a little. So clipping is no good. BTW 30% gain is possible in 1.5 days and 20% in 0.75 days (ASERT-288 with FTL=7200). 17% in 0.83 days with FTL = 0.

In the 10-block attack I'm doubtful he can even get the 5% (or 1.9%) excess on average, because he risks 9 blocks of effort and a large percentage of the time the public chain's luck will be greater than his 5% advantage in time. So Emin's type of selfish mining trickery is more relevant, like you said.

I could not find any benefit to attackers with reverse times.

The attack starts with faking a really low HR for 1 or a few blocks, then faking a really high HR for many.

I thought about modifying your previous idea to 2 rules: a miner should not switch unless the alternate tip has higher chain work and higher chainwork/miner_income (as opposed to multiplying them). But this is still a modification to the strict "highest chain work wins" rule. (BTW, an equivalent way to state this rule, if chain work is calculated correctly, is that "highest time-weighted HR times total time" should win.) There is probably no way to avoid this single rule, even by my method of adding an additional requirement. The job of the DA is sort of to enforce things so the chain work measurement is as accurate as it can be. If it can't do it better, then maybe nothing external to the DA can help (tight timestamps assist the DA). In my 2-rule case, the problem is that a naturally higher HR will be blocked by the 2nd rule because the DA will lag in increasing difficulty, making chainwork/miner_income lower than it should be. Any work-around seems to require some heuristic that is going to have a problem or exploit.

It seems like the tip switching decision should be delayed to the 3rd block after first seeing the alternate tip. For example in the 10-block rule, before switching at block 10 the alternate tip must have been present at block 7. It's not much of a modification to what the DA is proclaiming but a way to make the measurement more accurate. It might have a problem of causing too many races that are more likely to need to go past the 10 block limit. So maybe don't apply it until the 7th block.

chainwork_delta / end_difficulty seems like it would block natural changes in HR or allow an exploit.

I do not see any good repair except for tightening FTL (for the 10-block case) and the 3-block rule. For more ideas on how to handle the 1-day attack, I would need to see why LWMA / WT can be so close to ASERT and yet not have the problem. Apparently they rise faster for short solvetimes. Any repair would be much better if it's inside the DA instead of looking at chainwork. But a repair will likely be like switching at least part of the way towards LWMA/WT. Switching to LWMA/WT requires blocking negative solvetimes because they make it rise really fast, which can be exploited.

jtoomim commented 4 years ago

It would be interesting to see a similar theoretical treatment of strategies for cw-144. Of course, these theoretical numbers (e.g. 1.9% and 5%) are smaller than the actual real-world gains being observed on mainnet by miners due to the difficulty oscillations.

zawy12 commented 4 years ago

CW-144: It looks like 50%, using initial st = 96*T then st=0 for 144 blocks, so it's 144 blocks in 96*T time. I'm surprised I had not noticed this before. As with ASERT, clipping only helps some, maybe down to as low as 35%. The current 10-block potential with timestamp manipulation is 10%.

I think I had estimated the current on-off mining was getting only 5% to 10% less. But someone targeting only the lowest D's could get more revenue/time.

ASERT-144 and LWMA N=60 also allow 10% in the 10-block attack. I still can't find a vulnerability in LWMA for the long range attack. It rises too quickly compared to how much it allows the single block to lower it.

EMA seems to be working same as ASERT in the attacks. I made a mistake: ASERT also loses 10% on the 10-block attack.

zawy12 commented 4 years ago

I made a mistake: ASERT also has 10% loss in the 10-block 100% attack.

I do not have any real new information in this post but wanted to show I'm exploring things. I support ASERT N=288 with no clipping, sequential timestamp consensus (or up minus 1 minute consensus (not in DA) like Jacob suggested) and FTL = 300.

Interestingly, WT is EMA with an adjustment towards an SMA calculation if the previous WT calculated target deviates from an SMA calculation, something I was thinking about to fix the 37% (but I won't pursue it). It is an EMA biased towards an SMA.

T = target, B = block time, t = solvetime

EMA: T[i+1] = T[i] * (1 + t[i]/B/M - 1/M), where M = tau/B

WT: replace the 1/M above with (SMA[i]/T[i]) / M, where SMA[i] is what a time-weighted-targets SMA for N = 2*M - 1 blocks calculates for T[i]. This version of WT has to first have the correct initial conditions.
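A sketch of my reading of those two update rules (my own code and my own interpretation of the time-weighted-targets SMA, not the actual WT implementation):

    def ema_next_target(prev_target, prev_solvetime, B, M):
        # EMA: T[i+1] = T[i] * (1 + t[i]/B/M - 1/M)
        return prev_target * (1 + prev_solvetime / B / M - 1 / M)

    def wt_next_target(targets, solvetimes, B, M):
        # WT as an EMA biased toward an SMA: replace 1/M with (SMA[i]/T[i]) / M,
        # where SMA[i] is a time-weighted-target SMA over the last N = 2*M - 1 blocks.
        N = int(2 * M - 1)
        window_tgt, window_st = targets[-N:], solvetimes[-N:]
        sma = sum(tg * st for tg, st in zip(window_tgt, window_st)) / sum(window_st)
        prev_target = targets[-1]
        return prev_target * (1 + solvetimes[-1] / B / M - (sma / prev_target) / M)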

Update: the previous testing that was here included WT but the result may have had some error. This update compares ASERT-288 (ASERT with a half-life of 288, which means parameter N = 288/ln(2) ≈ 416 in the equations) to ASERT, EMA, LWMA, SMA, and GRIN with "half lives" of half that much. In other words, ASERT N=208 is 2x the speed of "ASERT-288 (half life)" and is comparable to EMA N=208, LWMA N=416, SMA N=416, and GRIN with N=139 and damp factor 3 (3 * 139 ≈ 416). 1 million blocks each run (1 to 2 seconds for ASERT/EMA and ~4 seconds for Grin/LWMA on my 2012 desktop CPU, an i5-750).

DA, N, attack (size, start, stop), % stolen, % confirm delays
------ Attack size: 2.5 ------
------ Attack start: 0.95 ------
ASERT2_ 416 2.5, 0.95, 1.05 1.93 1.4
ASERT2_ 208 2.5, 0.95, 1.05 3.56 2.3
EMA_ 208 2.5, 0.95, 1.05 3.65 2.5
LWMA 416 2.5, 0.95, 1.05 3.50 2.2
GRIN 139 2.5, 0.95, 1.05 4.67 2.6
SMA 416 2.5, 0.95, 1.05 4.74 2.7
------ Attack start: 1.05 ------
ASERT 416 2.5, 1.05, 1.15 0.61 5.9
ASERT 208 2.5, 1.05, 1.15 1.65 6.5
WTEMA 208 2.5, 1.05, 1.15 1.67 6.7
LWMA 416 2.5, 1.05, 1.15 1.64 6.6
GRIN 139 2.5, 1.05, 1.15 4.2 7.9
SMA 416 2.5, 1.05, 1.15 5.7 8.5
------ Attack start: 1.15 ------
ASERT 416 2.5, 1.15, 1.25 0.44 10.5
ASERT 208 2.5, 1.15, 1.25 1.24 11.1
WTEMA 208 2.5, 1.15, 1.25 1.25 11.3
LWMA 416 2.5, 1.15, 1.25 1.26 11.2
GRIN 139 2.5, 1.15, 1.25 4.42 12.7
SMA 416 2.5, 1.15, 1.25 5.92 13.1
------ Attack start: 1.25 ------
ASERT 416 2.5, 1.25, 1.35 0.41 14.6
ASERT 208 2.5, 1.25, 1.35 1.08 15
WTEMA 208 2.5, 1.25, 1.35 1.08 15.2
LWMA 416 2.5, 1.25, 1.35 1.08 15.1
GRIN 139 2.5, 1.25, 1.35 4.56 16.7
SMA 416 2.5, 1.25, 1.35 5.55 17.1
------ Attack start: 1.35 ------
ASERT 416 2.5, 1.35, 1.45 0.42 17.6
ASERT 208 2.5, 1.35, 1.45 0.99 18.2
WTEMA 208 2.5, 1.35, 1.45 0.98 18.5
LWMA 416 2.5, 1.35, 1.45 0.98 18.2
GRIN 139 2.5, 1.35, 1.45 4.44 19.9
SMA 416 2.5, 1.35, 1.45 5.22 20.4
------ Attack size: 5 ------
------ Attack start: 0.95 ------
ASERT 416 5, 0.95, 1.05 1.32 1.6
ASERT 208 5, 0.95, 1.05 3.24 2.8
WTEMA 208 5, 0.95, 1.05 3.06 2.9
LWMA 416 5, 0.95, 1.05 2.98 2.7
GRIN 139 5, 0.95, 1.05 4.4 3.6
SMA 416 5, 0.95, 1.05 4.31 3.6
------ Attack start: 1.05 ------
ASERT 416 5, 1.05, 1.15 0.43 8.2
ASERT 208 5, 1.05, 1.15 1.26 9.2
WTEMA 208 5, 1.05, 1.15 1.38 9.3
LWMA 416 5, 1.05, 1.15 1.38 9.2
GRIN 139 5, 1.05, 1.15 4.21 11.1
SMA 416 5, 1.05, 1.15 5.93 11.8
------ Attack start: 1.15 ------
ASERT 416 5, 1.15, 1.25 0.29 15.6
ASERT 208 5, 1.15, 1.25 1.25 16.3
WTEMA 208 5, 1.15, 1.25 1.2 16.4
LWMA 416 5, 1.15, 1.25 1.17 16.3
GRIN 139 5, 1.15, 1.25 4.72 18.7
SMA 416 5, 1.15, 1.25 6.62 19.4
------ Attack start: 1.25 ------
ASERT 416 5, 1.25, 1.35 0.38 22.4
ASERT 208 5, 1.25, 1.35 1.09 23.3
WTEMA 208 5, 1.25, 1.35 1.09 23.3
LWMA 416 5, 1.25, 1.35 1.1 23.4
GRIN 139 5, 1.25, 1.35 5.11 25.7
SMA 416 5, 1.25, 1.35 6.27 26.1
------ Attack start: 1.35 ------
ASERT 416 5, 1.35, 1.45 0.36 29.2
ASERT 208 5, 1.35, 1.45 1.02 29.6
WTEMA 208 5, 1.35, 1.45 1.02 29.8
LWMA 416 5, 1.35, 1.45 1 29.5
GRIN 139 5, 1.35, 1.45 5.22 32.2
SMA 416 5, 1.35, 1.45 6.22 32.7
------ Attack size: 10 ------
------ Attack start: 0.95 ------
ASERT 416 10, 0.95, 1.05 1.39 1.7
ASERT 208 10, 0.95, 1.05 4.96 3.1
WTEMA 208 10, 0.95, 1.05 2.1 3.2
LWMA 416 10, 0.95, 1.05 2.03 3.1
GRIN 139 10, 0.95, 1.05 1.21 4
SMA 416 10, 0.95, 1.05 3.28 4.1
------ Attack start: 1.05 ------
ASERT 416 10, 1.05, 1.15 0.05 9.5
ASERT 208 10, 1.05, 1.15 -0.33 10.5
WTEMA 208 10, 1.05, 1.15 0.56 10.6
LWMA 416 10, 1.05, 1.15 0.14 10.5
GRIN 139 10, 1.05, 1.15 3.68 12.8
SMA 416 10, 1.05, 1.15 3.79 13.5
------ Attack start: 1.15 ------
ASERT 416 10, 1.15, 1.25 0.04 18
ASERT 208 10, 1.15, 1.25 1.45 19
WTEMA 208 10, 1.15, 1.25 1.57 19
LWMA 416 10, 1.15, 1.25 1.65 18.9
GRIN 139 10, 1.15, 1.25 4.46 21.8
SMA 416 10, 1.15, 1.25 6.66 22.5
------ Attack start: 1.25 ------
ASERT 416 10, 1.25, 1.35 0.36 26.5
ASERT 208 10, 1.25, 1.35 1.17 27.2
WTEMA 208 10, 1.25, 1.35 1.37 27.3
LWMA 416 10, 1.25, 1.35 1.29 27.2
GRIN 139 10, 1.25, 1.35 4.76 30.3
SMA 416 10, 1.25, 1.35 6.15 30.7
------ Attack start: 1.35 ------
ASERT 416 10, 1.35, 1.45 0.16 34.8
ASERT 208 10, 1.35, 1.45 0.89 35.5
WTEMA 208 10, 1.35, 1.45 1.02 35.7
LWMA 416 10, 1.35, 1.45 1.05 35.4
GRIN 139 10, 1.35, 1.45 5.19 38.5
SMA 416 10, 1.35, 1.45 5.92 39
jtoomim commented 4 years ago

Comments appreciated, especially if they're in time to update the article before wider dissemination:

https://read.cash/@jtoomim/bch-fork-proposal-use-asert-as-the-new-daa-1d875696

jacob-eliosoff commented 4 years ago

Had a quick skim - looks superb! I think this would be a really positive contribution to BCH, and is already a great contribution to blockchain research. I'll try to take a closer look in the next few hours if I get a chance. Thanks @jtoomim.

jacob-eliosoff commented 4 years ago

Comments below. I don't see anything requiring actual changes; the piece is good.

  1. Just for my reference, where could I read more about the "chain stalls completely if 100% of hashrate behaves rationally" claim? Is the idea that the longer they all hold out, the more money they all make?
  2. Under "reasons the bug is problematic" you could maybe also add that it leads to long stretches when expected confirmation time is worse than 2x (20 min). (I think.) But you do touch on it a bit later.
  3. "This instability can be avoided by having the influence of a block fade slowly over time" - yeah this is really the key change.
  4. Maybe mention explicitly somewhere the reason for the integer approximations: to avoid floating-point math and the hardware dependencies (and potential consensus failures) it introduces.
  5. "The debate between the two algorithms seemed to center around four issues" - your four issues look right to me. And indeed #⁠1 and #⁠2 are the most important. I'd argue #⁠2 (handling of edge cases) is the most important, the main reason I back asert[i].
  6. I'll just note that the for loop in exp_int_approx() is over (less than) the number of bits in the input. So it's a short loop, constant-time-ish. I still think your reasoning for choosing Mark's cubic is reasonable though.
  7. I have a vague preference for a shorter tau, because I think the benefits of responding quickly to actual sudden changes in hashrate/price - especially crisis situations - are underrepresented by the tests. I don't really see much downside to greater responsiveness except hashrate bouncing around a little more. So I'd lean towards a 1-day over a 2-day half-life. But this is just an "err in this direction" leaning, not something I've tested.
  8. I'm for lowering FTL (I think 2 hours is ridiculous) but of course there may be reasonable pushback against too many changes at once.
  9. Once we're considering travel at relativistic speeds we've probably been thorough enough!
zawy12 commented 4 years ago

To make sure I understand it, does the C++ code calculate 2^x and yet most of the testing above uses e^x so that N=288 in the testing above is N=288 * ln(2) = 200 in the code? Is N=288 in the code (as recommended in the article) actually N=416 if it was in terms of e^x?

If yes, then I'll agree N=288 half_life seems a little slow.

half_life = ln(2) * mean_lifetime

jtoomim commented 4 years ago

The python3 aserti3 implementation is also 2^x.

The C++ implementation's tau value is equivalent (with a small rounding error) to aserti3-416, which is this:

'tau': int(math.log(2) * IDEAL_BLOCK_TIME * 416),

That python3 version comes out to tau = 173009 seconds, whereas the C++ version is tau = 2*24*60*60 = 172800.

jtoomim commented 4 years ago

the "chain stalls completely if 100% of hashrate behaves rationally" claim

If the profitability is 95% for mining a block relative to BTC, it is more profitable for every miner to leave BCH for BTC. It is short-term irrational for any miner to mine a block at a short-term loss. In the absence of significant transaction fees, all non-RTT DAAs have this problem to some extent or another, but this miner irrationality is tested to a much greater extent with cw-144 than with a good DAA.

jacob-eliosoff commented 4 years ago

But if they all left then BCH would immediately become more profitable again so a bunch would come back... No? How does this lead to the chain stalling?

jtoomim commented 4 years ago

The profitability only goes up when the next block is mined. If all miners are 100% rational, that block never gets mined.

zawy12 commented 4 years ago

At the longer half life of 288, LWMA and WT don't do nearly as well as ASERT.

Std Dev of half life (mean lifetime) 288 versus 416 is 0.040 vs 0.033. Avg confirmation times under all attack conditions are about the same. The maximum increase in revenue for 288 versus 416, in a very rough sense across many attack scenarios, is 0.5% more revenue per time. The benefits of 416 over 288 seem minimal, but I do not have any reason to oppose 416.

jtoomim commented 4 years ago

Those were my thoughts too. Ultimately, what it came down to was that I wanted tau to be a nice round number in the C++ code, and with the 2^x form, exactly 2 days comes out to be "N"=415.5. That seemed just fine. "N"=207.75 seemed a little jumpy, but borderline acceptable. So I went with tau = 2 days rather than 1 day.

Std Dev of half life 288

Actually, the half-life of an e^(x/tau) EMA with tau=288*600 is ln(2)*288*600 seconds, or 199.62 blocks. The number 288 in this context would be the "time constant" -- or, to be a bit less ambiguous, the natural exponential time constant.

The half-life of a 2^(x/tau) EMA with tau=600*288 is simply 600*288. In this context, tau represents the half-life, not the natural exponential time constant. The natural exponential time constant is equal to tau/ln(2), or 415.496 blocks.

Perhaps the issue here is that with base 2 EMAs, we should not be using "tau" as the name, but instead we should be using "lambda" to denote the half-life?
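For reference, a small numeric check of the two parameterizations described above (my own sketch):

    import math

    T = 600                              # block time in seconds
    # 2^(x/tau) form (aserti3): tau is the half-life
    tau_2x = 288 * T
    print(tau_2x / math.log(2) / T)      # natural time constant ≈ 415.5 blocks
    # e^(x/tau) form: tau is the natural time constant ("mean lifetime")
    tau_ex = 288 * T
    print(math.log(2) * tau_ex / T)      # half-life ≈ 199.62 blocks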

jacob-eliosoff commented 4 years ago

The profitability only goes up when the next block is mined. If all miners are 100% rational, that block never gets mined.

But if everyone stops mining the BCH chain, then don't its block rewards become much easier to mine, and a rational miner will swoop in for them? As you know, profitability is not just about difficulty, but about competition. When half the current BCH hashrate leaves, doesn't mining it yield twice the (gross) revenue?

jtoomim commented 4 years ago

But if everyone stops mining the BCH chain, then don't its block rewards become much easier to mine, and a rational miner will swoop in for them?

No. The difficulty of getting a block is determined by the mining target, and nothing else. Without RTT, the mining target does not update until after that first block has been mined at a loss. If everyone is 100% short-term rational, then nobody will mine that one block, and will simply wait for someone else to do it. Bystander effect.

Selfish mining can break this impasse. But selfish mining is not 100% short-term rational. It's a medium-term strategy.

jacob-eliosoff commented 4 years ago

About whether to parameterize based on 2^x or e^x - I went back and forth on this a few times back in the day. I ended up going for e^x because, eg, the EMA where a 1-day-old block has 1/e the weight of a current block, is in some important mathematical sense equivalent to a 1-day SMA - something about how e^x is the only exponential whose integral from -inf to 0 is 1. (The integral of 2^x from -inf to 0 is 1/ln(2) = 1.44, and a 1-day half-life EMA is comparable to a 1.44-day SMA.) But I'm having trouble remembering right now why that's desirable... Though as I recall, 1-day "1/e-life" EMAs did look more comparable on the charts (eg in responsiveness) to 1-day SMA algos like wt, than 1-day half-life EMAs did?

Anyway for our purposes here it doesn't seem like a big deal - between 2^x and e^x is a constant scaling after all. Just worth remembering any time we want to compare "equivalent" EMAs and SMAs.

jacob-eliosoff commented 4 years ago

No. The difficulty of getting a block is determined by the mining target, and nothing else.

OK, I see your point, but this is a questionable definition of short-term greedy. If one of us has to spend money to break down a door so we can all access a vault of gold, then in some theoretical sense it may be rational to wait for someone else to do it. But if no one else does it, it's more rational - even in the short term - to spend the money and break down the door, than to never get the gold.

zawy12 commented 4 years ago

Actually, the half-life of an e^(x/tau) EMA with tau=288*600 is ln(2)*288*600 seconds, or 199.62 blocks.

Yeah, I wrote half life when I meant mean lifetime.

we should not be using "tau" as the name, but instead

I hate long variable names, but in this case I would call tau "half_life_in_seconds". The main reason is to show what this constant is and its relation to what we're doing, and to show it's not just some arbitrary value we came up with. I never liked lambda or tau. Especially not lambda, because the inverse of this constant is called lambda in both the exponential distribution (which is related to its use here) and the Poisson (which is more remotely related to its use here), and they are connected by this constant. We would be using this "λ" in this context like this: λ_exp = λ_poisson / λ_mean_lifetime_in_blocks

Developers should never choose coin total or emission rate, fees, block time, block size, etc. Those should be determined by proven theory that dictates the code. For example, if you include orphans via a DAG (measure DAG width and increase block time if avg width goes above some value like 1.1) or some other scheme, then you can measure propagation delays which then can continually adjust block time and block size. Market would determine coin emission and fees.

In this case, the ideal thing would be to try to let difficulty variation be an indication of exchange rate variation which would dictate half_life_in_seconds, assuming my theory that exchange rate and natural variation should be equal. Maybe more directly, the code should slowly search for a half_life that makes confirmation time as close to avg solvetime as possible. The developer would still have to choose a "rate of search" constant.

But getting back to reality, what we could do here is to write half_life as a function of expected daily or weekly exchange rate variation, if my theory that exchange rate and natural variation should be equal is correct (we make that proposal or assumption). Then we look at historical exchange rate data and determine the equation that gives the equivalent half_life. We can almost show ASERT is the one-and-only correct DA. With these 2 theories about what N should be and what the best math is, we've taken our opinions and testing out of the equation in selecting the DA. The code would just contain the theory and we just insert the historical exchange rate number.

zawy12 commented 4 years ago

Don't forget the peer time revert rule has to be reduced from 70 minutes to FTL/2 (or did someone tell me BCH does not do it like that?), and be prepared to immediately reject 2x and 1/2 limits or other clipping and median of 3.

@jtoomim What are your thoughts on the idea of using the longer form of ASERT

target_(N+1) = target_MTP * 2^( (t_N − 6*T − t_MTP) / half_life_seconds )

to make it work with Poisson pill? It's interesting that it prevents a negative in the exponential by using MTP and yet it does not have a delay. Also, if the testnet is already using an RTT to go to easy difficulty if the timestamp is > 20 minutes and that presents a problem for ASERT, it seems like a good time to just switch the ASERT to an RTT for that condition.

zander commented 4 years ago

Can someone give me a simple reason why FTL was supposed to be changed?

The only reason I've read so far (from toomims blog) is an attack to lower the difficulty by mining in the future. This attack is something I don't follow. First the actual lowering of difficulty due to mining 70min in future is nothing spectacular and you still need a huge chunk of the hashpower for it to be relevant if you want to create an alternative chain with more actual PoW than the "normal" chain. I don't recall anyone actually posting numbers on this, though.

But more importantly, doing secret mining implies you are not held to the FTL rule until you broadcast your block(s), making the FTL part of this traditional selfish mining attack rather irrelevant. You could trivially mine a block a day in the future and then mine for a full day until the time your first block becomes valid is reached, and release all you have at that point. See: the FTL part is irrelevant to this attack.

I don't know if FTL should be lowered, I'd love to see some rationale for it. So far I'm not sure I agree.

zawy12 commented 4 years ago

A secret mine has to comply with the FTL because the blocks will be rejected if the timestamps are past the FTL. With ASERT's half-life at 416 blocks a selfish mine can get 10 blocks at a 3.5% discount. The 3.5% is not much of a problem compared to the miner doing a 10-block withholding in the first place. So maybe there is no clear need to reduce the FTL unless there is some unforeseen attack. But what's the rationale for keeping it at 2 hours? The revert to peer time rule is 70 minutes which is different from FTL and needs to be about FTL/2 or have peer time removed.

zander commented 4 years ago

A secret mine has to comply with the FTL because the blocks will be rejected if the timestamps are past the FTL.

This is half true. In this case the qualification is super important. These blocks will be rejected only if they are past FTL at time of broadcast to the network. Yes, this cheating miner needs to alter his mining-node to disable FTL, but this task is not a deterrent to protect us from this attack.

With ASERT's half-life at 416 blocks a selfish mine can get 10 blocks at a 3.5% discount.

Thanks for sharing that, these numbers are useful to calculate the maximum damage.

But what's the rationale for keeping it at 2 hours?

That we don't just go changing things unless there is a demonstrated need to change them :)

So maybe there is no clear need to reduce the FTL

Ok, then we agree.

zawy12 commented 4 years ago

A secret mine has to comply with the FTL

This is half true. In this case the qualification is super important

The exception you have in mind is not possible in BCH because of the 10-block rule, unless the 10-block rule is attacked. It can be important in coins that do not have the 10-block rule and use SMA or ASERT, but not LWMA. In SMA the attacker can get 50% more blocks than the 100% of a simple private mine, and 37% more in ASERT. It is super important in coins that have timespan limits (4x and 1/4 in BTC/LTC and 3x & 1/3 in Dash) and do not force sequential timestamps (like Digishield does indirectly with MTP) or otherwise limit how negative the "net solvetime" of each block is (compared to the previous block). In that attack, the attacker can get unlimited blocks in < 3x the averaging window.

jtoomim commented 4 years ago

In telegram, zawy12 pointed out an off-by-one error in Mark Lundeberg's ASERT paper.

With the off-by-one error in the exponent (20k blocks):

| Algorithm | Avg block interval (sec) | Avg conf time (sec) | Greedy | Variable | Steady | Advantage |
|---|---|---|---|---|---|---|
| asert-144 | 599.48 | 752.22 | -0.072% | -0.012% | -0.981% | 0.969% |
| asert-288 | 598.87 | 652.21 | -0.001% | -0.014% | -0.620% | 0.619% |
| asert-407 | 598.42 | 643.52 | 0.022% | -0.014% | -0.604% | 0.626% |
| aserti3-144 | 599.48 | 752.28 | -0.072% | -0.012% | -0.981% | 0.969% |
| aserti3-288 | 598.87 | 652.21 | -0.001% | -0.014% | -0.620% | 0.619% |
| aserti3-416 | 598.39 | 643.19 | 0.024% | -0.014% | -0.604% | 0.628% |
| cw-144 | 604.69 | 1730.40 | 0.626% | 0.108% | -9.336% | 9.962% |
| lwma-144 | 600.20 | 756.16 | -0.069% | -0.011% | -1.003% | 0.991% |
| lwma-288 | 599.90 | 658.75 | -0.004% | -0.014% | -0.640% | 0.635% |

With the fix:

| Algorithm | Avg block interval (sec) | Avg conf time (sec) | Greedy | Variable | Steady | Advantage |
|---|---|---|---|---|---|---|
| asert-144 | 599.45 | 752.23 | -0.072% | -0.012% | -0.981% | 0.970% |
| asert-288 | 598.84 | 652.22 | -0.001% | -0.014% | -0.620% | 0.619% |
| asert-407 | 598.39 | 643.53 | 0.022% | -0.014% | -0.605% | 0.626% |
| aserti3-144 | 599.45 | 752.27 | -0.072% | -0.012% | -0.981% | 0.970% |
| aserti3-288 | 598.84 | 652.21 | -0.001% | -0.014% | -0.620% | 0.619% |
| aserti3-416 | 598.36 | 643.26 | 0.024% | -0.014% | -0.604% | 0.628% |
| cw-144 | 604.69 | 1730.40 | 0.626% | 0.108% | -9.336% | 9.962% |
| lwma-144 | 600.20 | 756.16 | -0.069% | -0.011% | -1.003% | 0.991% |
| lwma-288 | 599.90 | 658.75 | -0.004% | -0.014% | -0.640% | 0.635% |

So it's pretty inconsequential in terms of performance. (It should only affect the first few blocks after the fork anyway.) But including the fix does save at least 4 bytes in the source code -- a pair of parentheses, and a +1 -- so the fix is clearly worthwhile for the disk space savings if nothing else. Western Digital hates me, I'm sure.

https://github.com/jtoomim/difficulty/commit/5427538f79894fc12414841ac2418ec7685cbdf3

jtoomim commented 4 years ago

So maybe there is no clear need to reduce the FTL

I agree. It's more of a want than a clear need.

aserti3-415.5 is pretty good against < 10 block selfish mining reorg attacks without an FTL reduction, so the FTL is not a dealbreaker. It helps a bit to reduce the FTL, and the costs are small, so I think it's a good idea to do it.

There's no need to make this a packaged deal. I think that the FTL should be reduced even if we stick with cw-144, and I think that we should switch to aserti3 or wtema even if we keep the FTL. There is a small synergy for an FTL reduction if we're going to be using an algorithm without a median-of-three prefilter, though.

A selfish miner can get a 2.8% difficulty reduction on the 2nd and later blocks in a ≥2-block secret chain/reorg attack with aserti3-416 or wtema-416. A selfish miner can get an 8.3% difficulty reduction on the 3rd and later blocks in a ≥3-block secret chain/reorg attack with cw-144+MTP3.
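My own back-of-envelope arithmetic (an assumption about where a number like the 2.8% could come from, not necessarily jtoomim's calculation): a secret chain can date its first block FTL seconds into the future, which eases the target for the blocks that follow it:

    import math

    T, N, FTL = 600, 416, 7200                 # block time, mean lifetime in blocks, future time limit
    wtema_factor = 1 + FTL / (T * N)           # ≈ 1.0288
    asert_factor = math.exp(FTL / (T * N))     # ≈ 1.0293
    print(1 - 1 / wtema_factor, 1 - 1 / asert_factor)   # ≈ 2.8% lower difficulty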

jacob-eliosoff commented 4 years ago

The usual tradeoff is 1. It's safer to roll out changes separately vs 2. It's easier (and in another sense safer) to fork twice than once. And the usual correct engineering choice is that 1 should take precedence. I think the current FTL is pointless but there's definitely a good argument for leaving out that change for now.

tromp commented 4 years ago

Okay, new settings for N etc:

wt-144: 'block_count': 144*2
lwma-144: 'n': 144*2,
asert-144: 'tau': (IDEAL_BLOCK_TIME * 144),
wtema-144: 'alpha_recip': 144, 

These settings result in the step response for all four algorithms clustering closely for each responsiveness setting:

[image: step responses of the four algorithms]

N_wt = 2.64 * N = 549

This seems wrong. Including the extra 32% (190/144) seems to mess it up. If I use that 32% factor, WT ends up as a clear outlier on the impulse response. I've removed it.

Dear Jonathan,

how can I reproduce these plots with the help of your comparison tool at https://github.com/jtoomim/difficulty/tree/comparator ? is this something you can define a scenario for?

jtoomim commented 4 years ago

Hi @tromp!

That image is just zoomed in a bunch on the revenue ratio graph. There's nothing special about it. Just do a simulation run with the algorithms you want, then take a look at the revenue ratio graph, and click-and-drag to zoom in on a spot where the price (black line) jumps suddenly.

E.g.:

[image: revenue ratio graph with a sudden price jump]

Zooms to:

[image: zoomed-in view of the revenue ratio graph]

This zoom doesn't look as pretty as the one you quoted me as doing earlier, but that's just because it doesn't have as many algorithms, and one of them (cw-144) is wacky.

tromp commented 4 years ago

Here's one with several Grin variants:

[image: step responses of several Grin variants (screenshot, 2020-08-02)]

Grin has a damping factor in its DAA

damped_ts = (1 * delta_ts + (dampen - 1) * (n * IDEAL_BLOCK_TIME) ) // dampen

I previously assumed that an n-block d-damped filter had the same half-life as an n*d undamped filter, but the above responsiveness plot appears to contradict that, with e.g. grin-48x3 being noticeably slower than grin-144.
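For context, a minimal sketch of how I read Grin's damped filter feeding an SMA-style difficulty calculation (my own simplification, not Grin's actual source; it omits the minimum-difficulty clamp):

    def grin_next_difficulty(difficulties, timestamps, n, ideal_block_time, dampen=3):
        diff_sum = sum(difficulties[-n:])                   # sum of the last n difficulties
        delta_ts = timestamps[-1] - timestamps[-(n + 1)]    # raw timespan they cover
        # damp the timespan toward the ideal n * ideal_block_time
        damped_ts = (1 * delta_ts + (dampen - 1) * (n * ideal_block_time)) // dampen
        return diff_sum * ideal_block_time // damped_ts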

zawy12 commented 4 years ago

I previously assumed that an n-block d-damped filter had the same half-life as an n*d undamped filter,

I've claimed this in our conversations, so if the idea's wrong I can be blamed.

It depends on how you test it. They have the same std dev of difficulties if hashrate is constant. For smallish changes in hashrate I think they will respond similarly. For large changes in hashrate the dampening factor will have a longer delay than this implies to fully reach the change, but it responds more in the short term, which gives it a better response than SMA due to the non-linear response of miners. That is, there might be a 2x increase in hashrate if diff drops 5%, and a return to normal after a 10% rise. Dampening with the much smaller averaging window will do the 10% rise more quickly, but then it is stubborn to go higher. SMA responds more slowly, then over-responds.

I don't understand how an SMA-144 (Grin-144-1) can respond fully in only 50 blocks. What is the 144? I don't like converting to half lives or even the more mathematically pure (e^x) mean lifetimes but just use averaging window lengths and use mean lifetimes for WTEMA/ASERT.

What is the averaging window for Grin-48-3? I didn't think it could drop off a cliff like that which is like an SMA.

zawy12 commented 4 years ago

The grin-48-3 was at a big disadvantage by being at the correct value before the step response. Immediately after the step it drops a lot quickly but then slows up. The gap increases at the end (which I described above as normally being beneficial).

tromp commented 4 years ago

I don't understand how an SMA-144 (Grin-144-1) can respond fully in only 50 blocks. What is the 144? I don't like converting to half lives or even the more mathematically pure (e^x) mean lifetimes but just use averaging window lengths and use mean lifetimes for WTEMA/ASERT.

What is the averaging window for Grin-48-3? I didn't think it could drop off a cliff like that which is like an SMA.

I've been changing the naming convention from time to time. Originally, grin-n-d denoted an n block sma with damping factor d. So Grin's DAA was grin-60-3. Then I noticed the convention of having the first parameter denote the half-life, which for sma is only half the number of blocks, so I made grin-hxd denote an sma with 2*h blocks and damping factor d. That's the naming used in the picture above. Now that I see the poor correspondence, I will revert to the old naming scheme.

I think that any of wtema-120 (i.e. alpha = 1/120), wtema-180, or wtema-240, with a 2 to 4 hour half life, would be good choices for Grin. What do you think?

zawy12 commented 4 years ago

Yes, that's what I would use. WTEMA alpha=1/180 (mean lifetime 180) will have the stability of Grin-120-3. It will be noticeably smoother even if you did not currently have oscillations caused by hashrate changes. Jonathan has been wanting 416 for BCH. I think he'll say 208 is at his lower end of acceptability. That is his position for BCH, with good liquidity and 10 minute blocks. He might choose a larger value for 1 minute blocks. I know my LWMA with N=60 has been at least 2x too fast for 2 minute blocks. I should have been using at least 120, and probably 180, if not 360. If LWMA N=180 were at the low end of what you need, especially since these are 1 minute blocks, then that would imply WTEMA 1/alpha = 90. Given that Jonathan would opt for at least 200 and I feel safe with 180, I think 180 is good (which is like LWMA N=360). But it's easily possible that 360 is better. I know from experience with LWMA that WTEMA-90 is going to work very well (better than my LWMA N=60's) and be safe. I believe 2x will work better and still be safe, especially with 60 second blocks. Jonathan's point of view is that he knows from testing that 2x to 4x (if not higher) is better and safer.

Be aware that WTEMA has the same problem as relative ASERT if there are rounding errors. For example, there is a range of nBits that has an average of 1/2 of 2^(-15) truncation error (when its 3 mantissa bytes are from 0x008000 to 0x008FFF). For WTEMA-180 this means 180 * 1/2 * 2^(-15) ≈ 0.27% harder targets and therefore longer solvetimes. Notice the error is 2x if 360 is used. This is if the target (or difficulty) is predominantly being rounded either up or down. This amplification of target error does not occur from solvetime error because that is divided by the 180 before it adjusts the target.

It should get ahead of / behind the emission schedule if hashrate keeps going up/down, in the same way ASERT does: blocks ahead = 180 * ln(future_difficulty / current_difficulty). This is not easy to correct. Notice the error is 2x if 360 is used.

I do not know what your future time limit (FTL) is (I'm sure we discussed it before you launched) but I would have it less than the target timespan of 10 blocks if not 1/3 of 1 block.

I want to describe my thinking on testing and choosing algorithms. Definitely WTEMA has the best simplicity and performance. The only drawback to LWMA is that a 50% attack for more coins can result in getting 138% instead of 100% of blocks in a given time. This is a real concern for some coins and I've looked for ways to reduce it.

I think the best algo is the one that has the fastest response for a given level of stability during constant hashrate. Response speed and stability are conflicting goals because they go as N and 1/SQRT(N). I rank the algos according to which have the fastest response to a step change in hashrate for a given Std Dev in difficulty under constant hashrate... as long as they don't overshoot. If you overshoot you can have an exponential increase in hashrate, so there should be a bias towards wanting more stability. Once the best algo is "determined", there's the question of how to balance stability and speed. Choosing N is a lot harder than choosing the algorithm. Intuition and experience tell me the stability during constant hashrate needs to be about 1/2 to 1/4 the Std Dev of the exchange rate changes (as it oscillates around an average value for a given week, beginning and ending around the same exchange rate so that it does not include long-term upward trends that throw off Std Dev). The idea is that we want to respond as fast as possible without motivating switch mining just from the Std Dev of difficulty (random variation). Miners are looking for the lowest difficulty/reward, so an exchange rate change is the same as a difficulty change in terms of miner motivation. So you could look at historical exchange rate data and choose an N that gives 1/2 to 1/4 the exchange rate Std Dev.

My testing and Jonathan/Kyuupichan's testing of miner motivation are very different but generally reach the same conclusions. The results seem consistent with the above.

jtoomim commented 4 years ago

I'd say the choice of time constant for wtema for grin would depend on whether grin is going to be sharing an ASIC hash function or the same hardware with other coins or not, what their relative sizes are, and what the total size of the grin mining economy is. Basically, the questions you need to be asking are:

How liquid is the hashpower market?

How much do you expect a 1% change in profitability to change hashrate by?

Do you expect hashrate vs profitability to be linear, quadratic, exponential, or a threshold function? I.e. do you expect pools like prohashing to jump in for a single block with 10x hashrate, then leave, or do you expect them to slowly allocate hashrate proportionally?

When coins get large, their miners tend to use smarter and more game theoretically optimal strategies for competitive hashrate markets. These tend to be more like the "variable" hashrate in my sim. But when coins are small, miners tend to use simple greedy strategies, which is what zawy's sims emphasize.

@tromp if you tell me a bit more about grin's current hash markets (or point me to an article or something), I can give you more specific recommendations for simulation parameters and for which types of algorithms are likely to suit you best.

@zawy12 Just because I like 416 for BCH does not mean I think it's the best choice for everyone. BCH is huge compared to most of the altcoins you deal with, and the hashrate market is very different. We're probably going to make far more similar recommendations on this one than you expect.