HathorNetwork / hathor-core

Hathor core is the official and reference client for operating a full node in Hathor Network.
https://hathor.network
Apache License 2.0
83 stars 26 forks source link

[Design] Fix "can start mining" condition (aka "is sync in progress") #152

Open jansegre opened 4 years ago

jansegre commented 4 years ago

Context

This issue brings up problems @msbrogli and I found when making a thorough review of #94 and proposes a more comprehensive take to solve this problem without causing unacceptable setbacks.

The main problem that arose from #94 was altering hathor.p2p.node_sync.NodeSyncTimestamp.is_synced with the purpose of improving hathor.p2p.manager.ConnectionsManager.has_synced_peer, however it is also used in hathor.p2p.node_sync.NodeSyncTimestamp.send_tx_to_peer_if_possible, which has a more precise requirement and because of a misunderstanding of what the method was supposed to do, it was erroneously "fixed" a couple times during #94.

Splitting is_synced into two methods is only a slight part of the solution, because the bigger picture encompasses other problems with the current approach. The testing improvements made from #94 have been salvaged on #151, as well as improved docstring on is_synced to help clarify its intended function.

The Design

The primary goals for this design are to:

  1. be a superior alternative to --allow-mining-without-peers, otherwise people will use this flag and we don't want that;
  2. still work if the hashrate drops substantially for an extended period of time;
  3. not be vulnerable to stalling by peers with faking data;
  4. be fast, that is, not taking longer than it does to sync with the rest of the network (a condition that goes hand-in-hand, that is simple to define, but hard-to-impossible to measure reliably (loops back to this very problem)).
  5. minimize the change of producing orphan blocks

Proposal (A v1)

There is only a single transition (out of sync -> in sync). We can extend this in the future to support the reverse (in sync -> out of sync) by detecting network issues.

A time duration T will be defined. When latest_timestamp + T > current_timestamp the transition is triggered.

Discarded proposal (B)

Use the activity of the downloader to estimate when we have reached sync and are just receiving new stuff. Currently this is very clearly distinguished by looking at the download queue. A potential problem is when the network is very busy, our parameters may never detect "synced" state.

Discarded proposal (C)

Estimate the probability of there not being new blocks based on the last block we received (and considering that any node can mine on the block-min-diff), and including a propagation time (or timeout). Mathematically this would be a function f(p, t), p being the probability of finding a block, and t being the time since the last block, f is a boolean (alternatively it can be a float in [0, 1] that we can collapse to a boolean). The main downside is that it is a bit convoluted and there is no proposal to what f would look like.

obiyankenobi commented 3 years ago

When latest_timestamp + T > current_timestamp the transition is triggered

@jansegre can you define latest_timestamp and current_timestamp?

obiyankenobi commented 3 years ago

Sounds good enough to me. As I understand, T will have to be pretty large to allow for hashrate drops. Do you have a number in mind?

obiyankenobi commented 3 years ago

Moving this to backlog as we're changing the sync algorithm and it will impact this.

jansegre commented 1 week ago

This has changed after sync-v2, but it would be useful to review the "can start mining" condition, although it has been very stable.