Closed theoreticalbts closed 3 years ago
This seems weird, because it should be triggering a rescan every 20 ms, which should cause the download manager to download the missing block from someone else.
The error message here is triggering, and then the very next thing we do is a `select` which selects the `ctx.Done()` case, which we know doesn't block because the context was already reported to be done by the error message.
So the `downloadResponseChan` never gets any response, leading the `BlockDownloadManager` to believe the download is still in progress. Forever. The result is a stuck sync.
What does happen is that the `PeerHandlerLoop`, running in a separate goroutine, picks up on the closed context here and sends it to the `PeerIsClosedChan`, which is drained by the `BlockDownloadManager` here.
The solution is then for `BlockDownloadManager` to clear the peer's in-flight requests when it receives a peer disconnection notification on `PeerIsClosedChan`. I think we only need to clear requests that are in `Downloading`, as requests that have made it to `Applying` or `WaitingToApply` shouldn't need the peer to be connected to finish their processing.
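
Something along these lines is what I have in mind. This is a sketch under assumptions, not the real `BlockDownloadManager`: the `manager` struct, its `inFlight` map, and `clearPeerDownloads` are hypothetical, and only the three state names come from the discussion above.

```go
package main

import (
	"fmt"
	"sort"
)

type requestState int

const (
	Downloading requestState = iota
	WaitingToApply
	Applying
)

// request is a hypothetical record of one in-flight block request.
type request struct {
	peerID string
	state  requestState
}

// manager is a heavily simplified stand-in for BlockDownloadManager.
type manager struct {
	inFlight map[uint64]*request // keyed by block number
}

// clearPeerDownloads drops every request that is still Downloading from the
// given peer, so a later rescan can request those blocks from someone else.
// Requests in WaitingToApply or Applying are left alone. It returns the
// cleared block numbers in ascending order.
func (m *manager) clearPeerDownloads(peerID string) []uint64 {
	var cleared []uint64
	for blockNum, req := range m.inFlight {
		if req.peerID == peerID && req.state == Downloading {
			delete(m.inFlight, blockNum) // deleting during range is safe in Go
			cleared = append(cleared, blockNum)
		}
	}
	sort.Slice(cleared, func(i, j int) bool { return cleared[i] < cleared[j] })
	return cleared
}

func main() {
	m := &manager{inFlight: map[uint64]*request{
		10: {peerID: "peerA", state: Downloading},
		11: {peerID: "peerA", state: Applying},
		12: {peerID: "peerB", state: Downloading},
	}}

	peerIsClosedChan := make(chan string, 1)
	peerIsClosedChan <- "peerA" // disconnection notification

	peerID := <-peerIsClosedChan
	fmt.Println("cleared:", m.clearPeerDownloads(peerID)) // prints "cleared: [10]"
	fmt.Println("still tracked:", len(m.inFlight))        // prints "still tracked: 2"
}
```

Once the entry is gone, the manager no longer thinks the block is being downloaded, so the periodic rescan is free to pick a different peer for it.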
I spun up a new node and the following sequence of events occurred:
QmcgPdf...
peer `Qmcg` gossips block 76176 to me.
`Qmcg`.

Related: #144
Logs: