ledgerwatch / erigon

Downloader: wrong `complete` calc when node synced #9940

Open AskAlexSharov opened 2 months ago

AskAlexSharov commented 2 months ago

An existing node (which has all files) is already executing blocks, but I still see a lot of debug logs from the downloader:

[DBUG] [04-15|08:53:03.543] [snapshots] webseed peers                file=history/v1-storage.2432-2464.v
[DBUG] [04-15|08:53:03.543] [snapshots] bittorrent peers             file=history/v1-storage.2432-2464.v erigon: 2.60.1-dev-c3824cf5=0B/s erigon: 2.60.1-dev-c3824cf5=0B/s
[DBUG] [04-15|08:53:03.544] [snapshots] progress                     file=idx/v1-accounts.2432-2464.ef progress=100.00% peers=0 webseeds=0

The reason is:

        var torrentComplete bool
        torrentName := t.Name()

        if _, ok := downloading[torrentName]; ok {
            torrentComplete = t.Complete.Bool()
        }

A file can be complete even if we did not download it ourselves (e.g. when erigon is started on already-downloaded files).
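To make the failure mode concrete, here is a minimal sketch of the gating logic in the snippet above. `fakeTorrent`, `gatedComplete` and `ungatedComplete` are hypothetical names invented for illustration; they are not erigon or anacrolix/torrent APIs, and the real `Complete` flag is modelled as a plain bool.

```go
package main

import "fmt"

// fakeTorrent is a hypothetical stand-in for a torrent handle.
// It is NOT the anacrolix/torrent type; complete models t.Complete.Bool().
type fakeTorrent struct {
	name     string
	complete bool
}

// gatedComplete mirrors the snippet above: the completion flag is only
// consulted for torrents that are tracked in the downloading set.
func gatedComplete(t fakeTorrent, downloading map[string]struct{}) bool {
	var torrentComplete bool
	if _, ok := downloading[t.name]; ok {
		torrentComplete = t.complete
	}
	return torrentComplete
}

// ungatedComplete (an assumption, not the project's fix) reads the flag
// regardless of whether this process ever downloaded the file.
func ungatedComplete(t fakeTorrent) bool {
	return t.complete
}

func main() {
	// A file that was already on disk when the node started.
	onDisk := fakeTorrent{name: "idx/v1-accounts.2432-2464.ef", complete: true}
	downloading := map[string]struct{}{} // nothing in flight after restart

	fmt.Println(gatedComplete(onDisk, downloading)) // false
	fmt.Println(ungatedComplete(onDisk))            // true
}
```

With the gate in place, a torrent that is complete on disk but absent from the downloading set is never treated as complete, which would be consistent with the repeated debug output on a synced node.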

AskAlexSharov commented 2 months ago

The code `if downloading && t.Complete.Bool() {` in `mainLoop` is probably also buggy for the same reason.

AskAlexSharov commented 2 months ago

Maybe let's remove `d.downloading`? And the new completeness markers in the db?

mh0lt commented 2 months ago

We can do this. It will just re-introduce premature downloader completion.

The underlying issue here is that Torrent.Complete is itself unreliable.

I think a better way of doing this is to examine `Torrent._completedPieces`, which is the actual source of truth and doesn't depend on the torrent code remembering to update `Complete`, which I don't think it always does correctly.

Let's see what bugs emerge from re-testing after this change.

Or use:

func (t *Torrent) pieceCompleteUncached(piece pieceIndex) storage.Completion {
    if t.storage == nil {
        return storage.Completion{Complete: false, Ok: true}
    }
    return t.pieces[piece].Storage().Completion()
}

for all pieces, which may be too slow.
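As a rough illustration of what "for all pieces" could look like, the sketch below scans every piece and stops at the first one that is incomplete or unknown. The `completion` struct only copies the `Complete`/`Ok` field shape from the function quoted above; `pieceStore`, `sliceStore` and `allPiecesCompleteUncached` are hypothetical names, not library APIs.

```go
package main

import "fmt"

// completion copies the field shape of storage.Completion quoted above.
// It is a local stand-in, not the library type.
type completion struct {
	Complete bool
	Ok       bool
}

// pieceStore is a hypothetical interface for asking storage about one piece.
type pieceStore interface {
	Completion(piece int) completion
}

// sliceStore is a toy in-memory store used only for this illustration.
type sliceStore []bool

func (s sliceStore) Completion(piece int) completion {
	return completion{Complete: s[piece], Ok: true}
}

// allPiecesCompleteUncached asks storage about every piece and returns
// false at the first piece that is missing or whose state is unknown.
// Cost is one storage lookup per piece in the worst case.
func allPiecesCompleteUncached(st pieceStore, numPieces int) bool {
	for i := 0; i < numPieces; i++ {
		c := st.Completion(i)
		if !c.Ok || !c.Complete {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(allPiecesCompleteUncached(sliceStore{true, true, true}, 3))  // true
	fmt.Println(allPiecesCompleteUncached(sliceStore{true, false, true}, 3)) // false
}
```

The early exit keeps the check cheap for incomplete torrents, but a fully complete torrent still costs one lookup per piece, which is the slowness concern raised above.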

The problem is likely to result from this code:

func (t *Torrent) haveAllPieces() bool {
    if !t.haveInfo() {
        return false
    }
    return t._completedPieces.GetCardinality() == bitmap.BitRange(t.numPieces())
}

which returns true when it should be false.

But I don't know when this happens.
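One way to narrow down when this happens might be a debug-only cross-check that compares the cached completed set against what storage reports. The sketch below is a toy model under stated assumptions: `torrentModel`, its fields and `staleCompletedPieces` are hypothetical, and a plain Go map stands in for the bitmap behind `_completedPieces`.

```go
package main

import "fmt"

// torrentModel is a hypothetical model of the relevant state.
// completed stands in for the cached set of completed piece indexes;
// onDisk is what storage would report for each piece.
type torrentModel struct {
	haveInfo  bool
	numPieces int
	completed map[int]struct{}
	onDisk    []bool // must have at least numPieces entries
}

// haveAllPieces follows the quoted logic: compare the cardinality of the
// cached set with the number of pieces.
func (t *torrentModel) haveAllPieces() bool {
	if !t.haveInfo {
		return false
	}
	return len(t.completed) == t.numPieces
}

// staleCompletedPieces returns, in ascending order, the piece indexes that
// the cached set marks complete while storage says they are not.
func (t *torrentModel) staleCompletedPieces() []int {
	var stale []int
	for i := 0; i < t.numPieces; i++ {
		if _, cached := t.completed[i]; cached && !t.onDisk[i] {
			stale = append(stale, i)
		}
	}
	return stale
}

func main() {
	t := &torrentModel{
		haveInfo:  true,
		numPieces: 3,
		completed: map[int]struct{}{0: {}, 1: {}, 2: {}},
		onDisk:    []bool{true, false, true},
	}
	// The cached view claims completeness although piece 1 is missing.
	fmt.Println(t.haveAllPieces(), t.staleCompletedPieces()) // true [1]
}
```

Logging a non-empty result alongside the torrent name at the point where completion is decided could help pin down which code path leaves the cached set out of date.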