filecoin-project / lotus

Reference implementation of the Filecoin protocol, written in Go
https://lotus.filecoin.io/

Daemon/miner cannot be restarted properly and caught up in drand fetching randomness #4096

Closed jennijuju closed 3 years ago

jennijuju commented 4 years ago

Lotus version: Daemon: 0.8.0+debug+git.2c1d96bc.dirty+api0.16.0 Local: lotus version 0.8.0+debug+git.2c1d96bc.dirty

Running on devnet. The daemon was stopped without the miner being stopped first. After restarting the daemon, it is stuck looping on:

2020-09-28T19:23:22.023-0400    INFO    drand   log/log.go:109      {"level": "info", "optimizing_client": "watch ended", "client": "HTTP(\"https://api.drand.sh/\").(+verifier)"}
2020-09-28T19:23:23.139-0400    INFO    drand   drand/drand.go:136  start fetching randomness   {"round": 196806}
2020-09-28T19:23:23.139-0400    INFO    drand   drand/drand.go:146  done fetching randomness    {"round": 196806, "took": 0.000054153}

Meanwhile, the miner logged:

2020-09-28T19:25:43.221-0400    WARN    miner   miner/miner.go:215  BestMiningCandidate from the previous round: [bafy2bzacebkcqpntw2zlt4j7tx52fc3sljpez6rpdkjhuswcyhba2iahlnz36] (nulls:0)

and it is no longer producing blocks.

Restarting the miner then gave:

2020-09-28T19:26:24.963-0400 ERROR miner miner/miner.go:268 <!!> SLASH FILTER ERROR: produced block would trigger 'double-fork mining faults' consensus fault; miner: t01000; bh: bafy2bzacec2b4kapukviy5f4fc6pq7v7m5welstomrejasdt6khyw76ox6oac, other: bafy2bzacec2b4kapukviy5f4fc6pq7v7m5welstomrejasdt6khyw76ox6oac
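
For readers unfamiliar with the error above: a "double-fork mining fault" means the same miner produced two different blocks at the same epoch. The following is a minimal sketch of that idea only, assuming an in-memory map keyed by miner and epoch. The names `SlashFilter`, `NewSlashFilter`, and `Check` are hypothetical and are not Lotus's actual types or API. Note that in the log above `bh` and `other` are the same CID. This sketch treats an identical CID as harmless, which is an assumption of the sketch and may not match what the reported Lotus version does.

```go
package main

import "fmt"

// key identifies one mining opportunity: a miner at a given epoch.
type key struct {
	miner string
	epoch int64
}

// SlashFilter is a hypothetical in-memory filter (not the Lotus type).
// It remembers which block CID a miner already produced at each epoch.
type SlashFilter struct {
	seen map[key]string
}

// NewSlashFilter returns an empty filter.
func NewSlashFilter() *SlashFilter {
	return &SlashFilter{seen: make(map[key]string)}
}

// Check records cid for (miner, epoch). It returns an error if a
// different block was already recorded for the same miner and epoch.
// Seeing the same CID again is treated as harmless (an assumption).
func (f *SlashFilter) Check(miner string, epoch int64, cid string) error {
	k := key{miner, epoch}
	if other, ok := f.seen[k]; ok {
		if other == cid {
			return nil
		}
		return fmt.Errorf("double-fork mining fault; miner: %s; bh: %s, other: %s",
			miner, cid, other)
	}
	f.seen[k] = cid
	return nil
}

func main() {
	f := NewSlashFilter()
	fmt.Println(f.Check("t01000", 100, "cidA")) // first block at epoch 100
	fmt.Println(f.Check("t01000", 100, "cidA")) // same block again
	fmt.Println(f.Check("t01000", 100, "cidB")) // different block, same epoch
}
```

A real filter would need to persist its records across restarts, which is presumably why a restart can trip over state written before the shutdown.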

damonhung commented 3 years ago

I just encountered this, I think. My miner went on a three-day binge of randomness, fetching it ~100 times per second throughout the whole period. The log shows the miner continued to operate... sort of. Lots of errors, though. Partial log attached. Please let me know if you would like complete logs. I will keep them for a while but not forever.

dhh-4096-log1.txt

As reported, restarting the daemon didn't help. Pulling a new snapshot did work.

Edit: I also kept my old chain directory, if that would help.
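
For anyone trying to confirm the same symptom from their own logs, here is a rough sketch that counts `start fetching randomness` lines per second. It assumes the log format shown in the first comment (timestamp as the first whitespace-separated field, with fractional seconds after a dot). The function name `CountFetchesPerSecond` is made up for this example.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"sort"
	"strings"
)

// CountFetchesPerSecond counts "start fetching randomness" log lines,
// grouped by the timestamp truncated to whole seconds.
func CountFetchesPerSecond(lines []string) map[string]int {
	counts := make(map[string]int)
	for _, line := range lines {
		if !strings.Contains(line, "start fetching randomness") {
			continue
		}
		fields := strings.Fields(line)
		if len(fields) == 0 {
			continue
		}
		ts := fields[0]
		// "2020-09-28T19:23:23.139-0400" -> "2020-09-28T19:23:23"
		if i := strings.IndexByte(ts, '.'); i >= 0 {
			ts = ts[:i]
		}
		counts[ts]++
	}
	return counts
}

func main() {
	// Read the daemon log from standard input.
	var lines []string
	sc := bufio.NewScanner(os.Stdin)
	for sc.Scan() {
		lines = append(lines, sc.Text())
	}

	counts := CountFetchesPerSecond(lines)
	secs := make([]string, 0, len(counts))
	for s := range counts {
		secs = append(secs, s)
	}
	sort.Strings(secs) // print in chronological order
	for _, s := range secs {
		fmt.Printf("%s %d\n", s, counts[s])
	}
}
```

Piping a daemon log into the program on standard input prints one line per second with the number of fetches started in that second.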