spacemeshos / bug-reports

Issue tracking for community-generated bug reports
Creative Commons Zero v1.0 Universal

Constantly unsyncing, incorrect verified layer, very large logs #5

Open yaelmhoffman opened 3 years ago

yaelmhoffman commented 3 years ago

A user installed and ran the app. The node synced successfully, but it reports an incorrect verified layer (2013, when the current layer is 6000+) and periodically falls out of sync. The logs are also excessively large.

Logs: https://drive.google.com/file/d/1mj6xGhN5wD58T1xjq8hnxFWUCuxjOecG/view?usp=sharing

lrettig commented 3 years ago

It appears that this node really struggled to sync. It had only a small number of peer connections and, for reasons that are not yet clear, was unable to get blocks for thousands of layers: every time it requested blocks from its peers, they returned nil. Eventually the user restarted the node, which caused it to find better peers, and it began syncing again, but the same failure later recurred. The logs are so large mainly because the sync failures produced a huge number of warnings about missing data.

The node may have gotten very unlucky with its peers, or it could (hypothetically, though this is unlikely) have been eclipse-attacked, or some other networking issue could have left it unable to exchange data with peers. We need to improve our p2p stack so that bad peers get dropped and replaced, which should make this sort of thing much less likely. (Related: https://github.com/spacemeshos/go-spacemesh/issues/2385)
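To illustrate what "bad peers get dropped and replaced" could look like, here is a minimal sketch in Go. It is not the go-spacemesh implementation: `peerTracker`, `newPeerTracker`, `record`, and the `maxEmpty` threshold are all hypothetical names chosen for this example. The idea is simply to count consecutive empty (nil) responses per peer and flag the peer for replacement once the streak reaches a limit.

```go
package main

// peerTracker is a hypothetical helper, not part of go-spacemesh.
// It counts consecutive empty responses from each peer.
type peerTracker struct {
	maxEmpty int            // empty responses in a row tolerated before dropping
	empty    map[string]int // peer ID -> current streak of empty responses
}

// newPeerTracker returns a tracker that flags a peer after maxEmpty
// consecutive empty responses.
func newPeerTracker(maxEmpty int) *peerTracker {
	return &peerTracker{maxEmpty: maxEmpty, empty: make(map[string]int)}
}

// record notes one response from peer and reports whether the caller
// should drop that peer and look for a replacement.
func (t *peerTracker) record(peer string, gotData bool) bool {
	if gotData {
		// Any useful response resets the streak.
		delete(t.empty, peer)
		return false
	}
	t.empty[peer]++
	if t.empty[peer] >= t.maxEmpty {
		// Forget the peer so that a later reconnect starts from zero.
		delete(t.empty, peer)
		return true
	}
	return false
}
```

A sync loop could call `record(peerID, blocks != nil)` after each block request and disconnect whenever it returns true. A real p2p stack would probably also weigh latency, invalid data, and peer diversity, none of which this sketch covers.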

lrettig commented 3 years ago

Later logs from the same node: https://drive.google.com/file/d/1Ku0OI-IaYzx632_zD0utwJg7z-psNVYz/view?usp=sharing

But there is still a ~24-hour gap between the two sets of logs, and that gap is when things went wrong.