orbs-network / orbs-network-go

Orbs node virtual chain core reference implementation in Go
MIT License

Block storage / Block sync fragile on persistence writes #1252

Open jlevison opened 5 years ago

jlevison commented 5 years ago

Describe the bug

We've seen an error such as "failed to flush blocks to disk: sync /usr/local/var/orbs/blocks: input/output error" during sync, coming from services/blockstorage/internodesync/state_processing_blocks.go:78 (the log 'message' will be "failed to commit block received via sync").

While spec-wise and flow-wise it's valid to break the sync process on such an error, this is probably too fragile. It is part of the deadlock issue we had (see Slack).

Steps To Reproduce

Steps to reproduce the behavior: fail a persistence write (the flush to disk) while block sync is in progress.

Expected behavior

Perhaps a one-off retry, possibly depending on the specific error. First we should understand what kinds of errors the flush can return, to decide whether any effort should be invested here to begin with.

ronnno commented 4 years ago

@jlevison if this happens on our audit-mode node in production, have you seen any indication of the reason for the failure in the system monitors? Does it saturate IOPS capacity? What is the resource utilization of the node at that time?

How often does this happen? Does it happen during random syncs or during first-time massive syncs? Etc.

netoneko commented 4 years ago

@gadcl do you think this issue was addressed?