filecoin-project / filecoin-chain-archiver

Filecoin snapshot / chain export software

Export doesn't fail if node goes away #6

Closed travisperson closed 2 years ago

travisperson commented 2 years ago

Run an export and kill the nodes when the first update occurs. The export does not fail; it keeps logging updates with no progress:

2022-04-27T00:35:00.712Z        INFO    filsnap/cmds    cmds/create.go:199      snapshot        {"snapshot_height": "233660", "current_height": "233664", "confidence_height": "233665", "run_at": "2022-04-27T00:35:30.000Z"}
2022-04-27T00:36:30.010Z        INFO    filsnap/cmds    cmds/create.go:273      update  {"total": 0, "speed": 0}
2022-04-27T00:37:30.010Z        INFO    filsnap/cmds    cmds/create.go:273      update  {"total": 0, "speed": 0}
2022-04-27T00:38:30.011Z        INFO    filsnap/cmds    cmds/create.go:273      update  {"total": 0, "speed": 0}
2022-04-27T00:39:30.012Z        INFO    filsnap/cmds    cmds/create.go:273      update  {"total": 0, "speed": 0}
2022-04-27T00:40:30.013Z        INFO    filsnap/cmds    cmds/create.go:273      update  {"total": 0, "speed": 0}
2022-04-27T00:40:31.218Z        ERROR   filsnap/cmds    cmds/create.go:238      error   {"err": "context deadline exceeded"}
2022-04-27T00:41:00.218Z        ERROR   rpc     go-jsonrpc@v0.1.5/websocket.go:667      Connection timeout      {"remote": "10.96.52.94:1234"}
2022-04-27T00:41:30.014Z        INFO    filsnap/cmds    cmds/create.go:273      update  {"total": 0, "speed": 0}
2022-04-27T00:41:30.218Z        ERROR   rpc     go-jsonrpc@v0.1.5/websocket.go:667      Connection timeout      {"remote": "10.96.52.94:1234"}
2022-04-27T00:42:00.219Z        ERROR   rpc     go-jsonrpc@v0.1.5/websocket.go:667      Connection timeout      {"remote": "10.96.52.94:1234"}
2022-04-27T00:42:30.014Z        INFO    filsnap/cmds    cmds/create.go:273      update  {"total": 0, "speed": 0}
2022-04-27T00:42:30.220Z        ERROR   rpc     go-jsonrpc@v0.1.5/websocket.go:667      Connection timeout      {"remote": "10.96.52.94:1234"}
2022-04-27T00:43:00.221Z        ERROR   rpc     go-jsonrpc@v0.1.5/websocket.go:667      Connection timeout      {"remote": "10.96.52.94:1234"}
travisperson commented 2 years ago

The {"err": "context deadline exceeded"} error comes from the calls to waitAPIDown and waitAPI, which have a 5-minute timeout (hardcoded at the moment).

https://github.com/travisperson/filsnap/blob/a66499babf31e509d4444ca3e3066055d53c3728/pkg/export/export.go#L131-L148

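The hang itself is in the MinIO PutObject call. Export never closes the output, because the defer that closes it is registered only after the calls to waitAPIDown and waitAPI. When either call returns early with an error, the defer has not been set up yet, so the close is skipped and PutObject is left waiting on an output that never ends.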