ZcashFoundation / zebra

Zcash - Financial Privacy in Rust 🦓
https://zfnd.org/zebra/
Apache License 2.0
410 stars 100 forks source link

`zebrad` syncer hangs when there are no peers, despite network timeouts #1633

Closed teor2345 closed 3 years ago

teor2345 commented 3 years ago

Version

zebrad a merge of PRs #1620 and #1622, around 2021-01-22.

Platform

Linux ... 5.4.89 #1-NixOS SMP Tue Jan 12 19:16:25 UTC 2021 x86_64 GNU/Linux

Description

The zebrad syncer hangs when there are no peers, despite the timeouts on tip_network, sync::downloads.network, and sync::downloads.verifier.

I tried this:

Running zebrad on mainnet and testnet.

I expected to see this happen:

zebrad continues trying to sync, even if it has no peers.

Instead, this happened:

The ChainSync component stopped logging output, until a peer connected to the zebrad Zcash listener port, but then it went back to hanging after a few minutes.

Analysis

This bug is probably caused by using ready_and or poll_ready on the peer set, which doesn't become ready until peers are ready. I'm not sure if this is actually a bug - it might be correct behaviour.

This bug seems to be a more specific case of the network hangs in #1435. It could be caused by buffer misuse in #1593

Commands

zebrad start

Logs

Jan 25 12:53:22.299  WARN crawl_and_dial: zebra_network::peer_set::set: network request with no peer connections. Hint: check your network connection
Jan 25 12:54:37.299  WARN crawl_and_dial: zebra_network::peer_set::set: network request with no peer connections. Hint: check your network connection
Jan 25 12:55:52.299  WARN crawl_and_dial: zebra_network::peer_set::set: network request with no peer connections. Hint: check your network connection
Jan 25 12:56:40.267  INFO {zebrad="a49f4943" net="Main"}:peer{addr=127.0.0.1:40296}:msg_as_req{msg=getheaders}:state: zebra_state::service: responding to peer GetBlocks or GetHeaders final_height=Height(1119240) response_len=158 chain_tip_height=Height(1122481)
stop_height=None intersection_height=Some(Height(1119082))
Jan 25 12:56:46.288  INFO {zebrad="a49f4943" net="Main"}:sync: zebrad::components::sync: exhausted prospective tip set
Jan 25 12:56:46.288  INFO {zebrad="a49f4943" net="Main"}:sync: zebrad::components::sync: waiting to restart sync timeout=61s
Jan 25 12:57:47.289  INFO {zebrad="a49f4943" net="Main"}:sync: zebrad::components::sync: starting sync, obtaining new tips
Jan 25 12:57:47.289  INFO {zebrad="a49f4943" net="Main"}:sync:obtain_tips:state: zebra_state::util: created block locator tip_height=Height(1122481) min_locator_height=1122382 locators=[Height(1122481), Height(1122480), Height(1122479), Height(1122477), Height(1122473), Height(1122465), Height(1122449), Height(1122417), Height(1122382)]
Jan 25 12:57:47.290  INFO {zebrad="a49f4943" net="Main"}:sync:obtain_tips: zebra_network::peer_set::set: network request with no ready peers: finding more peers, waiting for 1 peers to answer requests
Jan 25 12:57:53.291  INFO {zebrad="a49f4943" net="Main"}:sync: zebrad::components::sync: exhausted prospective tip set
Jan 25 12:57:53.291  INFO {zebrad="a49f4943" net="Main"}:sync: zebrad::components::sync: waiting to restart sync timeout=61s
Jan 25 12:58:54.292  INFO {zebrad="a49f4943" net="Main"}:sync: zebrad::components::sync: starting sync, obtaining new tips
Jan 25 12:58:54.292  INFO {zebrad="a49f4943" net="Main"}:sync:obtain_tips:state: zebra_state::util: created block locator tip_height=Height(1122481) min_locator_height=1122382 locators=[Height(1122481), Height(1122480), Height(1122479), Height(1122477), Height(1122473), Height(1122465), Height(1122449), Height(1122417), Height(1122382)]
Jan 25 12:58:54.292  INFO {zebrad="a49f4943" net="Main"}:sync:obtain_tips: zebra_network::peer_set::set: network request with no ready peers: finding more peers, waiting for 1 peers to answer
 requests
Jan 25 12:59:00.294  INFO {zebrad="a49f4943" net="Main"}:sync: zebrad::components::sync: exhausted prospective tip set
Jan 25 12:59:00.294  INFO {zebrad="a49f4943" net="Main"}:sync: zebrad::components::sync: waiting to restart sync timeout=61s
Jan 25 13:00:01.295  INFO {zebrad="a49f4943" net="Main"}:sync: zebrad::components::sync: starting sync, obtaining new tips
Jan 25 13:00:01.295  INFO {zebrad="a49f4943" net="Main"}:sync:obtain_tips:state: zebra_state::util: created block locator tip_height=Height(1122481) min_locator_height=1122382 locators=[Heigh
t(1122481), Height(1122480), Height(1122479), Height(1122477), Height(1122473), Height(1122465), Height(1122449), Height(1122417), Height(1122382)]
Jan 25 13:00:01.295  INFO {zebrad="a49f4943" net="Main"}:sync:obtain_tips: zebra_network::peer_set::set: network request with no ready peers: finding more peers, waiting for 1 peers to answer
 requests
Jan 25 13:00:07.297  INFO {zebrad="a49f4943" net="Main"}:sync: zebrad::components::sync: exhausted prospective tip set
Jan 25 13:00:07.297  INFO {zebrad="a49f4943" net="Main"}:sync: zebrad::components::sync: waiting to restart sync timeout=61s
Jan 25 13:01:08.297  INFO {zebrad="a49f4943" net="Main"}:sync: zebrad::components::sync: starting sync, obtaining new tips
Jan 25 13:00:07.297  INFO {zebrad="a49f4943" net="Main"}:sync: zebrad::components::sync: waiting to restart sync timeout=61s
Jan 25 13:01:08.297  INFO {zebrad="a49f4943" net="Main"}:sync: zebrad::components::sync: starting sync, obtaining new tips
Jan 25 13:01:08.298  INFO {zebrad="a49f4943" net="Main"}:sync:obtain_tips:state: zebra_state::util: created block locator tip_height=Height(1122481) min_locator_height=1122382 locators=[Heigh
t(1122481), Height(1122480), Height(1122479), Height(1122477), Height(1122473), Height(1122465), Height(1122449), Height(1122417), Height(1122382)]
Jan 25 13:02:07.299  WARN crawl_and_dial: zebra_network::peer_set::set: network request with no peer connections. Hint: check your network connection
Jan 25 13:03:22.299  WARN crawl_and_dial: zebra_network::peer_set::set: network request with no peer connections. Hint: check your network connection
Jan 25 13:04:37.299  WARN crawl_and_dial: zebra_network::peer_set::set: network request with no peer connections. Hint: check your network connection
Jan 25 13:05:52.299  WARN crawl_and_dial: zebra_network::peer_set::set: network request with no peer connections. Hint: check your network connection
Jan 25 13:07:07.299  WARN crawl_and_dial: zebra_network::peer_set::set: network request with no peer connections. Hint: check your network connection
Jan 25 13:08:22.299  WARN crawl_and_dial: zebra_network::peer_set::set: network request with no peer connections. Hint: check your network connection
Jan 25 13:09:37.299  WARN crawl_and_dial: zebra_network::peer_set::set: network request with no peer connections. Hint: check your network connection
teor2345 commented 3 years ago

This bug is caused by #1435, it will probably be fixed by #1709.

But we should add some logs so future hangs are easier to diagnose - see #1714.