near / nearcore

Reference client for NEAR Protocol
https://near.org
GNU General Public License v3.0
2.34k stars 631 forks source link

State sync error message is confusing #3152

Open frol opened 4 years ago

frol commented 4 years ago

Describe the bug

Running a node from master I see these messages after the headers sync:

 Aug 12 16:22:20.886 INFO stats: #10885359 Downloading headers 98% -/4 30/28/40 peers ⬇ 566.3kiB/s ⬆ 25.2kiB/s 0.00 bps 0 gas/s CPU: 51%, Mem: 4.3 GiB
 Aug 12 16:22:31.397 INFO stats: #10885359 Downloading headers 99% -/4 30/28/40 peers ⬇ 574.6kiB/s ⬆ 19.9kiB/s 0.00 bps 0 gas/s CPU: 75%, Mem: 4.3 GiB
 Aug 12 16:22:41.865 INFO stats: #10885359 Downloading headers 99% -/4 30/28/40 peers ⬇ 592.1kiB/s ⬆ 22.6kiB/s 0.00 bps 0 gas/s CPU: 76%, Mem: 4.3 GiB
 Aug 12 16:22:52.118 INFO stats: #10885359 Downloading headers 99% -/4 30/28/40 peers ⬇ 583.8kiB/s ⬆ 25.9kiB/s 0.00 bps 0 gas/s CPU: 60%, Mem: 4.3 GiB
 Aug 12 16:23:04.938 INFO stats: State <span class="error">[0: header]</span> -/4 30/28/40 peers ⬇ 625.5kiB/s ⬆ 26.3kiB/s 0.00 bps 0 gas/s CPU: 76%, Mem: 4.3 GiB
 Aug 12 16:23:15.175 WARN sync: State sync didn't download the state for shard 0 in 10 seconds, sending StateRequest again
 Aug 12 16:23:15.276 INFO stats: State <span class="error">[0: parts]</span> -/4 30/28/40 peers ⬇ 3.7MiB/s ⬆ 25.3kiB/s 0.00 bps 0 gas/s CPU: 177%, Mem: 4.5 GiB
 Aug 12 16:23:25.439 WARN sync: State sync didn't download the state for shard 0 in 10 seconds, sending StateRequest again
 Aug 12 16:23:26.027 INFO stats: State <span class="error">[0: parts]</span> -/4 30/27/40 peers ⬇ 8.5MiB/s ⬆ 23.6kiB/s 0.00 bps 0 gas/s CPU: 169%, Mem: 4.5 GiB
 Aug 12 16:23:35.439 WARN sync: State sync didn't download the state for shard 0 in 10 seconds, sending StateRequest again
 Aug 12 16:23:36.030 INFO stats: State <span class="error">[0: parts]</span> -/4 30/27/40 peers ⬇ 9.4MiB/s ⬆ 21.5kiB/s 0.00 bps 0 gas/s CPU: 30%, Mem: 4.5 GiB
 Aug 12 16:23:58.577 INFO stats: State <span class="error">[0: done]</span> -/4 30/27/40 peers ⬇ 9.4MiB/s ⬆ 21.4kiB/s 0.00 bps 0 gas/s CPU: 100%, Mem: 4.8 GiB
 Aug 12 16:23:58.678 ERROR sync: State sync received hash 9YCa51LtyuYVL7rqP3KgaSAGqtywTpmLHm6eEP7LavGE that we're not expecting, potential malicious peer
 Aug 12 16:23:58.687 ERROR sync: State sync received hash 9YCa51LtyuYVL7rqP3KgaSAGqtywTpmLHm6eEP7LavGE that we're not expecting, potential malicious peer
 Aug 12 16:23:58.689 ERROR sync: State sync received hash 9YCa51LtyuYVL7rqP3KgaSAGqtywTpmLHm6eEP7LavGE that we're not expecting, potential malicious peer

It proceeded with block sync, so I assume it is not critical, yet it might be an indicator of a potential issue.

To Reproduce

I am not sure how reproducible this behavior, but there was no data folder and I just let it sync up.

Expected behavior

I think it might be an indicator of some bug even though it did not crash.

Screenshots

![image](https://user-images.githubusercontent.com/304265/90041146-42fcdc80-dc9f-11ea-818b-2171b91f5b70.png)

Version (please complete the following information):

bowenwang1996 commented 4 years ago

I am not sure what is the issue here. Are you referring to something in the logs? I agree that we should probably suppress the warning as it is unlikely to finish downloading state in 10 seconds.

frol commented 4 years ago

The node was running fine and I expect it not to produce ERROR message unless those need to be addressed. I still think those messages are valiable, so we may just transform those to INFO level. I am not sure about the "ERROR sync: State sync received hash 9YCa51LtyuYVL7rqP3KgaSAGqtywTpmLHm6eEP7LavGE that we're not expecting, potential malicious peer" error as it appears to me as some sort of race condition given that is just disapeared a minute later.

bowenwang1996 commented 4 years ago

That happens when a peer is behind the head of the network or the head is close to the epoch boundary. I can change the log level.

frol commented 4 years ago

That happens when a peer is behind the head of the network or the head is close to the epoch boundary. I can change the log level.

I think those should be DEBUG level logs if you can actually test those conditions you listed, and if those conditions are not met that needs to be reported as WARNING.

Let's reserve ERRORs to the items that require some action.

I think it is a low priority.

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity in the last 2 months. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity in the last 2 months. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity in the last 2 months. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

singhparshant commented 1 year ago

Hi, I am running a near localnet on 2 machines and getting the same error on the 2nd node. Is this known issue or it takes sometime to sync ?

2023-01-06T16:25:30.043449Z WARN sync: State sync didn't download the state for shard 0 in 60 seconds, sending StateRequest again

akhi3030 commented 1 year ago

CC: @nikurt: could you take a look please?