input-output-hk / jormungandr

privacy voting blockchain node
https://input-output-hk.github.io/jormungandr/
Apache License 2.0
366 stars 132 forks source link

Lack of sync in relay scenario #2719

Open dkijania opened 3 years ago

dkijania commented 3 years ago

Describe the bug In relay_soak scenario there is frequent error in logs which is causing lack of sync between relay and core nodes:

Nov 13 17:39:49.536 INFO PushHeaders request failed, error: Error { code: FailedPrecondition, source: Status { code: FailedPrecondition, message: "system state does not permit the operation (the parent block a870ff8a67340fae1740d78c1386deb317ce42957b7a31e96adf380a4bbd09a1 of the first received block header is not found in storage)", metadata: MetadataMap { headers: {"content-type": "application/grpc", "date": "Fri, 13 Nov 2020 16:39:49 GMT"} } } }, peer_address: /ip4/127.0.0.1/tcp/8019, peer: 127.0.0.1:8019, task: network
Nov 13 17:39:49.552 INFO PushHeaders request failed, error: Error { code: FailedPrecondition, source: Status { code: FailedPrecondition, message: "system state does not permit the operation (the parent block a870ff8a67340fae1740d78c1386deb317ce42957b7a31e96adf380a4bbd09a1 of the first received block header is not found in storage)", metadata: MetadataMap { headers: {"content-type": "application/grpc", "date": "Fri, 13 Nov 2020 16:39:49 GMT"} } } }, peer_address: /ip4/127.0.0.1/tcp/8011, peer: 127.0.0.1:8011, task: network
Nov 13 17:39:49.583 INFO PushHeaders request failed, error: Error { code: FailedPrecondition, source: Status { code: FailedPrecondition, message: "system state does not permit the operation (the parent block a870ff8a67340fae1740d78c1386deb317ce42957b7a31e96adf380a4bbd09a1 of the first received block header is not found in storage)", metadata: MetadataMap { headers: {"content-type": "application/grpc", "date": "Fri, 13 Nov 2020 16:39:49 GMT"} } } }, peer_address: /ip4/127.0.0.1/tcp/8009, peer: 127.0.0.1:8009, task: network
Nov 13 17:39:49.628 INFO PushHeaders request failed, error: Error { code: FailedPrecondition, source: Status { code: FailedPrecondition, message: "system state does not permit the operation (the parent block 2a56fa21c194aad43ae6df35ff06d602b514df3c15be60ca5b6e26ccb7aaf1da of the first received block header is not found in storage)", metadata: MetadataMap { headers: {"content-type": "application/grpc", "date": "Fri, 13 Nov 2020 16:39:49 GMT"} } } }, peer_address: /ip4/127.0.0.1/tcp/8001, peer: 127.0.0.1:8001, task: network
Nov 13 17:39:49.643 INFO PushHeaders request failed, error: Error { code: FailedPrecondition, source: Status { code: FailedPrecondition, message: "system state does not permit the operation (the parent block 9325ebd5b589b93b7cff266cc9e6db52655b6500dd80cb7b38e904c091854250 of the first received block header is not found in storage)", metadata: MetadataMap { headers: {"content-type": "application/grpc", "date": "Fri, 13 Nov 2020 16:39:49 GMT"} } } }, peer_address: /ip4/127.0.0.1/tcp/8009, peer: 127.0.0.1:8009, task: network
Nov 13 17:39:49.674 INFO PushHeaders request failed, error: Error { code: FailedPrecondition, source: Status { code: FailedPrecondition, message: "system state does not permit the operation (the parent block 9325ebd5b589b93b7cff266cc9e6db52655b6500dd80cb7b38e904c091854250 of the first received block header is not found in storage)", metadata: MetadataMap { headers: {"content-type": "application/grpc", "date": "Fri, 13 Nov 2020 16:39:49 GMT"} } } }, peer_address: /ip4/127.0.0.1/tcp/8011, peer: 127.0.0.1:8011, task: network
Nov 13 17:39:50.049 INFO PushHeaders request failed, error: Error { code: FailedPrecondition, source: Status { code: FailedPrecondition, message: "system state does not permit the operation (the parent block 9325ebd5b589b93b7cff266cc9e6db52655b6500dd80cb7b38e904c091854250 of the first received block header is not found in storage)", metadata: MetadataMap { headers: {"content-type": "application/grpc", "date": "Fri, 13 Nov 2020 16:39:49 GMT"} } } }, peer_address: /ip4/127.0.0.1/tcp/8019, peer: 127.0.0.1:8019, task: network
Nov 13 17:39:50.049 INFO PushHeaders request failed, error: Error { code: FailedPrecondition, source: Status { code: FailedPrecondition, message: "system state does not permit the operation (the parent block a9d833b523168effe5be2e6db3f1eaf96a3c5141fe98b236c29b6bba6fc7fee3 of the first received block header is not found in storage)", metadata: MetadataMap { headers: {"content-type": "application/grpc", "date": "Fri, 13 Nov 2020 16:39:50 GMT"} } } }, peer_address: /ip4/127.0.0.1/tcp/8001, peer: 127.0.0.1:8001, task: network
Nov 13 17:39:50.051 INFO PushHeaders request failed, error: Error { code: FailedPrecondition, source: Status { code: FailedPrecondition, message: "system state does not permit the operation (the parent block 9325ebd5b589b93b7cff266cc9e6db52655b6500dd80cb7b38e904c091854250 of the first received block header is not found in storage)", metadata: MetadataMap { headers: {"content-type": "application/grpc", "date": "Fri, 13 Nov 2020 16:39:49 GMT"} } } }, peer_address: /ip4/127.0.0.1/tcp/8003, peer: 127.0.0.1:8003, task: network
Nov 13 17:39:50.064 INFO PushHeaders request failed, error: Error { code: FailedPrecondition, source: Status { code: FailedPrecondition, message: "system state does not permit the operation (the parent block 2a56fa21c194aad43ae6df35ff06d602b514df3c15be60ca5b6e26ccb7aaf1da of the first received block header is not found in storage)", metadata: MetadataMap { headers: {"content-type": "application/grpc", "date": "Fri, 13 Nov 2020 16:39:49 GMT"} } } }, peer_address: /ip4/127.0.0.1/tcp/8011, peer: 127.0.0.1:8011, task: network
Nov 13 17:39:50.124 INFO PushHeaders request failed, error: Error { code: FailedPrecondition, source: Status { code: FailedPrecondition, message: "system state does not permit the operation (the parent block 2a56fa21c194aad43ae6df35ff06d602b514df3c15be60ca5b6e26ccb7aaf1da of the first received block header is not found in storage)", metadata: MetadataMap { headers: {"content-type": "application/grpc", "date": "Fri, 13 Nov 2020 16:39:49 GMT"} } } }, peer_address: /ip4/127.0.0.1/tcp/8009, peer: 127.0.0.1:8009, task: network
Nov 13 17:39:50.605 INFO PushHeaders request failed, error: Error { code: FailedPrecondition, source: Status { code: FailedPrecondition, message: "system state does not permit the operation (the parent block fc50df2b4d1dff2a2c55aa5165a24b2335f5894e5b9c40fe1850470d4e5b358f of the first received block header is not found in storage)", metadata: MetadataMap { headers: {"content-type": "application/grpc", "date": "Fri, 13 Nov 2020 16:39:50 GMT"} } } }, peer_address: /ip4/127.0.0.1/tcp/8003, peer: 127.0.0.1:8003, task: network
Nov 13 17:39:50.605 INFO PushHeaders request failed, error: Error { code: FailedPrecondition, source: Status { code: FailedPrecondition, message: "system state does not permit the operation (the parent block fc50df2b4d1dff2a2c55aa5165a24b2335f5894e5b9c40fe1850470d4e5b358f of the first received block header is not found in storage)", metadata: MetadataMap { headers: {"content-type": "application/grpc", "date": "Fri, 13 Nov 2020 16:39:50 GMT"} } } }, peer_address: /ip4/127.0.0.1/tcp/8007, peer: 127.0.0.1:8007, task: network
Nov 13 17:39:50.650 INFO PushHeaders request failed, error: Error { code: FailedPrecondition, source: Status { code: FailedPrecondition, message: "system state does not permit the operation (the parent block a9d833b523168effe5be2e6db3f1eaf96a3c5141fe98b236c29b6bba6fc7fee3 of the first received block header is not found in storage)", metadata: MetadataMap { headers: {"content-type": "application/grpc", "date": "Fri, 13 Nov 2020 16:39:50 GMT"} } } }, peer_address: /ip4/127.0.0.1/tcp/8019, peer: 127.0.0.1:8019, task: network
Nov 13 17:39:59.021 INFO received block announcement from network, from_node_id: /ip4/127.0.0.1/tcp/51537, date: 0.5, parent: e972095abc1760f60c797c9fa5d451cd38bafbc1d2ea398b426cecafd2f8e469, hash: 80193068afeef189a77992b52ee33b95acda4a12b97287c30d05164ef06dd87e, task: block
Nov 13 17:39:59.024 INFO receiving block stream from network, task: block
Nov 13 17:39:59.048 INFO switching branch from 99b6f6e1-00000004-0.4 to 80193068-00000005-0.5, task: block 

Which is causing lack of sync before nodes, issue is observed in Relay1 and Relay2 nodes.

Topology looks like below:

image

Mandatory Information

  1. jcli --full-version newest master;
  2. jormungandr --full-version newest master;

To Reproduce Steps to reproduce the behavior:

  1. run scenario relay_soak

Expected behavior A clear and concise description of what you expected to happen.

Additional context Full logs: Relay-Soak.zip

eugene-babichenko commented 3 years ago

What is happening there

The lack of sync might be due to a mistake in the header request/push sequence design. This mistake manifests itself as a bunch of FailedPrecondition errors in PushHeaders requests.

The idea of what is happening is the following sequence:

  1. Alice sends Bob a request for missing headers with the following parameters: to: BlockA, checkpoints: [BlockB, BlockC].
  2. Bob doesn't have neither BlockB nor BlockC or they are not valid ancestors of BlockA.
  3. Instead of ignoring this request Bob just sends BlockA and nothing more.
  4. This triggers FailedPrecondition on Alice's side because she doesn't have the parent for BlockA and it was not included into the response.

This situation is the result of not requesting missing parents and having a set of checkpoints that is not valid for such request.