archethic-foundation / archethic-node

Official Archethic Blockchain node, written in Elixir
GNU Affero General Public License v3.0

Node deadlock when rolling self repair fails #1531

Open bchamagne opened 2 weeks ago

bchamagne commented 2 weeks ago

Describe the problem you discovered

    2024-06-11 10:57:50.658     
2024-06-11 08:57:50.655 [error] Child Archethic.Bootstrap of Supervisor Archethic.Supervisor terminated

2024-06-11 10:57:50.658 instance=frankfurt_testnet_node 
** (exit) an exception was raised:

2024-06-11 10:57:50.658 instance=frankfurt_testnet_node 
    ** (Archethic.SelfRepair.Error) Self repair encounter an error in function verify_attestation with error Threshold error in self repair on attestation %Archethic.BeaconChain.ReplicationAttestation{transaction_summary: %Archethic.TransactionChain.TransactionSummary{timestamp: ~U[2024-06-11 08:57:11.246Z], address: <<0, 0, 203, 166, 103, 122, 115, 133, 110, 206, 231, 96, 36, 122, 9, 228, 216, 84, 147, 16, 31, 90, 22, 207, 221, 116, 183, 171, 153, 71, 36, 47, 192, 254>>, type: :contract, fee: 0, validation_stamp_checksum: <<163, 80, 169, 250, 38, 111, 148, 141, 56, 242, 214, 19, 181, 219, 4, 26, 185, 151, 155, 124, 100, 134, 107, 151, 23, 125, 139, 164, 162, 118, 126, 39>>, genesis_address: <<0, 0, 174, 85, 162, 53, 126, 35, 1, 20, 119, 72, 251, 21, 145, 183, 167, 165, 195, 16, 183, 56, 173, 92, 248, 165, 11, 186, 242, 173, 211, 71, 17, 190>>, movements_addresses: [], version: 3}, confirmations: [{9, <<152, 137, 8, 189, 73, 248, 144, 194, 248, 139, 97, 140, 171, 147, 203, 188, 89, 158, 130, 164, 144, 232, 63, 248, 13, 169, 229, 197, 194, 73, 60, 49, 20, 3, 49, 222, 146, 223, 201, 110, 44, 99, 174, 183, 55, ...>>}, {4, <<117, 174, 46, 160, 152, 85, 151, 48, 232, 173, 108, 36, 136, 128, 223, 153, 133, 170, 45, 232, 36, 83, 189, 220, 165, 163, 127, 192, 14, 125, 45, 53, 109, 117, 57, 238, 226, 26, 103, 18, 185, 65, 171, 133, ...>>}, {3, <<28, 160, 34, 217, 114, 246, 28, 239, 10, 134, 246, 39, 51, 166, 223, 88, 231, 168, 128, 211, 48, 147, 3, 73, 184, 227, 5, 178, 215, 120, 185, 252, 201, 63, 209, 88, 94, 140, 185, 250, 196, 83, 225, ...>>}], version: 2}

2024-06-11 10:57:50.658 instance=frankfurt_testnet_node 
        (archethic 1.5.3) lib/archethic/self_repair/sync/transaction_handler.ex:302: Archethic.SelfRepair.Sync.TransactionHandler.verify_attestation/1

2024-06-11 10:57:50.658 instance=frankfurt_testnet_node 
        (archethic 1.5.3) lib/archethic/self_repair/sync/transaction_handler.ex:181: Archethic.SelfRepair.Sync.TransactionHandler.process_transaction_data/5

2024-06-11 10:57:50.658 instance=frankfurt_testnet_node 
        (archethic 1.5.3) lib/archethic/self_repair/sync.ex:490: anonymous fn/3 in Archethic.SelfRepair.Sync.synchronize_transactions/2

2024-06-11 10:57:50.658 instance=frankfurt_testnet_node 
        (elixir 1.14.1) lib/stream.ex:481: anonymous fn/4 in Stream.each/2

2024-06-11 10:57:50.658 instance=frankfurt_testnet_node 
        (elixir 1.14.1) lib/task/supervised.ex:370: Task.Supervised.stream_deliver/7

2024-06-11 10:57:50.658 instance=frankfurt_testnet_node 
        (elixir 1.14.1) lib/stream.ex:1811: Enumerable.Stream.do_each/4

2024-06-11 10:57:50.658 instance=frankfurt_testnet_node 
        (elixir 1.14.1) lib/stream.ex:689: Stream.run/1

2024-06-11 10:57:50.658 instance=frankfurt_testnet_node 
        (archethic 1.5.3) lib/archethic/self_repair/sync.ex:364: anonymous fn/2 in Archethic.SelfRepair.Sync.process_replication_attestations/2

2024-06-11 10:57:50.658 instance=frankfurt_testnet_node 
Pid: #PID<0.27905.5759>

Describe the solution you'd like

Implement the same retry behaviour as the normal self repair, so that a failed rolling self repair is retried instead of leaving the node deadlocked.
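
As a rough illustration of the kind of retry being asked for, here is a minimal sketch. It does not reproduce the retry logic of the normal self repair, which is not shown in this issue. `RetrySketch`, `with_retry/3`, the attempt count and the doubling delay are all hypothetical names and values chosen for the example. The wrapper catches both raised exceptions and exits, since the log above reports the failure as `(exit) an exception was raised`.

```elixir
defmodule RetrySketch do
  @moduledoc """
  Hypothetical helper, NOT part of archethic-node.
  Retries a zero-arity function with a doubling delay between attempts.
  """

  require Logger

  # `max_attempts` and `delay_ms` are assumed defaults, not project settings.
  def with_retry(fun, max_attempts \\ 3, delay_ms \\ 1_000)
      when is_function(fun, 0) and max_attempts > 0 do
    do_retry(fun, 1, max_attempts, delay_ms)
  end

  defp do_retry(fun, attempt, max_attempts, delay_ms) do
    # Turn raised exceptions and exits into tagged tuples so that
    # the caller (for example a supervised process) is not terminated.
    result =
      try do
        {:ok, fun.()}
      rescue
        error -> {:error, error}
      catch
        :exit, reason -> {:error, {:exit, reason}}
      end

    case result do
      {:ok, _value} = ok ->
        ok

      {:error, reason} when attempt < max_attempts ->
        Logger.warning("attempt #{attempt}/#{max_attempts} failed: #{inspect(reason)}")
        Process.sleep(delay_ms)
        do_retry(fun, attempt + 1, max_attempts, delay_ms * 2)

      {:error, _reason} = error ->
        # Out of attempts: return the last failure to the caller.
        error
    end
  end
end

# Usage (run_rolling_self_repair/0 is a placeholder name):
#   RetrySketch.with_retry(fn -> run_rolling_self_repair() end)
```

Returning `{:error, reason}` after the last attempt leaves the decision to the caller, which can log the failure and carry on, or schedule a later run, rather than crash.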
