AleoNet / snarkOS

A Decentralized Operating System for ZK Applications
http://snarkos.org
Apache License 2.0

[Bug] Malicious validators can launch DDoS attacks by constructing and broadcasting fake solutions, severely harming network performance #3290

Open elderhammer opened 6 months ago

elderhammer commented 6 months ago

Steps to Reproduce

Use this branch to run a malicious validator: https://github.com/elderhammer/snarkOS/tree/fake_solution_transmission

Part one:

  1. The malicious validator uses latest_epoch_hash to construct fake solutions
  2. The malicious validator raises the WorkerPing broadcast frequency to once every 100ms
  3. The honest validator receives the WorkerPing and executes process_transmission_id_from_ping
    2024-07-03T12:59:12.404683Z TRACE snarkos_node_bft::gateway: [MemoryPool] Received 'WorkerPing' from '127.0.0.1:5004'
    2024-07-03T12:59:12.404979Z TRACE snarkos_node_bft::gateway: [MemoryPool] Sending 'TransmissionRequest' to '127.0.0.1:5004'
    2024-07-03T12:59:12.404993Z TRACE snarkos_node_bft::gateway: [MemoryPool] Sending 'TransmissionRequest' to '127.0.0.1:5004'
    2024-07-03T12:59:12.405000Z TRACE snarkos_node_bft::gateway: [MemoryPool] Sending 'TransmissionRequest' to '127.0.0.1:5004'
    2024-07-03T12:59:12.405006Z TRACE snarkos_node_bft::gateway: [MemoryPool] Sending 'TransmissionRequest' to '127.0.0.1:5004'
    2024-07-03T12:59:12.405011Z TRACE snarkos_node_bft::gateway: [MemoryPool] Sending 'TransmissionRequest' to '127.0.0.1:5004'
    2024-07-03T12:59:12.407856Z TRACE snarkos_node_bft::gateway: [MemoryPool] Received 'TransmissionResponse' from '127.0.0.1:5004'
    2024-07-03T12:59:12.407865Z TRACE snarkos_node_bft::gateway: [MemoryPool] Received 'TransmissionResponse' from '127.0.0.1:5004'
    2024-07-03T12:59:12.407904Z TRACE snarkos_node_bft::gateway: [MemoryPool] Received 'TransmissionResponse' from '127.0.0.1:5004'
    2024-07-03T12:59:12.407910Z TRACE snarkos_node_bft::gateway: [MemoryPool] Received 'TransmissionResponse' from '127.0.0.1:5004'
    2024-07-03T12:59:12.407915Z TRACE snarkos_node_bft::gateway: [MemoryPool] Received 'TransmissionResponse' from '127.0.0.1:5004'
  4. The honest validator receives the fake solutions from the malicious validator and adds them to worker.ready

The malicious validator continuously forges fake solutions and advertises them through WorkerPing, while honest validators continuously request those fake solutions and add them to worker.ready.
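
The flow in Part one can be modeled in a few lines. This is a toy sketch, not snarkOS code: `ToyWorker`, `on_worker_ping`, and the `ready` vector are hypothetical stand-ins for the worker's ping handling and for worker.ready. Five IDs per ping is an assumption taken from the five TransmissionRequest lines in the trace above. The only point is that nothing on this path checks the proof target.

```rust
use std::collections::HashSet;

/// Toy stand-in for a transmission ID announced in a WorkerPing (hypothetical type).
type SolutionId = u64;

/// Hypothetical, simplified worker. This is NOT the snarkOS worker type.
#[derive(Default)]
struct ToyWorker {
    /// IDs already seen, so that each one is requested at most once.
    seen: HashSet<SolutionId>,
    /// Solutions waiting to be proposed (stands in for worker.ready).
    ready: Vec<SolutionId>,
}

impl ToyWorker {
    /// Handles one ping: every unseen ID is fetched from the sender and queued.
    /// Returns how many IDs were requested. No validity check happens here.
    fn on_worker_ping(&mut self, announced: &[SolutionId]) -> usize {
        let mut requested = 0;
        for id in announced {
            if self.seen.insert(*id) {
                // In the reported flow this is a TransmissionRequest followed by
                // a TransmissionResponse; the toy model queues the ID directly.
                self.ready.push(*id);
                requested += 1;
            }
        }
        requested
    }
}

fn main() {
    let mut worker = ToyWorker::default();
    // One ping every 100 ms for one second, five fresh fake IDs per ping.
    for ping in 0..10u64 {
        let fake_ids: Vec<SolutionId> = (0..5).map(|i| ping * 5 + i).collect();
        worker.on_worker_ping(&fake_ids);
    }
    println!("fake solutions queued after 1 s: {}", worker.ready.len());
}
```

Because the attacker chooses both the ping rate and the IDs, the queue in this model grows as fast as the attacker can invent new IDs.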

Part two:

  1. The honest validator executes propose_batch and collects transmissions from worker.ready
  2. The honest validator spawns a blocking thread to check each solution: https://github.com/AleoNet/snarkOS/blob/878624d6ccab6dfeb52f69ce54ee885464cdf7d8/node/bft/ledger-service/src/ledger.rs#L284-L289
  3. Because worker.ready is constantly filled with fake solutions, honest validators struggle to collect enough legitimate transactions and are temporarily stuck in propose_batch (especially when the puzzle is extremely expensive to compute)
    2024-07-03T08:27:07.765021Z TRACE snarkos_node_bft::primary: Proposing - Skipping solution 'solution1vd8l6kw..' - Invalid solution 'solution1vd8l6kw..' for the current epoch - The proof target does not match the expected proof target
    2024-07-03T08:27:09.543961Z TRACE snarkos_node_bft::primary: Proposing - Skipping solution 'solution1x3qa74a..' - Invalid solution 'solution1x3qa74a..' for the current epoch - The proof target does not match the expected proof target
    2024-07-03T08:27:09.647375Z TRACE snarkos_node_bft::primary: Proposing - Skipping solution 'solution1gvcdnxq..' - Invalid solution 'solution1gvcdnxq..' for the current epoch - The proof target does not match the expected proof target
    2024-07-03T08:27:09.671640Z TRACE snarkos_node_bft::primary: Skipping batch proposal (node is already proposing)
    2024-07-03T08:27:11.503260Z TRACE snarkos_node_bft::primary: Proposing - Skipping solution 'solution1h8wq6va..' - Invalid solution 'solution1h8wq6va..' for the current epoch - The proof target does not match the expected proof target
    2024-07-03T08:27:12.172546Z TRACE snarkos_node_bft::primary: Skipping batch proposal (node is already proposing)
    2024-07-03T08:27:13.496543Z TRACE snarkos_node_bft::primary: Proposing - Skipping solution 'solution1tgtlmwc..' - Invalid solution 'solution1tgtlmwc..' for the current epoch - The proof target does not match the expected proof target
    2024-07-03T08:27:14.673676Z TRACE snarkos_node_bft::primary: Skipping batch proposal (node is already proposing)
    2024-07-03T08:27:15.433994Z TRACE snarkos_node_bft::primary: Proposing - Skipping solution 'solution1jtvd4xu..' - Invalid solution 'solution1jtvd4xu..' for the current epoch - The proof target does not match the expected proof target
    2024-07-03T08:27:17.162770Z TRACE snarkos_node_bft::primary: Proposing - Skipping solution 'solution1tathxv0..' - Invalid solution 'solution1tathxv0..' for the current epoch - The proof target does not match the expected proof target
    2024-07-03T08:27:17.175050Z TRACE snarkos_node_bft::primary: Skipping batch proposal (node is already proposing)
    2024-07-03T08:27:19.049933Z TRACE snarkos_node_bft::primary: Proposing - Skipping solution 'solution1qn0uaxq..' - Invalid solution 'solution1qn0uaxq..' for the current epoch - The proof target does not match the expected proof target
    2024-07-03T08:27:19.676232Z TRACE snarkos_node_bft::primary: Skipping batch proposal (node is already proposing)
    2024-07-03T08:27:20.835085Z TRACE snarkos_node_bft::primary: Proposing - Skipping solution 'solution1wyfeq80..' - Invalid solution 'solution1wyfeq80..' for the current epoch - The proof target does not match the expected proof target
    2024-07-03T08:27:22.177421Z TRACE snarkos_node_bft::primary: Skipping batch proposal (node is already proposing)
    2024-07-03T08:27:22.967898Z TRACE snarkos_node_bft::primary: Proposing - Skipping solution 'solution1mawk70w..' - Invalid solution 'solution1mawk70w..' for the current epoch - The proof target does not match the expected proof target
    2024-07-03T08:27:24.678461Z TRACE snarkos_node_bft::primary: Skipping batch proposal (node is already proposing)
    2024-07-03T08:27:25.074979Z TRACE snarkos_node_bft::primary: Proposing - Skipping solution 'solution1cwa4aww..' - Invalid solution 'solution1cwa4aww..' for the current epoch - The proof target does not match the expected proof target
    2024-07-03T08:27:27.049210Z TRACE snarkos_node_bft::primary: Proposing - Skipping solution 'solution192k8k48..' - Invalid solution 'solution192k8k48..' for the current epoch - The proof target does not match the expected proof target
    2024-07-03T08:27:27.180037Z TRACE snarkos_node_bft::primary: Skipping batch proposal (node is already proposing)
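
The cost of Part two can be estimated with a small model. This is a sketch under stated assumptions, not the snarkOS proposal logic: `ToySolution` and `collect_for_proposal` are hypothetical names, the `valid` flag stands in for the real puzzle check, and the two-second cost per check is an assumption read off the spacing of most "Skipping solution" lines above.

```rust
use std::time::Duration;

/// Toy transmission: `valid` stands in for the result of the real puzzle check.
struct ToySolution {
    valid: bool,
}

/// Walks the ready queue in order until `wanted` valid solutions are found.
/// Returns the number of valid solutions collected and the modeled time spent
/// verifying. Every candidate costs `cost_per_check`, whether fake or not.
fn collect_for_proposal(
    ready: &[ToySolution],
    wanted: usize,
    cost_per_check: Duration,
) -> (usize, Duration) {
    let mut collected = 0;
    let mut spent = Duration::ZERO;
    for solution in ready {
        if collected == wanted {
            break;
        }
        spent += cost_per_check;
        if solution.valid {
            collected += 1;
        }
    }
    (collected, spent)
}

fn main() {
    // Ten fake solutions queued ahead of a single valid one.
    let mut ready: Vec<ToySolution> = (0..10).map(|_| ToySolution { valid: false }).collect();
    ready.push(ToySolution { valid: true });

    let (collected, spent) = collect_for_proposal(&ready, 1, Duration::from_secs(2));
    println!("collected {collected} valid solution(s) after {spent:?} of checks");
}
```

In this model the time spent before the first valid solution grows linearly with the number of fake solutions queued ahead of it, which matches the repeated "node is already proposing" lines in the trace.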

All honest validators will be temporarily stuck in the execution of propose_batch, making it difficult to reach consensus. In the log below, the time between blocks ranges from roughly 2.5 to 15 minutes:

2024-07-03T08:02:50.167158Z  INFO snarkos_node_bft_ledger_service::ledger: 
Advanced to block 78717 at round 158132 - ab13kvke2uamsmteg8cev40jvlkwe5aq9npgkl5pg6pzjt6e3qlmyqqpjkmes

2024-07-03T08:06:15.183405Z  INFO snarkos_node_bft_ledger_service::ledger: 
Advanced to block 78718 at round 158134 - ab15yfynted80000xu7vxyu8t0sg0qg7dknxuxyrydtw0500lamgqys5qyzkz

2024-07-03T08:12:09.407371Z  INFO snarkos_node_bft_ledger_service::ledger: 
Advanced to block 78719 at round 158136 - ab1d7jjthkrgsld7wf7e8r7jcpdsszs2jj2pwg2u5pe2p8xfrsakgpq6fgjea

2024-07-03T08:14:39.888302Z  INFO snarkos_node_bft_ledger_service::ledger: 
Advanced to block 78720 at round 158138 - ab1mcdzdhhd57uxsyk0pehlvclx23smpejlf9j30f95elylhj67c5xs5yayxn

2024-07-03T08:29:27.740284Z  INFO snarkos_node_bft_ledger_service::ledger: 
Advanced to block 78721 at round 158146 - ab1696d5wk6ugrld6gcrw5nww8qck8jdlzygf0qlhjktqv34vhz7cysx8lgh2

2024-07-03T08:31:58.727823Z  INFO snarkos_node_bft_ledger_service::ledger: 
Advanced to block 78722 at round 158148 - ab18xf8sy9ae448r0gpwzmpszpsh0j3fky5yrk8kfy2twlfdjm5gvrq495zjt

2024-07-03T08:36:31.712025Z  INFO snarkos_node_bft_ledger_service::ledger: 
Advanced to block 78723 at round 158150 - ab1pm07f8rkl53dh50d3kmx7ra57yx95x0mfdug3zc4l9wrdzarvgrsd8kl3x

2024-07-03T08:39:43.626391Z  INFO snarkos_node_bft_ledger_service::ledger: 
Advanced to block 78724 at round 158152 - ab1tqtrazen409esgd2dujtvdqd3jytkzxqr8pt50ve2g5ctrsttcgsyqunv3

2024-07-03T08:43:25.927930Z  INFO snarkos_node_bft_ledger_service::ledger: 
Advanced to block 78725 at round 158154 - ab1y7d76wy2uxu3dp8h29udmqsfgwmfy0rjtn4dyara087pu4jm7gzsj29zkx
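
The block intervals in this log can be computed directly. The sketch below takes the nine "Advanced to block" timestamps, truncated to whole seconds, and prints the gaps between consecutive blocks; `to_secs` and `intervals` are helper names introduced here for illustration.

```rust
/// Converts "HH:MM:SS" into seconds since midnight; returns None on malformed input.
fn to_secs(hms: &str) -> Option<u64> {
    let mut parts = hms.split(':').map(|p| p.parse::<u64>().ok());
    let (h, m, s) = (parts.next()??, parts.next()??, parts.next()??);
    if parts.next().is_some() {
        return None; // more than three fields
    }
    Some(h * 3600 + m * 60 + s)
}

/// Returns the gaps in seconds between consecutive timestamps.
/// Returns None if a timestamp is malformed or the list is not in ascending order.
fn intervals(stamps: &[&str]) -> Option<Vec<u64>> {
    let secs: Vec<u64> = stamps.iter().map(|s| to_secs(s)).collect::<Option<_>>()?;
    secs.windows(2).map(|w| w[1].checked_sub(w[0])).collect()
}

fn main() {
    // Timestamps of blocks 78717 through 78725 from the log above.
    let stamps = [
        "08:02:50", "08:06:15", "08:12:09", "08:14:39", "08:29:27",
        "08:31:58", "08:36:31", "08:39:43", "08:43:25",
    ];
    let gaps = intervals(&stamps).expect("timestamps are well-formed and ordered");
    for (i, gap) in gaps.iter().enumerate() {
        println!("block {} -> {}: {} s", 78717 + i, 78718 + i, gap);
    }
}
```

The longest gap is 888 s, between blocks 78720 and 78721, where the round also jumps from 158138 to 158146 instead of advancing by two.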

Part three:

  1. Since worker.ready is always full of fake solutions, honest validators cannot process legitimate solutions from peers: https://github.com/AleoNet/snarkOS/blob/878624d6ccab6dfeb52f69ce54ee885464cdf7d8/node/consensus/src/lib.rs#L307-L314

Impact: This type of attack has two consequences:

  1. Because honest validators are frequently stuck in propose_batch for long periods, network performance is significantly reduced
  2. Because the honest validators' worker.ready is full of fake solutions, valid solutions in the network cannot be processed, which undermines the PoW part of the consensus and reduces provers' income

Your Environment

snarkOS Version: 878624d6ccab6dfeb52f69ce54ee885464cdf7d8

HarukaMa commented 5 months ago

Not sure if we are able to tell which solutions are "fake" without calculating their targets. As your POC shows we can just spam solutions with random counters. Hopefully the real puzzle will be shorter per run to make this relatively a non-issue.

elderhammer commented 5 months ago

> Not sure if we are able to tell which solutions are "fake" without calculating their targets. As your POC shows we can just spam solutions with random counters. Hopefully the real puzzle will be shorter per run to make this relatively a non-issue.

Yes, constructing fake solutions is easy, but verifying them is time-consuming.

HarukaMa commented 5 months ago

I was thinking about this one and forgot to comment. Apparently this is still a concern: all nodes try to validate every incoming unconfirmed solution, and we can't do that fast enough. We probably need some mechanism to block peers if too many invalid solutions are seen.
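
One way to picture the blocking mechanism suggested here is a per-peer counter of invalid solutions. This is a minimal sketch under assumptions, not an existing snarkOS component: `InvalidSolutionTracker`, `record_invalid`, `is_blocked`, and the threshold value are all hypothetical, and the sketch does not address whether a committee member can be disconnected without hurting liveness.

```rust
use std::collections::HashMap;
use std::net::SocketAddr;

/// Hypothetical per-peer counter of invalid solutions; not part of snarkOS.
struct InvalidSolutionTracker {
    /// Number of invalid solutions tolerated before a peer is blocked.
    max_invalid: u32,
    /// Invalid solutions seen so far, keyed by peer address.
    counts: HashMap<SocketAddr, u32>,
}

impl InvalidSolutionTracker {
    fn new(max_invalid: u32) -> Self {
        Self { max_invalid, counts: HashMap::new() }
    }

    /// Records one invalid solution from `peer`.
    /// Returns true once the peer has exceeded the tolerated count.
    fn record_invalid(&mut self, peer: SocketAddr) -> bool {
        let count = self.counts.entry(peer).or_insert(0);
        *count = count.saturating_add(1);
        *count > self.max_invalid
    }

    /// Reports whether `peer` has exceeded the tolerated count.
    fn is_blocked(&self, peer: &SocketAddr) -> bool {
        self.counts.get(peer).map_or(false, |c| *c > self.max_invalid)
    }
}

fn main() {
    let peer: SocketAddr = "127.0.0.1:5004".parse().expect("valid socket address");
    let mut tracker = InvalidSolutionTracker::new(2);
    for _ in 0..3 {
        if tracker.record_invalid(peer) {
            println!("blocking {peer}: too many invalid solutions");
        }
    }
    println!("blocked: {}", tracker.is_blocked(&peer));
}
```

A real implementation would likely also need the counts to decay over time, so that an honest peer relaying a stale solution after an epoch change is not blocked permanently.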

raychu86 commented 4 months ago

This is a known issue, but thank you for flagging this again.

Our current recommended mitigation is for validators to run their own "Core Clients", which act as the filtering mechanism for valid solutions/transactions that get sent to the validator.

Validators should not be arbitrarily exposed to external clients or REST requests that may be malicious; the clients should be the ones interfacing externally and acting as the line of defense for the validators. With this approach, the validators should in theory be protected from these types of attacks.

elderhammer commented 4 months ago

> Our current recommended mitigation is for validators to run their own "Core Clients", which act as the filtering mechanism for valid solutions/transactions that get sent to the validator.

The method you mentioned can filter attacks coming over P2P, but it does not seem able to filter attacks coming over BFT, because the validator requests the fake solutions directly from the malicious validator through TransmissionRequest.

vicsn commented 4 months ago

> The method you mentioned can filter attacks coming over P2P, but it does not seem able to filter attacks coming over BFT, because the validator requests the fake solutions directly from the malicious validator through TransmissionRequest.

Yes, that's correct. Post-launch, the community should consider requiring a small fee for solutions and looking into batch verification to further reduce the attack surface.