relab / hotstuff


several problems in hotstuff run #33

Closed: SenjiMuramasa closed this issue 2 years ago

SenjiMuramasa commented 2 years ago

Hi, I recently started to learn about HotStuff, and I tried to run your code, but I ran into a few problems, which I will show in the screenshots. I didn't make any changes when I ran the code; I just did the following steps:

1. make
2. ./hotstuff run

But the logs seem to indicate something unusual. Do these logs point to a problem on my side, or are they normal? Thank you very much.

[image 1] [image 2] [image 3] [image 4]

johningve commented 2 years ago

Hi!

None of these log messages indicate any serious issues. It's likely that they were all caused by messages arriving in a different order than expected. That can happen often, especially in a low-latency network.

I'll take a closer look at the sources of these log messages and consider if they should be reclassified as "debug" messages.

SenjiMuramasa commented 2 years ago

Thank you for your answer! I am trying to do some experiments with your code, so I am wondering if you tested the performance of the code? In addition, I see some content about fork in byzantine.go. Can fork be used directly? And how to use it? I am new to HotStuff and Go, so my questions may be basic. I am very sorry to bother you. Thank you very much.

johningve commented 2 years ago

> I am trying to do some experiments with your code, so I am wondering if you tested the performance of the code?

We did some comparisons with libhotstuff a long time ago, but I haven't looked at the performance in a while.

> In addition, I see some content about fork in byzantine.go. Can fork be used directly? And how to use it?

Uhh, I just noticed that it doesn't work on master. 😅 I've pushed a commit to fix it! You can enable it using the --byzantine flag:

./hotstuff run --byzantine="fork:1"

The example above enables the "fork" behavior for a single replica. You can also combine several byzantine behaviors:

./hotstuff run --byzantine="fork:1,silence:2"

This would enable the "fork" behavior for one replica, and the "silence" behavior for two replicas.
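
For reference, the flag value is a comma-separated list of name:count pairs. Here is a minimal sketch of how such a value could be parsed in Go (a hypothetical helper for illustration, not the actual parser in the repository):

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseByzantine parses a flag value such as "fork:1,silence:2" into a map
// from behavior name to the number of replicas that should run it.
// Hypothetical helper for illustration; not the real relab/hotstuff parser.
func parseByzantine(s string) (map[string]int, error) {
	out := make(map[string]int)
	for _, entry := range strings.Split(s, ",") {
		name, countStr, ok := strings.Cut(entry, ":")
		if !ok {
			return nil, fmt.Errorf("invalid entry %q: want name:count", entry)
		}
		count, err := strconv.Atoi(countStr)
		if err != nil {
			return nil, fmt.Errorf("invalid count in %q: %w", entry, err)
		}
		out[strings.TrimSpace(name)] = count
	}
	return out, nil
}

func main() {
	m, err := parseByzantine("fork:1,silence:2")
	if err != nil {
		panic(err)
	}
	fmt.Println(m) // map[fork:1 silence:2]
}
```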

SenjiMuramasa commented 2 years ago

Thank you very much!

I have tried using a fork attack, but the code gets stuck (screenshot below). So I wonder: is this a sign of a successful attack? Should this trigger the timeout mechanism to move to the next view?

[image: fork]

Also, I want to try using batching to improve the system throughput. I changed batch-size in the run.go file, but the result was 0 (screenshots below). Are there any other configuration changes that need to be made?

[image: batch1] [image: batch2]

Again, thank you for your patient reply.

johningve commented 2 years ago

It looks like the default setting for max-concurrent, which controls the number of concurrent commands a client can issue, is too low for a fork scenario. This setting must be changed if you change the batch size. It should be at least 4 times the batch size if you are using a single client.
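
For example, with a single client and a batch size of 100, max-concurrent should be at least 400. Assuming the batch size is exposed as a flag alongside the other options (I'm inferring the flag name from the default in run.go; check --help to confirm), that would look something like:

./hotstuff run --batch-size=100 --max-concurrent=400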

Going back to the fork attack, by increasing max-concurrent from 4 (the default) to 10, I get this result:

executed: 4900, failed: 1634

The failed commands show that the fork attack was successful.

Now, you can compare with the fasthotstuff implementation:

./hotstuff run --consensus="fasthotstuff" --byzantine="fork:1" --max-concurrent=10 --timeout-multiplier=1

(setting --timeout-multiplier to 1 here stops the view-synchronizer from increasing the view duration when a timeout occurs)
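
As a rough illustration of what the multiplier does (a sketch of the general exponential-backoff idea, not the actual view-synchronizer code):

```go
package main

import (
	"fmt"
	"time"
)

// nextViewDuration illustrates the idea behind --timeout-multiplier: after a
// view times out, the next view's duration is scaled by the multiplier.
// With a multiplier of 1, the duration never grows. Hypothetical helper;
// the real logic lives in the view-synchronizer.
func nextViewDuration(current time.Duration, multiplier float64) time.Duration {
	return time.Duration(float64(current) * multiplier)
}

func main() {
	d := 100 * time.Millisecond
	for i := 0; i < 3; i++ {
		d = nextViewDuration(d, 2) // doubles after each timeout
		fmt.Println(d)             // 200ms, 400ms, 800ms
	}
	fmt.Println(nextViewDuration(d, 1)) // multiplier 1: stays at 800ms
}
```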

This gives me the result: executed: 1510, failed: 0, showing that the fasthotstuff implementation is indeed resistant to forking attacks :)

SenjiMuramasa commented 2 years ago

Thank you! It helps a lot. After increasing max-concurrent from 4 to 10, the code started to run, but it seems it can only run for a short time. I tried increasing max-concurrent further, and it worked a little better, but it still doesn't run to the end (I increased duration to 60 seconds; screenshot below). I'm using a virtual machine to run the code on Linux21. Is that the cause of the problem?

[image: fork]

SenjiMuramasa commented 2 years ago

Hi, I tried to run the code many times; sometimes it would run longer, and sometimes it would just get stuck. Judging by the final numbers of executed and failed commands, the attack was successful.

These two images show two typical cases (duration=10s, max-concurrent=10). The first one got stuck as soon as it started running:

[image 2]

The second one ran almost to the end of the duration:

[image 1]

Do you run into this kind of problem too, or is it just me, for some other reason?

johningve commented 2 years ago

Sorry for the slow response. It looks to me like this is the client getting stuck due to some unacknowledged commands (or maybe unordered?). I am not quite sure why it happens. To remedy this, I have added a timeout to the client that is configurable through the --client-timeout flag (defaults to 500ms).
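
A minimal sketch of that kind of client-side timeout (illustrative only; the names here are hypothetical stand-ins for the actual client code):

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// awaitReply waits for a command's acknowledgement, but gives up after the
// given timeout so a lost or unordered reply cannot stall the client forever.
// Hypothetical sketch of the idea behind --client-timeout.
func awaitReply(ctx context.Context, replies <-chan string, timeout time.Duration) (string, error) {
	ctx, cancel := context.WithTimeout(ctx, timeout)
	defer cancel()
	select {
	case r := <-replies:
		return r, nil
	case <-ctx.Done():
		return "", errors.New("client timed out waiting for reply")
	}
}

func main() {
	replies := make(chan string) // no reply ever arrives: simulates a stuck run
	_, err := awaitReply(context.Background(), replies, 500*time.Millisecond)
	fmt.Println(err) // client timed out waiting for reply
}
```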

SenjiMuramasa commented 2 years ago

Thank you for your patience. It always takes time to deal with problems. In my tests, I found that when the code got stuck under a fork attack, it kept returning from the consensus.go file as shown in the screenshots below: in the Get() function, the select always ends up in the case <-ctx.Done() branch.

[image 1] [image 2]

My Go skills are not good enough to solve this problem myself. I just hope this little discovery helps.
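
In other words, the code path described looks roughly like this (an illustrative sketch of the select pattern, not the actual consensus.go or cmdcache code):

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// cmdQueue is a stand-in for the command cache that Get reads from.
type cmdQueue struct {
	cmds chan string
}

// Get blocks until a command is available or the context is cancelled.
// When the system is stuck, no command ever arrives, so the select always
// takes the <-ctx.Done() branch and Get returns ok == false, matching the
// behavior observed above.
func (q *cmdQueue) Get(ctx context.Context) (cmd string, ok bool) {
	select {
	case cmd = <-q.cmds:
		return cmd, true
	case <-ctx.Done():
		return "", false
	}
}

func main() {
	q := &cmdQueue{cmds: make(chan string)} // empty queue: simulates a stuck run
	ctx, cancel := context.WithTimeout(context.Background(), 100*time.Millisecond)
	defer cancel()
	_, ok := q.Get(ctx)
	fmt.Println(ok) // false: Get returned via <-ctx.Done()
}
```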

Thanks again for your reply. You really helped me a lot.

In addition, I just started using GitHub, so I haven't formatted some content that should be bold or marked up properly. Please forgive me.

SenjiMuramasa commented 2 years ago

Hello, here I go again~~ I tried overwriting two blocks with a fork attack, but it seems the malicious block didn't get into the chain. What can I modify so that the malicious blocks get saved in the chain? Thank you!

johningve commented 2 years ago

It's hard for me to tell exactly why the malicious block is not accepted without more detail, but here is how it normally works:

  1. One of the replicas creates the new block and proposes it to the other replicas. This is usually done by the Consensus.Propose method. However, the ProposeRuler interface allows this behavior to be changed, which is how the byzantine package works.
  2. All replicas then process the block in the OnPropose function. This function verifies the block's quorum certificate, checks the identity of the proposer, and then checks the VoteRule of the consensus protocol. If all these checks pass, the command(s) stored in the block are inspected by the Acceptor module; if it accepts them, the block is accepted. Currently, the Acceptor checks the serial numbers of commands to ensure that only new commands are accepted. This is implemented in the file replica/cmdcache.go. (A simplified sketch of this check sequence follows the list.)
  3. Depending on the consensus protocol, the block will be committed into the chain when its grandchild or great-grandchild is proposed (assuming that all blocks in between are accepted too).
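
A simplified, self-contained sketch of that check sequence (the type and method names here are illustrative stand-ins, not the actual relab/hotstuff API):

```go
package main

import "fmt"

// Minimal stand-in types for illustration.
type Block struct {
	View    int
	Command string
	QCValid bool // stands in for a verifiable quorum certificate
}

type Proposal struct {
	Block    Block
	Proposer int
}

type core struct {
	leaderFor func(view int) int    // leader-rotation stand-in
	voteRule  func(Proposal) bool   // the consensus protocol's VoteRule
	accept    func(cmd string) bool // Acceptor, e.g. the serial-number check
}

// onPropose mirrors the check sequence described in step 2 above.
func (c *core) onPropose(p Proposal) bool {
	if !p.Block.QCValid { // 1. verify the block's quorum certificate
		return false
	}
	if p.Proposer != c.leaderFor(p.Block.View) { // 2. check the proposer's identity
		return false
	}
	if !c.voteRule(p) { // 3. check the protocol's vote rule
		return false
	}
	if !c.accept(p.Block.Command) { // 4. let the Acceptor inspect the command(s)
		return false
	}
	return true // accepted; committed later, once a descendant block is proposed
}

func main() {
	c := &core{
		leaderFor: func(view int) int { return view % 4 },
		voteRule:  func(Proposal) bool { return true },
		accept:    func(cmd string) bool { return cmd != "" },
	}
	p := Proposal{Block: Block{View: 8, Command: "op", QCValid: true}, Proposer: 0}
	fmt.Println(c.onPropose(p)) // true: all checks pass
}
```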

PS: I'm converting this issue into a discussion, as I think that is more appropriate ;)