Open FurkanKarakas opened 4 years ago
EDIT: I added some more information about the block size. I also added two more test scenarios. I can extend this test idea if we think we should: https://github.com/FurkanKarakas/tendermint/blob/release/v0.33.1/Furkan/Reports/Semester_Project_Report.pdf
I have a few comments:
It's nice to have the rendered .pdf and look directly at that for an overview. But to understand what exactly is "new" in the report, it would be useful to have the .tex source file; in that file I'd be able to see exactly which lines you added with each commit, so the reader knows what's new and relevant.
I looked at the block_size.go script, and there are a few important ways you can improve it.
The script is a bit difficult to understand. One issue is that some structures have confusing names (Foo here). It would be great to have consistent and correct names in your scripts.
Also, as you mentioned in your report, a block's size in bytes is not simply a multiple of the number of transactions it contains. So instead of the size in bytes, it would be more valuable if your script block_size.go reported the size as a number of transactions.
The script works with a fixed height, but it would be more valuable if we can extract the block size for all heights.
If you feel like these are important improvements, I suggest you open a separate issue to track these problems with the block_size.go script.
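A minimal sketch of how the transaction count (and the decoded transaction bytes) could be collected for a range of heights rather than one fixed height. It is not the block_size.go script itself: the function names are hypothetical, and it assumes the node's RPC `/block?height=N` endpoint returns the usual JSON envelope with base64-encoded transactions under `result.block.data.txs`, and that the RPC listens on `localhost:26657`.

```python
import base64
import json
from urllib.request import urlopen


def block_stats(block_response):
    """Return (tx_count, tx_bytes) for one parsed /block RPC response."""
    # "txs" may be null for an empty block, so fall back to an empty list.
    txs = block_response["result"]["block"]["data"].get("txs") or []
    # Assumption: each transaction is base64-encoded in the JSON response.
    tx_bytes = sum(len(base64.b64decode(tx)) for tx in txs)
    return len(txs), tx_bytes


def stats_for_heights(fetch_block, first, last):
    """Map every height in [first, last] to its (tx_count, tx_bytes)."""
    return {h: block_stats(fetch_block(h)) for h in range(first, last + 1)}


def rpc_fetcher(base_url="http://localhost:26657"):
    """Build a fetch_block callable backed by a node's RPC (assumed URL)."""
    def fetch_block(height):
        with urlopen(f"{base_url}/block?height={height}") as resp:
            return json.load(resp)
    return fetch_block
```

For example, `stats_for_heights(rpc_fetcher(), 1, 50)` would query a running node for heights 1 through 50. Passing the fetcher in as an argument keeps the counting logic testable without a live network.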
For 3.: in order to run the long-delay network test, I changed the configuration files so that I do not get a "timeout reached" error. This may be why all TXs are bundled into a single block in the long-delay simulations. This does not happen when I simulate a low network delay. I will fix the other points in a future commit.
EDIT: The tex file can now be found at: https://github.com/FurkanKarakas/tendermint/blob/release/v0.33.1/Furkan/Reports/project_report.tex It sits alongside the pdf file.
EDIT 2: Now I can print the block size according to the number of transactions present inside.
UPDATE: After upgrading to Ubuntu 20.04 LTS, there seems to be a problem with the original docker installation that prevents me from running the tendermint localnode cluster. For this reason I set up a virtual machine running Ubuntu 18.04 LTS again, and I am planning to keep working on this VM. Do you think that is fine? Let me know your opinion.
I don't see any problems with that. Performance might be affected, but I'm not even sure about that, since current container systems are quite impressive. So your suggestion seems good!
UPDATE: In order to simulate Byzantine behavior in Tendermint, I was advised to have two nodes sign the blocks with the same private key. To achieve this, I copied the priv_val_key.json file of node0, located in build/node0/config/, into the folder build/node1/config/, thereby replacing node1's file with node0's. With this, I wanted to make it look like node1 is signing messages in the name of node0. However, I get the following error if I do that: Found conflicting vote from ourselves. Did you unsafe_reset a validator? I think in that case the two nodes are equally powerful, i.e. there is no distinction between Byzantine and honest nodes.
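Before starting the cluster, it may help to confirm that the copy really left both nodes with the same validator identity. The following is only a sketch under assumptions: the helper names are hypothetical, and it assumes each key file is JSON with a top-level `address` field and a `pub_key` object holding a `value` field.

```python
import json


def load_key(path):
    """Parse one validator key file from disk."""
    with open(path) as f:
        return json.load(f)


def same_validator_key(key_a, key_b):
    """True if two parsed key files carry the same address and public key."""
    # Assumption: the files expose "address" and "pub_key" -> "value".
    return (
        key_a["address"] == key_b["address"]
        and key_a["pub_key"]["value"] == key_b["pub_key"]["value"]
    )
```

With the paths from this thread, the check would be `same_validator_key(load_key("build/node0/config/priv_val_key.json"), load_key("build/node1/config/priv_val_key.json"))`.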
I think the message Found conflicting vote from ourselves. Did you unsafe_reset a validator? is a sign that you have actually succeeded in simulating Byzantine behavior. However, in some of my experiments, I also got this message whenever the network was not set up properly (e.g., incorrect validator restart, or incorrect validator configuration).
Is it possible to investigate the message/error logs on the validators? Ideally, we have to convince ourselves that the attack worked and establish the sequence of protocol messages that the validators sent, namely what exactly the conflicting votes are. Once we find those votes (in the logs), I think we can reasonably conclude that this adversarial attack succeeded and the network stopped.
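As a starting point for that investigation, here is a rough sketch that scans a validator's log text for the conflicting-vote message and returns the timestamp of each occurrence. It assumes log lines use the `E[date|time] message key=value` prefix format; the function name is hypothetical, and the sketch only locates the entries, it does not decode the votes themselves.

```python
import re

# Assumed line prefix: one level letter, then [date|time], then the message.
LOG_LINE = re.compile(
    r"^(?P<level>[A-Z])\[(?P<date>[\d-]+)\|(?P<time>[\d:.]+)\]\s+(?P<rest>.*)$"
)
CONFLICT = "Found conflicting vote from ourselves"


def conflicting_vote_lines(log_text):
    """Return (timestamp, message) pairs for every conflicting-vote entry."""
    hits = []
    for line in log_text.splitlines():
        m = LOG_LINE.match(line.strip())
        if m and CONFLICT in m["rest"]:
            hits.append((f"{m['date']} {m['time']}", m["rest"]))
    return hits
```

Running this over the logs of every validator and comparing the timestamps would show which nodes reported the conflict and in what order.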
UPDATE: I tried something else. I modified the private key of the Byzantine node and gave it the public key of an honest node. After running this setup, I now get an authentication error. In that case, I think the Byzantine node cannot sign with the correct private key, since it should not be able to know the private key of the trustworthy node. However, in this test I again do not observe that the Byzantine node gets jailed (it should be jailed, as far as I understand from the specifications of Tendermint).
Can you explain what you mean by "I do not observe that the Byzantine node gets jailed"? Are you looking at the logs? As I mentioned in my previous comment, the only way to find what is really happening is to look at the logs. It would be great if you can do that. Otherwise I don't understand what you mean by "jailed".
As far as I understand from the documentation of Tendermint, if a validating node behaves in a Byzantine way, it gets 'jailed', i.e. banned from participating in the algorithm for some time. There are some requirements for getting out of jail, but I am not sure about the details.
Here is the log file of the Byzantine node (node1, which is masquerading as node0 by adopting its public key):
I[2020-05-12|00:51:51.476] Version info module=main software=0.33.1 block=10 p2p=7
I[2020-05-12|00:51:51.485] Starting Node module=main impl=Node
I[2020-05-12|00:51:51.489] Started node module=main nodeInfo="{ProtocolVersion:{P2P:7 Block:10 App:1} DefaultNodeID:0827b304e8e43dcaddc57ec69b28126aa593088f ListenAddr:tcp://0.0.0.0:26656 Network:chain-myTPp4 Version:0.33.1 Channels:4020212223303800 Moniker:D979AE3C057F3D53 Other:{TxIndex:on RPCAddress:tcp://0.0.0.0:26657}}"
E[2020-05-12|00:51:51.491] dialing failed (attempts: 1): auth failure: handshake failed: EOF module=pex addr=2fa6b995be63c1a2f7192eee443c363a770cd5ad@192.167.10.2:26656
E[2020-05-12|00:51:51.492] dialing failed (attempts: 1): auth failure: handshake failed: EOF module=pex addr=072cfb17151eb4af542d5979d472d098a9f0556b@192.167.10.4:26656
E[2020-05-12|00:51:51.493] dialing failed (attempts: 1): auth failure: secret conn failed: challenge verification failed module=pex addr=0a57cbfccf08520d6411a059ada8e93385c5fe10@192.167.10.3:26656
E[2020-05-12|00:51:51.494] dialing failed (attempts: 1): auth failure: handshake failed: EOF module=pex addr=917f1592559aaaf4b206f9d2b17a6cf173b11706@192.167.10.6:26656
E[2020-05-12|00:51:51.495] dialing failed (attempts: 1): auth failure: handshake failed: EOF module=pex addr=818f50f9f1b8fe47009e96550a9cbb820111ddf8@192.167.10.5:26656
E[2020-05-12|00:51:51.907] Error dialing peer module=p2p err="auth failure: handshake failed: EOF"
E[2020-05-12|00:51:51.992] Error dialing peer module=p2p err="auth failure: handshake failed: EOF"
E[2020-05-12|00:51:52.992] Error dialing peer module=p2p err="auth failure: handshake failed: EOF"
E[2020-05-12|00:51:53.132] Error dialing peer module=p2p err="auth failure: secret conn failed: challenge verification failed"
E[2020-05-12|00:51:53.591] Error dialing peer module=p2p err="auth failure: handshake failed: EOF"
E[2020-05-12|00:52:21.492] dialing failed (attempts: 2): auth failure: handshake failed: EOF module=pex addr=2fa6b995be63c1a2f7192eee443c363a770cd5ad@192.167.10.2:26656
E[2020-05-12|00:52:21.493] dialing failed (attempts: 2): auth failure: secret conn failed: challenge verification failed module=pex addr=0a57cbfccf08520d6411a059ada8e93385c5fe10@192.167.10.3:26656
E[2020-05-12|00:52:21.493] dialing failed (attempts: 2): auth failure: handshake failed: EOF module=pex addr=072cfb17151eb4af542d5979d472d098a9f0556b@192.167.10.4:26656
E[2020-05-12|00:52:21.493] dialing failed (attempts: 2): auth failure: handshake failed: EOF module=pex addr=818f50f9f1b8fe47009e96550a9cbb820111ddf8@192.167.10.5:26656
E[2020-05-12|00:52:21.494] dialing failed (attempts: 2): auth failure: handshake failed: EOF module=pex addr=917f1592559aaaf4b206f9d2b17a6cf173b11706@192.167.10.6:26656
E[2020-05-12|00:52:51.511] dialing failed (attempts: 3): auth failure: secret conn failed: challenge verification failed module=pex addr=0a57cbfccf08520d6411a059ada8e93385c5fe10@192.167.10.3:26656
E[2020-05-12|00:52:51.511] dialing failed (attempts: 3): auth failure: handshake failed: EOF module=pex addr=072cfb17151eb4af542d5979d472d098a9f0556b@192.167.10.4:26656
E[2020-05-12|00:52:51.511] dialing failed (attempts: 3): auth failure: handshake failed: EOF module=pex addr=2fa6b995be63c1a2f7192eee443c363a770cd5ad@192.167.10.2:26656
E[2020-05-12|00:52:51.512] dialing failed (attempts: 3): auth failure: handshake failed: EOF module=pex addr=917f1592559aaaf4b206f9d2b17a6cf173b11706@192.167.10.6:26656
E[2020-05-12|00:52:51.512] dialing failed (attempts: 3): auth failure: handshake failed: EOF module=pex addr=818f50f9f1b8fe47009e96550a9cbb820111ddf8@192.167.10.5:26656
Most of the entries are handshake failures, i.e. authentication errors. So what I actually simulated here is that the authentication does not work, which is Byzantine behavior in a sense. So I think I performed a Byzantine action, but I really hoped to see node1 get jailed.
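To summarize a log like the one above without reading it entry by entry, the `dialing failed` lines can be tallied per peer and failure reason. This is a small sketch, assuming the entry layout shown in the excerpt (`dialing failed (attempts: N): auth failure: <reason> module=pex addr=<id>@<ip:port>`); the function name is hypothetical.

```python
import re
from collections import Counter

# Pattern modeled on the "dialing failed" entries in the log excerpt.
DIAL = re.compile(
    r"dialing failed \(attempts: (?P<attempt>\d+)\): auth failure: "
    r"(?P<reason>.+?)\s+module=pex\s+addr=(?P<peer>\S+)"
)


def tally_dial_failures(log_text):
    """Count 'dialing failed' entries per (peer ip:port, failure reason)."""
    counts = Counter()
    for m in DIAL.finditer(log_text):
        # The addr field has the form <node id>@<ip:port>; keep the address.
        peer_addr = m["peer"].split("@", 1)[1]
        counts[(peer_addr, m["reason"])] += 1
    return counts
```

Applied to the excerpt above, the tally separates two cases: 192.167.10.3 is the only peer that fails with `secret conn failed: challenge verification failed`, while the other four peers fail with `handshake failed: EOF` on every attempt.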