Progression with the project (21 Apr 2020)

FurkanKarakas commented 4 years ago

It is important to look at how large the blocks are rather than at the size of individual transactions. I can add a paragraph on this to the report.
I will hard-code changes into the Tendermint codebase so that one of the validators sends a message that it is not supposed to send. The different phases are PREVOTE, PRECOMMIT, etc., and we will send hard-coded values in these phases. I need the following steps:
- Identify the different phases in the code base, where they are etc.
- Hardcode some stuff: if I am validator ID node0, then do this... (perform Byzantine behaviour)
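
To make the second step concrete, here is a minimal sketch of what such a hard-coded branch could look like. It is a toy model only: the `Vote` type, the `votesFor` function, and the `"HARDCODED-"` value are hypothetical names made up for illustration and are not part of the Tendermint codebase. Only the phase names and the `node0` ID come from the plan above.

```go
package main

import "fmt"

// Step names follow the phases mentioned above. Everything else in this
// file is a hypothetical toy model, not Tendermint's real API.
type Step string

const (
	Prevote   Step = "PREVOTE"
	Precommit Step = "PRECOMMIT"
)

// Vote is a simplified stand-in for a consensus vote.
type Vote struct {
	Validator string
	Height    int64
	Round     int
	Step      Step
	BlockHash string
}

// byzantineID is the hard-coded validator that misbehaves (assumption).
const byzantineID = "node0"

// votesFor returns the votes a validator sends in a given step.
// An honest validator sends one vote for the proposed block; the
// hard-coded Byzantine validator also sends a second, conflicting vote.
func votesFor(id string, height int64, round int, step Step, proposed string) []Vote {
	honest := Vote{id, height, round, step, proposed}
	if id != byzantineID {
		return []Vote{honest}
	}
	conflicting := honest
	conflicting.BlockHash = "HARDCODED-" + proposed // a value nobody proposed
	return []Vote{honest, conflicting}
}

func main() {
	for _, id := range []string{"node0", "node1"} {
		for _, v := range votesFor(id, 5, 0, Prevote, "ABC123") {
			fmt.Printf("%s %s h=%d r=%d block=%s\n", v.Validator, v.Step, v.Height, v.Round, v.BlockHash)
		}
	}
}
```

In the real codebase the branch would sit wherever the validator builds and signs its vote for each phase, which is what the first step is meant to locate.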

FurkanKarakas commented 4 years ago

EDIT: I added some more information about the block size. I also added two more test scenarios. I can extend this test idea if we think we should: https://github.com/FurkanKarakas/tendermint/blob/release/v0.33.1/Furkan/Reports/Semester_Project_Report.pdf

adizere commented 4 years ago

I have a few comments:

Can you please add the sources of your .pdf file to the repo? The .tex should be enough (no need for .aux or other files).

It's nice to have the rendered .pdf and look directly at that for an overview. But for understanding what exactly is "new" in the report, it would be useful to have the .tex source file; in that file I'd be able to see exactly which lines you added with each commit, so the reader knows what's new & relevant.

I looked at the block_size.go script, and there are a few important ways you can improve it.
- The script is a bit difficult to understand. One issue is that some structures have confusing names (Foo here). It would be great to have consistent and correct names in your scripts.
- Also, as you mentioned in your report, the size of a block in bytes is not simply a multiple of the number of transactions. So instead of the size in bytes, it would be more valuable if block_size.go reported the size in number of transactions.
- The script works with a fixed height, but it would be more valuable if we can extract the block size for all heights.
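
As a rough illustration of the last two points, the sketch below counts transactions per block from JSON bodies, one body per height. It assumes the bodies have the shape `result.block.header.height` and `result.block.data.txs`, which is my understanding of the RPC `/block` response and should be checked against a running node. The `blockResponse` type, the `txCount` function, and the sample bodies are hypothetical and hand-written, and fetching the bodies for every height is left out.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// blockResponse mirrors only the fields needed here. The JSON shape is an
// assumption about the RPC /block response, not a verified schema.
type blockResponse struct {
	Result struct {
		Block struct {
			Header struct {
				Height string `json:"height"`
			} `json:"header"`
			Data struct {
				Txs []string `json:"txs"`
			} `json:"data"`
		} `json:"block"`
	} `json:"result"`
}

// txCount parses one response body and returns the height and the number
// of transactions in that block.
func txCount(body []byte) (string, int, error) {
	var r blockResponse
	if err := json.Unmarshal(body, &r); err != nil {
		return "", 0, err
	}
	return r.Result.Block.Header.Height, len(r.Result.Block.Data.Txs), nil
}

func main() {
	// Hand-written sample bodies standing in for responses at two heights.
	samples := []string{
		`{"result":{"block":{"header":{"height":"1"},"data":{"txs":null}}}}`,
		`{"result":{"block":{"header":{"height":"2"},"data":{"txs":["YQ==","Yg=="]}}}}`,
	}
	for _, s := range samples {
		h, n, err := txCount([]byte(s))
		if err != nil {
			fmt.Println("parse error:", err)
			continue
		}
		fmt.Printf("height %s: %d transactions\n", h, n)
	}
}
```

Looping over heights from 1 up to the latest one and calling `txCount` on each response would give the per-height picture instead of a single fixed height.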

If you feel like these are important improvements, I suggest you open a separate issue to track these problems with the block_size.go script.

I did not understand your remark that "all transactions are bundled in a single block". Is that true? This means that the corresponding execution had a single "useful" block and that's a problem...

FurkanKarakas commented 4 years ago

For point 3: to simulate the long-network-delay test, I changed the configuration files so that I do not get a "timeout reached" error. This may be why all TXs are bundled in a single block when I simulate long network delays. It does not happen when I simulate a low network delay. For the other points -- I will fix them in a future commit.

FurkanKarakas commented 4 years ago

EDIT: The .tex file can now be found at: https://github.com/FurkanKarakas/tendermint/blob/release/v0.33.1/Furkan/Reports/project_report.tex It sits alongside the .pdf file.

EDIT 2: I can now print the block size as the number of transactions the block contains.

FurkanKarakas commented 4 years ago

UPDATE: After upgrading to Ubuntu 20.04 LTS, there seems to be a problem with the original docker installation that prevents me from running the tendermint localnode cluster. For this reason I set up a virtual machine again, running Ubuntu 18.04 LTS, and I plan to keep working on this VM. Do you think that is fine? Let me know your opinion.

adizere commented 4 years ago

I don't see any problems with that. Performance might be affected, but I'm not even sure about that, since current container systems are quite impressive. So your suggestion seems good!

FurkanKarakas commented 4 years ago

UPDATE: In order to simulate Byzantine behavior in Tendermint, I was advised to have two nodes sign the blocks with the same private key. To achieve this, I copied the priv_val_key.json file of node0, located in build/node0/config/, into the folder build/node1/config/, replacing node1's file with node0's. With this, I wanted to make it look like node1 is signing messages in the name of node0. However, I get the following error if I do that: "Found conflicting vote from ourselves. Did you unsafe_reset a validator?". I think in that case the two nodes are equally powerful, i.e. there is no distinction between Byzantine and honest nodes.

adizere commented 4 years ago

I think the message "Found conflicting vote from ourselves. Did you unsafe_reset a validator?" is a sign that you have actually succeeded in simulating Byzantine behavior. However, in some of my experiments, I was also getting this message whenever the network was not set up properly (e.g., incorrect validator restart, or incorrect validator configuration).

Is it possible to investigate the message/error logs on the validators? Ideally, we have to convince ourselves that the attack worked -- and establish the sequence of protocol messages that the validators sent, namely what exactly the conflicting votes are. Once we find those votes (in the logs), I think we can reasonably conclude that this adversarial attack succeeded and the network stopped.
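
Once votes have been extracted from the logs, checking for conflicts is mechanical. The sketch below assumes the votes are already available as simple records; the `VoteRecord` type and the `conflicts` function are hypothetical, and parsing real log entries into such records is not shown because it depends on the actual log format.

```go
package main

import "fmt"

// VoteRecord is a hypothetical, simplified record of one vote recovered
// from validator logs; real log entries would need their own parser.
type VoteRecord struct {
	Validator string
	Height    int64
	Round     int
	Type      string // e.g. "PREVOTE" or "PRECOMMIT"
	BlockHash string
}

// voteKey identifies the slot in which a validator may cast only one vote.
type voteKey struct {
	Validator string
	Height    int64
	Round     int
	Type      string
}

// conflicts returns pairs of votes where the same validator voted for two
// different blocks in the same height, round, and vote type.
func conflicts(votes []VoteRecord) [][2]VoteRecord {
	first := make(map[voteKey]VoteRecord)
	var out [][2]VoteRecord
	for _, v := range votes {
		k := voteKey{v.Validator, v.Height, v.Round, v.Type}
		prev, seen := first[k]
		if !seen {
			first[k] = v
			continue
		}
		if prev.BlockHash != v.BlockHash {
			out = append(out, [2]VoteRecord{prev, v})
		}
	}
	return out
}

func main() {
	votes := []VoteRecord{
		{"node0", 10, 0, "PREVOTE", "AAAA"},
		{"node1", 10, 0, "PREVOTE", "AAAA"},
		{"node0", 10, 0, "PREVOTE", "BBBB"}, // conflicts with the first entry
	}
	for _, c := range conflicts(votes) {
		fmt.Printf("%s voted for %s and %s at height %d round %d\n",
			c[0].Validator, c[0].BlockHash, c[1].BlockHash, c[0].Height, c[0].Round)
	}
}
```

Votes in different rounds or of different types are deliberately not flagged, since a validator may legitimately change its vote between rounds.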

FurkanKarakas commented 4 years ago

UPDATE: I tried something else. I modified the key of the Byzantine node and gave it the public key of an honest node. After running the setup, I now get an authentication error. I think in that case the Byzantine node cannot sign with the correct private key, since it should not be able to know the private key of the trustworthy node. However, in this test too, I do not observe that the Byzantine node gets jailed (it should be jailed, as far as I understand from the specifications of Tendermint).

adizere commented 4 years ago

Can you explain what you mean by

I do not observe that the Byzantine node gets jailed

Are you looking at the logs? As I mentioned in my previous comment, the only way to find what is really happening is to look at the logs. It would be great if you can do that. Otherwise I don't understand what you mean by "jailed".

FurkanKarakas commented 4 years ago

As far as I understand from the documentation of Tendermint, if a validating node performs Byzantine behavior, it gets 'jailed', i.e. banned from participating in the algorithm for some time. There are some requirements to get out of jail, but I am not sure about the details. Here is the log file of the Byzantine node (node1, which is masquerading as node0 by adopting its public key):

I[2020-05-12|00:51:51.476] Version info module=main software=0.33.1 block=10 p2p=7
I[2020-05-12|00:51:51.485] Starting Node module=main impl=Node
I[2020-05-12|00:51:51.489] Started node module=main nodeInfo="{ProtocolVersion:{P2P:7 Block:10 App:1} DefaultNodeID:0827b304e8e43dcaddc57ec69b28126aa593088f ListenAddr:tcp://0.0.0.0:26656 Network:chain-myTPp4 Version:0.33.1 Channels:4020212223303800 Moniker:D979AE3C057F3D53 Other:{TxIndex:on RPCAddress:tcp://0.0.0.0:26657}}"
E[2020-05-12|00:51:51.491] dialing failed (attempts: 1): auth failure: handshake failed: EOF module=pex addr=2fa6b995be63c1a2f7192eee443c363a770cd5ad@192.167.10.2:26656
E[2020-05-12|00:51:51.492] dialing failed (attempts: 1): auth failure: handshake failed: EOF module=pex addr=072cfb17151eb4af542d5979d472d098a9f0556b@192.167.10.4:26656
E[2020-05-12|00:51:51.493] dialing failed (attempts: 1): auth failure: secret conn failed: challenge verification failed module=pex addr=0a57cbfccf08520d6411a059ada8e93385c5fe10@192.167.10.3:26656
E[2020-05-12|00:51:51.494] dialing failed (attempts: 1): auth failure: handshake failed: EOF module=pex addr=917f1592559aaaf4b206f9d2b17a6cf173b11706@192.167.10.6:26656
E[2020-05-12|00:51:51.495] dialing failed (attempts: 1): auth failure: handshake failed: EOF module=pex addr=818f50f9f1b8fe47009e96550a9cbb820111ddf8@192.167.10.5:26656
E[2020-05-12|00:51:51.907] Error dialing peer module=p2p err="auth failure: handshake failed: EOF"
E[2020-05-12|00:51:51.992] Error dialing peer module=p2p err="auth failure: handshake failed: EOF"
E[2020-05-12|00:51:52.992] Error dialing peer module=p2p err="auth failure: handshake failed: EOF"
E[2020-05-12|00:51:53.132] Error dialing peer module=p2p err="auth failure: secret conn failed: challenge verification failed"
E[2020-05-12|00:51:53.591] Error dialing peer module=p2p err="auth failure: handshake failed: EOF"
E[2020-05-12|00:52:21.492] dialing failed (attempts: 2): auth failure: handshake failed: EOF module=pex addr=2fa6b995be63c1a2f7192eee443c363a770cd5ad@192.167.10.2:26656
E[2020-05-12|00:52:21.493] dialing failed (attempts: 2): auth failure: secret conn failed: challenge verification failed module=pex addr=0a57cbfccf08520d6411a059ada8e93385c5fe10@192.167.10.3:26656
E[2020-05-12|00:52:21.493] dialing failed (attempts: 2): auth failure: handshake failed: EOF module=pex addr=072cfb17151eb4af542d5979d472d098a9f0556b@192.167.10.4:26656
E[2020-05-12|00:52:21.493] dialing failed (attempts: 2): auth failure: handshake failed: EOF module=pex addr=818f50f9f1b8fe47009e96550a9cbb820111ddf8@192.167.10.5:26656
E[2020-05-12|00:52:21.494] dialing failed (attempts: 2): auth failure: handshake failed: EOF module=pex addr=917f1592559aaaf4b206f9d2b17a6cf173b11706@192.167.10.6:26656
E[2020-05-12|00:52:51.511] dialing failed (attempts: 3): auth failure: secret conn failed: challenge verification failed module=pex addr=0a57cbfccf08520d6411a059ada8e93385c5fe10@192.167.10.3:26656
E[2020-05-12|00:52:51.511] dialing failed (attempts: 3): auth failure: handshake failed: EOF module=pex addr=072cfb17151eb4af542d5979d472d098a9f0556b@192.167.10.4:26656
E[2020-05-12|00:52:51.511] dialing failed (attempts: 3): auth failure: handshake failed: EOF module=pex addr=2fa6b995be63c1a2f7192eee443c363a770cd5ad@192.167.10.2:26656
E[2020-05-12|00:52:51.512] dialing failed (attempts: 3): auth failure: handshake failed: EOF module=pex addr=917f1592559aaaf4b206f9d2b17a6cf173b11706@192.167.10.6:26656
E[2020-05-12|00:52:51.512] dialing failed (attempts: 3): auth failure: handshake failed: EOF module=pex addr=818f50f9f1b8fe47009e96550a9cbb820111ddf8@192.167.10.5:26656

Most of the entries are handshake failures, i.e. authentication errors. So what I actually simulated here is that authentication fails, which is Byzantine behavior in a sense. So I think I performed a Byzantine action, but I had hoped to see node1 get jailed.
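
To back up the "most of it" impression with numbers, a small tally over the log can help. This is a minimal sketch assuming the log is available as text with one entry per line; the `tallyAuthFailures` function is a hypothetical helper, and the two reason strings are copied from the log above.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// tallyAuthFailures counts log lines per failure reason. Lines without
// "auth failure" are ignored; unknown reasons are counted as "other".
func tallyAuthFailures(log string) map[string]int {
	counts := make(map[string]int)
	sc := bufio.NewScanner(strings.NewReader(log))
	for sc.Scan() {
		line := sc.Text()
		if !strings.Contains(line, "auth failure") {
			continue
		}
		switch {
		case strings.Contains(line, "handshake failed: EOF"):
			counts["handshake failed: EOF"]++
		case strings.Contains(line, "challenge verification failed"):
			counts["challenge verification failed"]++
		default:
			counts["other"]++
		}
	}
	return counts
}

func main() {
	// Three entries taken from the log above, one per line.
	sample := `I[2020-05-12|00:51:51.485] Starting Node module=main impl=Node
E[2020-05-12|00:51:51.491] dialing failed (attempts: 1): auth failure: handshake failed: EOF module=pex addr=2fa6b995be63c1a2f7192eee443c363a770cd5ad@192.167.10.2:26656
E[2020-05-12|00:51:53.132] Error dialing peer module=p2p err="auth failure: secret conn failed: challenge verification failed"`
	for reason, n := range tallyAuthFailures(sample) {
		fmt.Printf("%s: %d\n", reason, n)
	}
}
```

Note that nothing in this log is a consensus message: every error is raised while dialing peers, before any vote could be sent.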
