hyperledger-archives / fabric

THIS IS A READ-ONLY historic repository. Current development is at https://gerrit.hyperledger.org/r/#/admin/projects/fabric . pull requests not accepted
Apache License 2.0

obcpbft breaks under stress #798

Closed gongsu832 closed 8 years ago

gongsu832 commented 8 years ago

Hi, I was doing some performance tests with obcpbft enabled. I have 4 peers, vp0-vp3, running on 4 z/VM zLinux guests. Security is disabled. I used jmeter to do the tests. The elements of the jmeter thread group are:

The deployed chaincode is chaincode_example02 from the GitHub docs, deployed with the following REST call:

{
    "type": "GOLANG",
    "chaincodeID":{
        "path":"github.com/openblockchain/obc-peer/openchain/example/chaincode/chaincode_example02"
    },
    "ctorMsg": {
        "function":"init",
        "args":["a", "100", "b", "200"]
    }
}

The invoke REST call is:

{
  "chaincodeSpec":{
      "type": "GOLANG",
      "chaincodeID":{
      "name":"bb540edfc1ee2ac0f5e2ec6000677f4cd1c6728046d5e32dede7fea11a42f86a6943b76a8f9154f4792032551ed320871ff7b7076047e4184292e01e3421889c"
      },
      "ctorMsg":{
          "function":"invoke",
          "args":["a", "b", "10"]
      }
  }
}

And the query REST call is:

{
  "chaincodeSpec":{
      "type": "GOLANG",
      "chaincodeID":{
      "name":"bb540edfc1ee2ac0f5e2ec6000677f4cd1c6728046d5e32dede7fea11a42f86a6943b76a8f9154f4792032551ed320871ff7b7076047e4184292e01e3421889c"
      },
      "ctorMsg":{
          "function":"query",
          "args":["b"]
      }
  }
}
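As a side note, the three payloads above differ only in the function name and args, so a test driver might build them programmatically rather than hand-editing JSON. This is a minimal sketch (not from the issue); the chaincode name is the hash returned by the deploy call above:

```python
import json

# Hash returned by the deploy REST call in this issue.
CHAINCODE_NAME = ("bb540edfc1ee2ac0f5e2ec6000677f4cd1c6728046d5e32dede7fea1"
                  "1a42f86a6943b76a8f9154f4792032551ed320871ff7b7076047e4184292e01e3421889c")

def chaincode_request(function, args, name=CHAINCODE_NAME):
    # Shape matches the invoke/query REST bodies shown above.
    return {
        "chaincodeSpec": {
            "type": "GOLANG",
            "chaincodeID": {"name": name},
            "ctorMsg": {"function": function, "args": args},
        }
    }

invoke_body = json.dumps(chaincode_request("invoke", ["a", "b", "10"]))
query_body = json.dumps(chaincode_request("query", ["b"]))
```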

I set the loop to 5000 times. All the REST calls were sent to vp0. At some point, vp0 goes into some kind of error state and keeps printing the following in the log forever, even after the client has stopped sending requests:

15:19:38.228 [consensus/statetransfer] StateTransferState -> ERRO 1c58c name:"vp0"  could not retrieve state as recent as 0 from any of specified peers
15:19:38.314 [consensus/statetransfer] StateTransferState -> WARN 1c58d name:"vp0"  in tryOverPeers loop trying name:"vp1"  : name:"vp0"  received a corrupt delta from name:"vp1"  : Error unmarashaling size: unexpected EOF
15:19:38.374 [consensus/statetransfer] StateTransferState -> WARN 1c58e name:"vp0"  in tryOverPeers loop trying name:"vp2"  : name:"vp0"  received a corrupt delta from name:"vp2"  : Error unmarashaling size: unexpected EOF
15:19:38.453 [consensus/statetransfer] StateTransferState -> WARN 1c58f name:"vp0"  in tryOverPeers loop trying name:"vp3"  : name:"vp0"  received a corrupt delta from name:"vp3"  : Error unmarashaling size: unexpected EOF

On vp1-3, the log shows something like the following:

15:02:12.782 [consensus/obcpbft] pbftCore -> INFO 1c2f Replica 2 sending view-change, v:13, h:5950, |C|:1, |P|:1, |Q|:1
15:02:12.782 [consensus/obcpbft] pbftCore -> INFO 1c30 Replica 2 received view-change from replica 2, v:13, h:5950, |C|:1, |P|:1, |Q|:1
15:02:12.782 [consensus/obcpbft] pbftCore -> INFO 1c31 Replica 2 received view-change from replica 3, v:13, h:5950, |C|:1, |P|:1, |Q|:1
15:02:12.782 [consensus/obcpbft] pbftCore -> INFO 1c32 Replica 2 received view-change from replica 1, v:13, h:5950, |C|:1, |P|:1, |Q|:1
15:02:28.782 [consensus/obcpbft] pbftCore -> INFO 1c33 Replica 2 view change timer expired, sending view change
15:02:28.783 [consensus/obcpbft] pbftCore -> INFO 1c34 Replica 2 sending view-change, v:14, h:5950, |C|:1, |P|:1, |Q|:1
15:02:28.783 [consensus/obcpbft] pbftCore -> INFO 1c35 Replica 2 received view-change from replica 2, v:14, h:5950, |C|:1, |P|:1, |Q|:1
15:02:28.783 [consensus/obcpbft] pbftCore -> INFO 1c36 Replica 2 received view-change from replica 3, v:14, h:5950, |C|:1, |P|:1, |Q|:1
15:02:28.783 [consensus/obcpbft] pbftCore -> INFO 1c37 Replica 2 received view-change from replica 1, v:14, h:5950, |C|:1, |P|:1, |Q|:1
tuand27613 commented 8 years ago

This looks to be the same as #680 ?

Which commit are you running ?

gongsu832 commented 8 years ago

Commit tag d0dd6cc1393c224f3c87f31c52aff232a03027ea on Feb. 18.

jyellick commented 8 years ago

@gongsu832 Yes, the bug you are seeing was fixed a few commits after the level you are running. Please see the discussion of issue #680 which @tuand27613 referenced, and retry with a commit beyond the merge request that fixed that issue.

gongsu832 commented 8 years ago

OK I'll try to rebase my fork to include that fix. Thanks!

gongsu832 commented 8 years ago

I rebased to commit tag 3c5fbcdf8a288b7d5ed5aa0dae5d570472169e73 and did some tests. It appears to have fixed the problem. But I'm seeing another strange behavior, and I don't know whether it's expected or not. Basically, after deploying the chaincode, I send in 10K invoke requests. I can see that the primary validator prints out a lot of

14:53:49.496 [rest] ServerOpenchainREST -> INFO 1913 REST invoking chaincode...
14:53:49.499 [consensus/obcpbft] obcClassic -> INFO 1914 New consensus request received
14:53:49.500 [rest] ServerOpenchainREST -> INFO 1915 Successfuly invoked chainCode with txuuid (f60dc52d-b05c-4a91-91d2-42862240f3f1)
14:53:49.500 [consensus/obcpbft] pbftCore -> INFO 1916 Replica 0 executing/committing request for view=0/seqNo=1580 and digest gP7349B9SpAWtVTKVEJUMo8H1zPiGw8KmJZiPY/r0+lts0JyLRAqubeP5s3IlFmSzd7GXjO1TEZ2qsYVaTGkSQ==

And on the replicating validators, they are printing out a lot of

14:54:31.683 [consensus/obcpbft] pbftCore -> INFO 278b Replica 2 executing/committing request for view=0/seqNo=9889 and digest XX5X2ymalfqAmxkX9Vpl6LTPUoojesB6Ja90WSvFzumvmaiZcw3F4GQBUD1Xg+ajbyDBoB3KsvBlGgM5g5uySA==

After all the requests are done, all the validators print out some messages about sending/receiving view-change. Now I send in a second batch of 10K invoke requests and this is when things look a bit strange. On the primary validator, it prints out a lot of

14:56:44.384 [rest] ServerOpenchainREST -> INFO 111d2 REST invoking chaincode...
14:56:44.387 [consensus/obcpbft] obcClassic -> INFO 111d3 New consensus request received
14:56:44.387 [rest] ServerOpenchainREST -> INFO 111d4 Successfuly invoked chainCode with txuuid (9d2231cd-46e3-4d74-8360-91c2501b840b)

Note that it's not printing the "Replica 0 executing/committing request" messages anymore. And on all the replicating validators, besides a few received view-change messages, they are not printing the "Replica X executing/committing request" messages either. It looks as if the consensus protocol is not being run; only the primary validator is handling the invoke requests.

jyellick commented 8 years ago

It looks like this might be related to #699, if it is convenient for you to post the full logs, I could take a closer look to try to confirm.

gongsu832 commented 8 years ago

@jyellick here are the logs for all the validators. I did 3 batches of 10K invoke requests. In the vp0 log, the first started at 14:53:41.545, the second started at 14:55:59.957, and the third started at 15:24:40.573. You can see that only the first batch prints "Replica X executing/committing request ...". obc-peer.log.zip

tuand27613 commented 8 years ago

@jyellick , #803 sounds similar ?

corecode commented 8 years ago

Preliminary analysis

Actionables

Points to investigate

corecode commented 8 years ago

@gongsu832 can you please always update to latest master? I see that the commit you retried is 20 days old, and many bugfixes have landed since then.

gongsu832 commented 8 years ago

Trying to repeat the tests after updating to commit tag ...

The first thing to report is that I'm seeing the same error messages flooding vp1-vp3 as reported in #819.

The good news is that the invoke test no longer breaks obcpbft. Each batch of 10000 invokes results in action on all validators. The only thing I noticed is that occasionally you see a warning message like the following:

12:10:04.218 [consensus/obcpbft] pbftCore -> WARN 70f07 Replica 0 found view-change message incorrect

It happens on all 4 validators.

Now the bad news is that query no longer works. It gets a "ledger: resource not found" error.

12:44:23.678 [crypto] main -> INFO 001 Log level recognized 'info', set to INFO
12:44:23.685 [main] serve -> INFO 003 Security enabled status: false
12:44:23.685 [eventhub_producer] eventProcessor -> INFO 002 event processor started
12:44:23.685 [main] serve -> INFO 004 Privacy enabled status: false
12:44:23.717 [buckettree] initConfig -> INFO 005 configs passed during initialization = map[string]interface {}{"numBuckets":10009, "maxGroupingAtEachLevel":10}
12:44:23.717 [buckettree] initConfig -> INFO 006 Initializing bucket tree state implemetation with configurations &{maxGroupingAtEachLevel:10 lowestLevel:5 levelToNumBucketsMap:map[0:1 1:2 2:11 3:101 4:1001 5:10009] hashFunc:0x807818e0}
12:44:23.720 [chaincode] NewChaincodeSupport -> INFO 007 Chaincode support using peerAddress: 0.0.0.0:30303
12:44:23.721 [rest] StartOpenchainRESTServer -> INFO 008 Initializing the REST service...
12:44:23.724 [main] serve -> INFO 009 Starting peer with id=name:"vp0" , network id=dev, address=0.0.0.0:30303, discovery.rootnode=, validator=true
12:44:23.724 [genesis] $nested0 -> INFO 00a Creating genesis block.
12:44:23.725 [genesis] $nested0 -> INFO 00b No genesis block chaincodes defined.
12:44:31.315 [consensus/obcpbft] newPbftCore -> INFO 00c PBFT type = *obcpbft.obcClassic
12:44:31.315 [consensus/obcpbft] newPbftCore -> INFO 00d PBFT Max number of validating peers (N) = 4
12:44:31.315 [consensus/obcpbft] newPbftCore -> INFO 00e PBFT Max number of failing peers (f) = 1
12:44:31.315 [consensus/obcpbft] newPbftCore -> INFO 00f PBFT byzantine flag = false
12:44:31.315 [consensus/obcpbft] newPbftCore -> INFO 010 PBFT request timeout = 2s
12:44:31.315 [consensus/obcpbft] newPbftCore -> INFO 011 PBFT view change timeout = 2s
12:44:31.315 [consensus/obcpbft] newPbftCore -> INFO 012 PBFT Checkpoint period (K) = 10
12:44:31.316 [consensus/obcpbft] newPbftCore -> INFO 013 PBFT Log multiplier = 2
12:44:31.316 [consensus/obcpbft] newPbftCore -> INFO 014 PBFT log size (L) = 20
12:44:54.850 [rest] ServerOpenchainREST -> INFO 015 REST deploying chaincode...
12:44:54.855 [container] writeGopathSrc -> INFO 016 rootDirectory = /opt/openchain/src
12:45:02.359 [consensus/obcpbft] obcClassic -> INFO 017 New consensus request received
12:45:05.077 [rest] ServerOpenchainREST -> INFO 018 Successfuly deployed chainCode: bb540edfc1ee2ac0f5e2ec6000677f4cd1c6728046d5e32dede7fea11a42f86a6943b76a8f9154f4792032551ed320871ff7b7076047e4184292e01e3421889c.
12:45:05.656 [consensus/obcpbft] pbftCore -> INFO 019 Replica 0 received view-change from replica 2, v:1, h:0, |C|:1, |P|:0, |Q|:1
12:45:05.932 [consensus/obcpbft] pbftCore -> INFO 01a Replica 0 received view-change from replica 3, v:1, h:0, |C|:1, |P|:0, |Q|:1
12:45:05.932 [consensus/obcpbft] pbftCore -> INFO 01b Replica 0 received f+1 view-change messages, triggering view-change to view 1
12:45:05.932 [consensus/obcpbft] pbftCore -> INFO 01c Replica 0 sending view-change, v:1, h:0, |C|:1, |P|:1, |Q|:1
12:45:05.932 [consensus/obcpbft] pbftCore -> INFO 01d Replica 0 received view-change from replica 0, v:1, h:0, |C|:1, |P|:1, |Q|:1
12:45:06.263 [consensus/obcpbft] pbftCore -> INFO 01e Replica 0 received view-change from replica 1, v:1, h:0, |C|:1, |P|:0, |Q|:1
12:45:07.932 [consensus/obcpbft] pbftCore -> INFO 01f Replica 0 view change timer expired, sending view change
12:45:07.933 [consensus/obcpbft] pbftCore -> INFO 020 Replica 0 sending view-change, v:2, h:0, |C|:1, |P|:1, |Q|:1
12:45:07.933 [consensus/obcpbft] pbftCore -> INFO 021 Replica 0 received view-change from replica 0, v:2, h:0, |C|:1, |P|:1, |Q|:1
12:46:13.881 [rest] ServerOpenchainREST -> INFO 022 REST querying chaincode...
12:46:13.885 [rest] ServerOpenchainREST -> ERRO 023 {"Error": "Querying Chaincode -- Error:Failed to launch chaincode spec(Could not get deployment transaction for bb540edfc1ee2ac0f5e2ec6000677f4cd1c6728046d5e32dede7fea11a42f86a6943b76a8f9154f4792032551ed320871ff7b7076047e4184292e01e3421889c - ledger: resource not found)"}

I used INFO instead of DEBUG log level, because the periodic messages at DEBUG level make it hard to see things. If you need a DEBUG-level log, let me know.

gongsu832 commented 8 years ago

Sorry meant to say commit tag 3ef73d319b01fe84c0c88973350a240fc490d82e above.

tuand27613 commented 8 years ago

@gongsu832 , now that PR #833 is done, can you re-run ?

gongsu832 commented 8 years ago

OK updated to latest commit tag f08ccce33b13f23eb1af97a43bf654aa48129794 and reran the test.

4 nodes, consensus=obcpbft, security=disabled.

After starting obc-peer on all 4 nodes, vp1-3 logs were being polluted by the message:

16:06:04.721 [peer] PeerImpl -> ERRO 1e9 Error handling message: Peer FSM failed while handling message (DISC_HELLO): current state: created, error: transition canceled with error: Error registering Handler: Duplicate Handler error: {name:"vp1" 0.0.0.0:30303 VALIDATOR }

vp0 didn't have this problem.

I deployed chaincode_example02, then sent in 10K invoke requests (all to vp0). It appears that only vp0 prints messages indicating it's executing the requests:

16:04:47.593 [rest] ServerOpenchainREST -> INFO 036 REST invoking chaincode...
16:04:47.598 [consensus/obcpbft] obcClassic -> INFO 037 New consensus request received
16:04:47.598 [rest] ServerOpenchainREST -> INFO 038 Successfuly invoked chainCode with txuuid (6712b9ec-510d-428b-8fdf-7309b182ae7f)

None of vp1-3 printed out any messages indicating they are executing the request. All of them just periodically print out some view-change messages like the following:

16:05:35.640 [consensus/obcpbft] pbftCore -> INFO 19c Replica 1 received view-change from replica 0, v:26, h:0, |C|:1, |P|:1, |Q|:1
16:05:35.651 [consensus/obcpbft] pbftCore -> INFO 19d Replica 1 view change timer expired, sending view change
16:05:35.652 [consensus/obcpbft] pbftCore -> INFO 19e Replica 1 sending view-change, v:26, h:0, |C|:1, |P|:0, |Q|:1
16:05:35.652 [consensus/obcpbft] pbftCore -> INFO 19f Replica 1 received view-change from replica 1, v:26, h:0, |C|:1, |P|:0, |Q|:1
16:05:35.652 [consensus/obcpbft] pbftCore -> INFO 1a0 Replica 1 received view-change from replica 1, v:26, h:0, |C|:1, |P|:0, |Q|:1
16:05:35.652 [consensus/obcpbft] pbftCore -> WARN 1a1 Replica 1 found view-change message incorrect

Log files for all nodes attached.

obc-peer.log.zip

tuand27613 commented 8 years ago

@gongsu832 , do your deploy/invoke/query work and do you get correct values ?

vp0 is the only one printing "New consensus request received" and "Successfuly invoked ..." because it's the peer receiving the REST requests. The others are processing consensus messages, which don't show up unless debug logging is enabled.

gongsu832 commented 8 years ago

@tuand27613 OK, with log level set to debug, I see the replicas printing out messages for each invoke request like the following:

17:10:58.479 [consensus/obcpbft] pbftCore -> DEBU 31ee Replica 1 received request: Z0Hu3A1S8mUENmMLkCzk7d1RGo5rw/djLME/zEGuzYjfLsGZptC/YqbqkocknTkRtOaNW+mpOXw8d+8TAKFhUg==

Query still doesn't work. I'm getting the same "ledger: resource not found" message I reported above 14 days ago.

Since query doesn't work, I can't really tell whether invoke did the right thing or not but the invoke REST call is successful.

tuand27613 commented 8 years ago

The Failed to launch chaincode spec(Could not get deployment transaction for ...... - ledger: resource not found) error says that we couldn't find the transaction based on its UUID ( @srderson, let me know if I'm all wet )

So @gongsu832 , could you append the debug logs again ? Also, what does your query call look like ?

gongsu832 commented 8 years ago

Here is the log with level=DEBUG. Note that the query request is close to the end after 10K invoke requests. Just search for "Failed". vp0-obc-peer.log.zip

And here is the query:

{
  "chaincodeSpec":{
      "type": "GOLANG",
      "chaincodeID":{
          "name":"bb540edfc1ee2ac0f5e2ec6000677f4cd1c6728046d5e32dede7fea11a42f86a6943b76a8f9154f4792032551ed320871ff7b7076047e4184292e01e3421889c"
      },
      "ctorMsg":{
          "function":"query",
          "args":["b"]
      }
  }
}
jyellick commented 8 years ago

@gongsu832 Thank you for the additional logs, it's now quite clear what the problem is.

Essentially, the many invokes are clogging the message queue, causing PBFT to become very slow, which triggers a view change. Before the view change can complete, the request timer fires again and causes yet another view change. Each time this occurs, the view change timer doubles. By the time the request flood is through, the view change is set to expire in 2^32 seconds, or essentially never.

This suggests two actionable fixes on the consensus/PBFT side. First, the request timer should not start when the current view is not active; this is a trivial one-line change. Second, the view change timeout should be capped at some more reasonable, configurable value, say one minute. This should also be a fairly direct change.

The combination of these changes should ensure that once the flood of requests has abated, consensus progress will continue. However, this is still a bit of a denial of service opportunity, and I'll likely need to talk with @jeffgarratt and others to see if there's a better way we can proceed to better tolerate floods of client activity.
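The runaway-timer arithmetic described above can be sketched with a toy model (not fabric code; the 2-second base comes from the "PBFT view change timeout = 2s" startup log line, and the 60-second cap is the proposed configurable value):

```python
BASE_TIMEOUT = 2.0  # seconds, per the startup log
CAP = 60.0          # the proposed configurable cap

# Without a cap: each failed view change doubles the timer.
timeout = BASE_TIMEOUT
doublings = 0
while timeout < 2 ** 32:
    timeout *= 2
    doublings += 1
# ~31 doublings take a 2s timer past 2^32 seconds -- effectively never.

# With the proposed cap, the timer stays bounded no matter how many
# view changes fail in a row.
capped = BASE_TIMEOUT
for _ in range(doublings):
    capped = min(capped * 2, CAP)
```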

gongsu832 commented 8 years ago

@jyellick Thanks for looking into the problem and finding the potential cause. However, even without the 10K invoke requests, a query request after successful deploy also fails the same way, just like what I reported above 15 days ago. Here is the log with level=DEBUG (I tried query twice, both failed). vp0-obc-peer.log.zip

jyellick commented 8 years ago

@gongsu832 Yes, I see. Looks like we are dealing with multiple bugs here, as the other is definitely a problem as well. Could you collect DEBUG logs for all 4 validating peers? From your latest set of logs, I see missing replies to vp0 from the other vps which need to be investigated.

gongsu832 commented 8 years ago

@jyellick Here they are. obc-peer.log.zip

gongsu832 commented 8 years ago

@jyellick Do you also need DEBUG logs for all 4 peers for the query after 10K invoke case? If so I'll redo the tests.

jyellick commented 8 years ago

@gongsu832 The ones you sent should be just fine, thanks

corecode commented 8 years ago

@jyellick could you explain the message queue clogging issue in more detail?

jyellick commented 8 years ago

@gongsu832 Based on these new logs, it seems that there is network connectivity between vp0 and vp{1,2,3}, but no connectivity which does not include vp0. This is unrelated to the consensus, but causes consensus to never be achieved (but in a very odd way which causes many view changes which makes it appear to be consensus related). It seems unlikely to me that this is a firewall issue or some such, but could you make sure that there is network connectivity between some pair of vps, not including vp0? (So, for instance, can vp1 establish an ssh session to vp2?)

@jeffgarratt You can see the full logs above, but here is the most interesting bit to me:

[yellickj@jmobile logs]$  grep 'bi-directional' * | sed -e 's/address.*event/.../'
vp0-obc-peer.log:13:19:34.030 [peer] Handler -> DEBU 02f The Peer's bi-directional stream to ID:<name:"vp1" > ... DISC_HELLO
vp0-obc-peer.log:13:19:38.383 [peer] Handler -> DEBU 039 The Peer's bi-directional stream to ID:<name:"vp2" > ... DISC_HELLO
vp0-obc-peer.log:13:19:42.738 [peer] Handler -> DEBU 04d The Peer's bi-directional stream to ID:<name:"vp3" > ... DISC_HELLO
vp1-obc-peer.log:13:19:34.047 [peer] Handler -> DEBU 02f The Peer's bi-directional stream to ID:<name:"vp0" > ... DISC_HELLO
vp1-obc-peer.log:13:19:40.061 [peer] Handler -> DEBU 046 The Peer's bi-directional stream to ID:<name:"vp1" > ... DISC_HELLO
vp2-obc-peer.log:13:19:38.429 [peer] Handler -> DEBU 02f The Peer's bi-directional stream to ID:<name:"vp0" > ... DISC_HELLO
vp2-obc-peer.log:13:19:44.444 [peer] Handler -> DEBU 050 The Peer's bi-directional stream to ID:<name:"vp2" > ... DISC_HELLO
vp3-obc-peer.log:13:19:42.758 [peer] Handler -> DEBU 02f The Peer's bi-directional stream to ID:<name:"vp0" > ... DISC_HELLO
vp3-obc-peer.log:13:19:48.771 [peer] Handler -> DEBU 047 The Peer's bi-directional stream to ID:<name:"vp3" > ... DISC_HELLO
[yellickj@jmobile logs]$ 

Each peer is able to connect to vp0 and itself, but to no-one else. I'm not entirely sure how to read the rest of the messages, but maybe you'll have some insight?

@corecode The queue clogging turns out to be more of a non-issue after some discussions with @jeffgarratt. Some of the deadlock issues we'd had with Sieve and state transfer led me to believe the queues were less independent than they are. Still, I think it is important that replicas do not initiate a view change from a request timeout if their view is not currently active; otherwise the view-timeout doubling scheme breaks down and can make the view change timer enormous. If a byzantine replica could cause a denial of service for, say, 60 seconds, successfully triggering 30 view changes via the 2-second request timer, then stop when the view made them the primary, that byzantine replica would have 2^30 seconds to complete the view change, which would be a problem.

gongsu832 commented 8 years ago

@jyellick vp0-3 are 4 z/VM guests. Every guest has connectivity to every other guest. I just tried and I could ssh from every guest to every other guest.

jyellick commented 8 years ago

@gongsu832 Thanks for checking; I thought it seemed unlikely, but it was worth ruling out. I think the below may be the solution.

In your openchain.yml file, could you ensure that peer.address, which is normally set to 0.0.0.0:30303, is set to the IP of the machine (plus the :30303)? If you set this, you should also set peer.addressAutoDetect=false.

Alternatively, you could simply set peer.addressAutoDetect=true, and this might also work.

Finally, you can always set these values via the environment, with OPENCHAIN_PEER_ADDRESS and OPENCHAIN_PEER_ADDRESSAUTODETECT.
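The environment-variable route might look like the sketch below when launching a peer from a script. The variable names come from the comment above; the IP is an RFC 5737 documentation placeholder (substitute each host's real address), and the launch command line is illustrative only:

```python
import os

# Build an environment with the overrides jyellick suggests.
env = dict(os.environ)
env["OPENCHAIN_PEER_ADDRESS"] = "192.0.2.10:30303"   # this host's real IP
env["OPENCHAIN_PEER_ADDRESSAUTODETECT"] = "false"

# Then launch the peer with this environment, e.g. (command is illustrative):
# subprocess.run(["obc-peer", "peer"], env=env)
```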

gongsu832 commented 8 years ago

@jyellick I never touch openchain.yaml and always use env vars to override settings I want to change.

I set OPENCHAIN_PEER_ADDRESS to host_ip:30303 and query now works. Unfortunately, I'm not getting correct values for "a" and "b" using chaincode_example02. After deploying, I send in alternating invoke and query requests, i.e., invoke, query, invoke, query, etc.

The chaincode is deployed with ["a", "100", "b", "200"]. Each invoke is called with ["a", "b", "10"], and each query queries ["a"]. Here are the results after 25 invoke/query pairs:

1.  invoke: success
    query: 100
2.  invoke: success
    query: 80
3.  invoke: success
    query: 80
4.  invoke: success
    query: 70
5.  invoke: success
    query: 60
6.  invoke: success
    query: 50
7.  invoke: success
    query: 40
8.  invoke: success
    query: 30
9.  invoke: success
    query: 10
10. invoke: success
    query: 10
11. invoke: success
    query: 0
12. invoke: success
    query: -10
13. invoke: success
    query: -20
14. invoke: success
    query: -30
15. invoke: success
    query: -40
16. invoke: success
    query: -50
17. invoke: success
    query: -60
18. invoke: success
    query: -60
19. invoke: success
    query: -80
20. invoke: success
    query: -90
21. invoke: success
    query: -90
22. invoke: success
    query: -90
23. invoke: success
    query: -90
24. invoke: success
    query: -90
25. invoke: success
    query: -90

It looks like invoke is async, since there are missing values like 90 and repeating values like 80. What's more troublesome, though, is that after "a" gets to -90, i.e., after being subtracted 19 times (100-19*10=-90), it stops decreasing (I tried 10K pairs of invoke/query and "a" also stops decreasing at -90). I looked at the source code for chaincode_example02 and it doesn't appear to have any check to stop "a" from decreasing.

Here are the DEBUG logs for all 4 peers. obc-peer.log.zip

Just for kicks, I deployed chaincode_example02 with ["a", "1000", "b", "2000"] and sent in the same 1000 invoke/query pairs. "a" stops decreasing at 810, i.e., after being subtracted 19 times (1000-19*10=810).

jyellick commented 8 years ago

@gongsu832 Yes, the chaincode invocations are asynchronous: the request is submitted to consensus to be ordered and executed, and if you are interested in the result, there are other mechanisms you will need to use. I think @muralisrini might be the right person to answer this question for you.

As regards this magic 19 times and failure, this appears to be because your blockchains are in an inconsistent state at startup. What is happening is that PBFT 'checkpoints' in order to garbage-collect internal state; a checkpoint means that enough replicas agree on the state of the blockchain. When the chains are not the same, no consensus can be achieved, and progress will stall once the log size (20 by default; here 1 deploy and 19 invokes) is exceeded. Presently, the PBFT implementation only supports starting from a clean state, i.e., no existing blockchain. Looking at your logs, I see that your blockchains are not all of the same length, and may not contain the same content. I suggest you delete the contents of your blockchains and try from a fresh genesis block; @srderson should be able to help you do this.
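The "magic 19" arithmetic can be sketched from the values in the startup log above (K=10, log multiplier=2), together with chaincode_example02's semantics of moving 10 from "a" to "b" per invoke:

```python
K = 10               # "PBFT Checkpoint period (K) = 10", from the startup log
LOG_MULTIPLIER = 2   # "PBFT Log multiplier = 2"
log_size = K * LOG_MULTIPLIER   # "PBFT log size (L) = 20"

# If checkpoints can never be agreed on (inconsistent chains at startup),
# progress stalls once the log fills: the deploy takes one slot, leaving
# room for 19 invokes.
executed_invokes = log_size - 1

# chaincode_example02 transfers 10 from "a" to "b" on each invoke:
a_final_from_100 = 100 - executed_invokes * 10    # matches the observed -90
a_final_from_1000 = 1000 - executed_invokes * 10  # matches the observed 810
```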

gongsu832 commented 8 years ago

@jyellick I'm using jmeter to see the performance of invoke and query. If invoke is async, that makes interpreting the throughput number a bit difficult: if jmeter tells me the invoke throughput is 200/sec, then it simply means the system is able to accept the requests and send back OK at 200/sec. We have no idea how long it actually took to execute those invoke requests; the effective throughput could be much lower than 200/sec.

The magic 19 times failure is due to my stupidity. I actually always start the nodes with clean states. But due to a bug in my script, I thought I was rm -rf /var/openchain/production/* but instead I was rm -rf /var/openchain/production/production/* DOH! Fixed my script and now the numbers appear to come out right. So that was good. Thanks.

jyellick commented 8 years ago

@gongsu832 Glad at least the executions are working better for you now.

The system was definitely designed with transaction submission being purposefully asynchronous. We are rapidly leaving my area of expertise (consensus), but I believe there is an eventing mechanism you can hook into. @muralisrini will need to weigh in here though.

tuand27613 commented 8 years ago

Also check with @corecode who's been through some of these measurement issues during his perf runs

gongsu832 commented 8 years ago

@jyellick yeah, thanks for helping identify the problems. The need to set peer.address to an IP address instead of 0.0.0.0 for query to work probably needs to be documented.

@tuand27613 thanks, I can talk with @muralisrini and @corecode offline about the async throughput issue, but it's not an urgent matter at the moment. The jmeter numbers at least give a ball-park estimate.

Next I'm going to turn on security+privacy and see what comes out. :-)

corecode commented 8 years ago

check out my perf branch for a commit that allows for closed loop testing.

gongsu832 commented 8 years ago

@corecode thanks. Can you be a bit more specific? I'm not well versed in git/github. Where do I find your branch? I need to patch the code in order for it to work on zLinux so I'll have to do the same with your branch.

By "closed loop testing", do you mean you have to do the test in a specific way? Do you have doc in your branch describing how to do the test? Thanks.

jyellick commented 8 years ago

@gongsu832 Can this issue be closed (as the original 'bug' is resolved)? If you'd like, open a new issue for your performance-testing questions; if you tag me, I will tag it as a consensus question and assign it.

gongsu832 commented 8 years ago

@jyellick Yes you can close it. I haven't heard from @corecode on where to get his branch and how to run the "closed loop testing" he mentioned. When I do and if I have issues, I'll open a new issue and tag you. Thanks again for the help.

jyellick commented 8 years ago

@gongsu832 You can find @corecode's branches here

If I had to make an educated guess, I'd think you might want the branch closed-loop-rest, but @corecode is definitely the person to answer you here.