Open sachikoy opened 8 years ago
@cca88 As you said, "We understand the need to synchronize all nodes after more time passes, but this is not yet implemented." But when do all nodes synchronize? I am very confused about how and when the nodes get synchronized. I did the same thing as @sachikoy. At first the data was not consistent. After I stopped peer2 or peer3 a few times, deployed and invoked chaincode a few times, and then restarted the peers, all peers became consistent. Here is the log:
07:20:52.442 [peer] handleChat -> ERRO 082 Error handling message: Peer FSM failed while handling message (DISC_HELLO): current state: created, error: transition canceled with error: Error registering Handler: Duplicate Handler error: {name:"vp1" 172.0.0.55:30303 VALIDATOR }
07:21:12.117 [consensus/util] RegisterChannel -> WARN 083 Received duplicate connection from <nil>, switching to new connection
07:22:34.028 [consensus/obcpbft] recvViewChange -> INFO 084 Replica 3 received view-change from replica 1, v:1, h:10, |C|:1, |P|:6, |Q|:6
07:22:34.108 [consensus/obcpbft] recvViewChange -> INFO 085 Replica 3 received view-change from replica 4, v:1, h:10, |C|:1, |P|:6, |Q|:6
07:22:36.028 [consensus/obcpbft] processEvent -> INFO 086 Replica 3 view change timer expired, sending view change
07:22:36.028 [consensus/obcpbft] sendViewChange -> INFO 087 Replica 3 sending view-change, v:2, h:0, |C|:1, |P|:15, |Q|:16
07:22:36.029 [consensus/obcpbft] recvViewChange -> INFO 088 Replica 3 received view-change from replica 3, v:2, h:0, |C|:1, |P|:15, |Q|:16
07:22:36.029 [consensus/obcpbft] loop -> WARN 089 Attempting to stop an unfired idle timer
07:22:36.029 [consensus/obcpbft] recvViewChange -> INFO 08a Replica 3 received view-change from replica 0, v:2, h:0, |C|:2, |P|:16, |Q|:17
07:22:36.030 [consensus/obcpbft] recvViewChange -> INFO 08b Replica 3 received view-change from replica 2, v:2, h:0, |C|:1, |P|:15, |Q|:16
07:22:36.032 [consensus/obcpbft] recvViewChange -> INFO 08c Replica 3 received view-change from replica 1, v:2, h:10, |C|:1, |P|:6, |Q|:6
07:22:36.033 [consensus/obcpbft] recvViewChange -> INFO 08d Replica 3 received view-change from replica 4, v:2, h:10, |C|:1, |P|:6, |Q|:6
07:22:36.038 [consensus/obcpbft] recvNewView -> INFO 08e Replica 3 received new-view 2
07:22:36.039 [consensus/obcpbft] processNewView -> WARN 08f Replica 3 missing base checkpoint 10 (CAsSQO3LcF3sA1A+77mumM+7mvhGY6nHNVILPmp6P4PARCIK+clPn9K3Pi/lQ1tAUTkXY56YplcIuW6SZPiIVHywU6AaQNYhUlbq9E31Ot4mGTdb06KEgTViJ9/ZFONpBC4fElO/YTEEndAhYEctz/+MMC/HdUHzF+qkf0PIyAXIdmNSXPo=)
07:22:36.040 [consensus/handler] SkipTo -> WARN 090 State transfer is being called for, but the state has not been invalidated
07:22:36.040 [consensus/obcpbft] processNewView2 -> INFO 091 Replica 3 accepting new-view to view 2
07:22:36.341 [consensus/obcpbft] loop -> WARN 092 Attempting to stop an unfired idle timer
07:22:36.342 [consensus/obcpbft] loop -> WARN 093 Attempting to stop an unfired idle timer
07:22:36.343 [consensus/obcpbft] loop -> WARN 094 Attempting to stop an unfired idle timer
07:22:36.345 [consensus/obcpbft] loop -> WARN 095 Attempting to stop an unfired idle timer
07:22:36.346 [consensus/obcpbft] loop -> WARN 096 Attempting to stop an unfired idle timer
07:22:36.348 [consensus/obcpbft] loop -> WARN 097 Attempting to stop an unfired idle timer
07:22:36.349 [consensus/obcpbft] loop -> WARN 098 Attempting to stop an unfired idle timer
07:22:36.350 [consensus/obcpbft] loop -> WARN 099 Attempting to stop an unfired idle timer
07:22:37.554 [consensus/obcpbft] processEvent -> INFO 09a Replica 3 application caught up via state transfer, lastExec now 10
07:22:37.555 [consensus/obcpbft] executeOne -> INFO 09b Replica 3 executing/committing request for view=2/seqNo=11 and digest VNgkd+K5LG9ZXm4SowbIfSky4irYa7gvRqy7Uv0YOPsCsTfQRSsCOWBV9gn46RR6RH8lmY2CUeCH0JW9dD/ECg==
07:22:55.341 [consensus/obcpbft] execDoneSync -> INFO 09c Replica 3 finished execution 11, trying next
07:22:55.342 [consensus/obcpbft] executeOne -> INFO 09d Replica 3 executing/committing request for view=2/seqNo=12 and digest wjOnFNHPJElPfUnULjFwotEvZ103D5L9mkSe0kNPjHxXUEQ3sqFwjZMISKDJAJYH4tiz4eg168xXoOvXRbOxzw==
07:22:55.346 [consensus/obcpbft] execDoneSync -> INFO 09e Replica 3 finished execution 12, trying next
07:22:55.347 [consensus/obcpbft] executeOne -> INFO 09f Replica 3 executing/committing request for view=2/seqNo=13 and digest 9LZq13v9biLJgEGNVb9yrHCh8NBw3n6KqR4giL66sxq4jqjr1MVXOKZSGS6fioUNqmU7hODkTtWCNsDev6k/xA==
07:22:55.888 [consensus/obcpbft] execDoneSync -> INFO 0a0 Replica 3 finished execution 13, trying next
07:22:55.888 [consensus/obcpbft] executeOne -> INFO 0a1 Replica 3 executing/committing request for view=2/seqNo=14 and digest GCK9aJ7UgbH4Fq0aF25NRszCZ8JDFEsjrnis+KOh65FMSS3GxexOu0fVXbZrN4InqPkPl9FTsQpBLtSW9uoX3w==
07:23:07.285 [consensus/obcpbft] execDoneSync -> INFO 0a2 Replica 3 finished execution 14, trying next
07:23:07.285 [consensus/obcpbft] executeOne -> INFO 0a3 Replica 3 executing/committing request for view=2/seqNo=15 and digest kqP4gKcNvepiStEF5bJa8M4/7jqj/Gc4OuWY1IQYOi5ACDdKTchjcmR3xEFHpC+3rbUb3v0ziALKo3yVVYDJFA==
07:23:26.229 [consensus/obcpbft] execDoneSync -> INFO 0a4 Replica 3 finished execution 15, trying next
07:23:26.229 [consensus/obcpbft] executeOne -> INFO 0a5 Replica 3 executing/committing request for view=2/seqNo=16 and digest CpcJQd1nxUMunUAKytSbryWKOcqXXuK552qyM9Y6rI1EvXXhptMPEJd1BXrfRqEcJsjE8+seZM/MB8UDNOFXrw==
07:23:27.711 [consensus/obcpbft] execDoneSync -> INFO 0a6 Replica 3 finished execution 16, trying next
07:23:27.711 [consensus/obcpbft] executeOne -> INFO 0a7 Replica 3 executing/committing request for view=2/seqNo=17 and digest em1uHEGa5nS+1mamx89vMRIsmu2RM+b3kEqZhJqtIkcuB/JE5YDEI3ZzdHFixuxO5vXa1SxfnQh2ccCNrleUBg==
07:23:28.565 [consensus/obcpbft] execDoneSync -> INFO 0a8 Replica 3 finished execution 17, trying next
07:23:28.568 [consensus/obcpbft] executeOne -> INFO 0a9 Replica 3 executing/committing request for view=2/seqNo=18 and digest ptvCPHdlSxwbhE9urbjVU3ny0iXT45B/tk7BdJQkeAi9k8IhjV8GmExJCJHYLOtpsvfRL6UJuEJLkc/hX+8nDQ==
07:23:28.577 [consensus/obcpbft] execDoneSync -> INFO 0aa Replica 3 finished execution 18, trying next
I did exactly the same thing again, but the peers do not synchronize this time. I have only managed to synchronize all peers that one time.
@cophey Please clearly explain your expectations and what you observe.
@corecode I am sorry that my description is not clear.
Here are the precise steps I did:
CORE_PEER_VALIDATOR_CONSENSUS_PLUGIN=pbft and CORE_PBFT_GENERAL_N=5
I don't know why all the peers ended up consistent. I repeated the above steps, but the result was different: the peers did not become consistent at all. This confuses me. When does a peer catch up?
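One way to tell whether a restarted peer has caught up is to compare the chain height each validating peer reports. The sketch below is only illustrative: it assumes the 0.5/0.6-era REST API (GET /chain returning a JSON document with a height field) is enabled and listening on port 5000, and that jq is installed; the peer IPs are taken from the logs above, and the port is an assumption you should adjust to your own deployment.

# Compare the reported blockchain height across validating peers.
# Assumption: REST API enabled on port 5000 (set via CORE_REST_ADDRESS).
for host in 172.0.0.55 172.0.0.52 172.0.0.43; do
  echo -n "$host height: "
  curl -s http://$host:5000/chain | jq .height
done

If one peer keeps reporting a lower height than the others while new transactions are being committed, it has not caught up yet; the "application caught up via state transfer" line in the log above marks the point where it finally did.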
In addition, during these steps the logs show the following warnings:
01:01:53.606 [consensus/util] RegisterChannel -> INFO 01e Registering connection from <nil>
01:01:56.328 [consensus/util] RegisterChannel -> WARN 01f Received duplicate connection from <nil>, switching to new connection
01:02:19.612 [consensus/util] RegisterChannel -> WARN 020 Received duplicate connection from <nil>, switching to new connection
01:02:20.525 [consensus/util] RegisterChannel -> WARN 021 Received duplicate connection from <nil>, switching to new connection
01:02:20.541 [peer] handleChat -> ERRO 022 Error handling message: Peer FSM failed while handling message (DISC_HELLO): current state: created, error: transition canceled with error: Error registering Handler: Duplicate Handler error: {name:"vp3" 172.0.0.52:30303 VALIDATOR }
01:03:39.616 [consensus/util] RegisterChannel -> WARN 023 Received duplicate connection from <nil>, switching to new connection
01:03:40.521 [consensus/util] RegisterChannel -> WARN 024 Received duplicate connection from <nil>, switching to new connection
01:03:40.522 [peer] handleChat -> ERRO 025 Error handling message: Peer FSM failed while handling message (DISC_HELLO): current state: created, error: transition canceled with error: Error registering Handler: Duplicate Handler error: {name:"vp4" 172.0.0.43:30303 VALIDATOR }
@cophey please attach a zip file with the logs from all peers. In step 4, can you say how many transactions you invoke and approximately how much time you spend in that step?
@kchristidis I am sorry, but I cannot reproduce it. I only synchronized all peers successfully that one time and have no logs from it. Sorry about that.
In step 4, I invoke about 6 times. It takes about one or two minutes. I am sorry that I do not remember it very clearly.
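For reference, an invoke loop of roughly that shape against the 0.5/0.6-era fabric CLI might look like the sketch below. This is not the command cophey ran: the chaincode name "mycc" and the Function/Args payload are placeholders, the peer address is reused from the logs above, and the exact flags may differ between releases.

# Hypothetical invoke loop against one validating peer.
# "mycc" and the constructor payload are placeholders, not from this thread.
export CORE_PEER_ADDRESS=172.0.0.55:30303
for i in 1 2 3 4 5 6; do
  peer chaincode invoke -n mycc -c '{"Function":"invoke","Args":["a","b","1"]}'
done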
Anyway, what I want to do is:
@sachikoy this thread has gone silent... is it still an issue? Do we have tests that can assure expected behavior?
I am testing in an environment with 4 VPs (vp0, ..., vp3) with PBFT batch and security enabled. When I stop one peer and restart it, the peer's state becomes inconsistent with the other peers, yet all peers keep reaching consensus.
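For context, a setup like this is typically driven by per-peer environment variables along the lines of the sketch below. Only CORE_PEER_VALIDATOR_CONSENSUS_PLUGIN and CORE_PBFT_GENERAL_N appear verbatim in this thread; CORE_PBFT_GENERAL_MODE and CORE_SECURITY_ENABLED are my assumptions for "PBFT batch" and "security enabled", and the membersrvc/enrollment settings are omitted.

# Rough per-peer settings for a 4-VP PBFT batch network with security on.
# Variable names other than the two quoted in this thread are assumptions.
export CORE_PEER_VALIDATOR_CONSENSUS_PLUGIN=pbft
export CORE_PBFT_GENERAL_MODE=batch
export CORE_PBFT_GENERAL_N=4
export CORE_SECURITY_ENABLED=true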
Here are the precise steps I did: