Closed: fshutdown closed this issue 5 years ago
I'm using the following setup:
C# node (with staking enabled, connected to the network) -- QT node (connected only to C# node)
During IBD (initial block download) the QT node stopped advancing.
So the problem is reproducible; I will be looking into why this is happening.
I was able to reproduce the bug several times, but it never led to the QT node being stuck for longer than 3-5 minutes.
When QT syncs from the C# node it can occasionally drop the connection with the following error:
Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host. ---> System.Net.Sockets.SocketException: An existing connection was forcibly closed by the remote host
After that it reconnects, and it takes some time to start syncing again.
I've also checked the logs provided by @maciejzaleski and didn't find a disconnection there, even though a disconnection was the precursor to the situation I reproduced, which I find somewhat problematic.
However, there might be a problem in the setup that could have caused the bug: bidirectional connections.
HA:
Peer:[::ffff:192.168.98.101]:37021, connected:inbound, height:1241, agent:StratisBitcoin:1.2.5
Peer:[::ffff:192.168.98.101]:16178, connected:outbound, height:1241, agent:StratisBitcoin:1.2.5
Peer:[::ffff:192.168.98.171]:43076, connected:inbound, height:0, agent:/Stratis:2.0.0.5/
HBX:
receive version message: version 70012, blocks=0, us=192.168.98.171:16178, them=127.0.0.1:16178, peer=192.168.98.174:40377
Added time data, samples 2, offset +0 (+0 minutes)
keypool reserve 1
keypool return 1
keypool reserve 1
keypool return 1
receive version message: version 70012, blocks=0, us=192.168.98.171:16178, them=127.0.0.1:16178, peer=192.168.98.174:39649
keypool reserve 1
keypool return 1
keypool reserve 1
keypool return 1
keypool reserve 1
keypool return 1
Adding fixed seed nodes as DNS doesn't seem to be available.
keypool reserve 1
keypool return 1
keypool reserve 1
keypool return 1
keypool reserve 1
keypool return 1
receive version message: version 70012, blocks=0, us=192.168.98.171:16178, them=127.0.0.1:16178, peer=192.168.98.174:38507
keypool reserve 1
keypool return 1
keypool reserve 1
keypool return 1
keypool reserve 1
keypool return 1
keypool reserve 1
keypool return 1
receive version message: version 70012, blocks=0, us=192.168.98.171:43076, them=127.0.0.1:16178, peer=192.168.98.170:16178
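The bidirectional connection suspected above is visible in the HA peer list, where 192.168.98.101 appears both as an inbound and as an outbound peer. A minimal Python sketch for spotting this in a peer listing follows. It assumes the line format shown in the HA output above; the function name and the regular expression are illustrative and not part of either node.

```python
import re
from collections import defaultdict

# Matches peer lines in the format pasted above, for example:
# Peer:[::ffff:192.168.98.101]:37021, connected:inbound, height:1241, agent:...
# The pattern is an assumption based on that sample output only.
PEER_RE = re.compile(
    r"Peer:\[(?P<addr>[^\]]+)\]:(?P<port>\d+),\s*"
    r"connected:(?P<direction>inbound|outbound)"
)


def find_bidirectional_peers(lines):
    """Return the sorted addresses that appear as both inbound and outbound."""
    directions = defaultdict(set)
    for line in lines:
        match = PEER_RE.search(line)
        if match:
            directions[match.group("addr")].add(match.group("direction"))
    return sorted(
        addr for addr, seen in directions.items()
        if seen == {"inbound", "outbound"}
    )


ha_peers = [
    "Peer:[::ffff:192.168.98.101]:37021, connected:inbound, height:1241, agent:StratisBitcoin:1.2.5",
    "Peer:[::ffff:192.168.98.101]:16178, connected:outbound, height:1241, agent:StratisBitcoin:1.2.5",
    "Peer:[::ffff:192.168.98.171]:43076, connected:inbound, height:0, agent:/Stratis:2.0.0.5/",
]

print(find_bidirectional_peers(ha_peers))  # ['::ffff:192.168.98.101']
```

Running the same check on every node's peer list before a retest would be one way to confirm that no bidirectional connections remain.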
@maciejzaleski could you please run the network on the latest master, ensuring that there are no bidirectional connections, and check whether the bug still appears?
Network State
SBFN version
58d3d1dd (01 Dec @ 13:54); Patches: Kevin
*** Build date: 01/12/2018 19:56:38.64
*** Recent commits
* 58d3d1dd - (HEAD -> master, origin/master, origin/HEAD) Fix consecutive header bug (#2865) (Sat, 1 Dec 2018 13:54:20 +0000) <Francois de la Rouviere>
* 700c871a - Correctly store and load rewind data index, (#2875) (Fri, 30 Nov 2018 23:46:54 +0000) <Dan Gershony>
* c63e0c36 - Fix for "PHBS can serve reorged headers" (#2876) (Fri, 30 Nov 2018 18:13:46 +0300) <noescape0>
* 4d05de89 - Update SeedData (#2872) (Fri, 30 Nov 2018 10:04:03 +0000) <StratisIain>
* 08dc1140 - (origin/sc/v0.12.0-beta) nameof variables fix (#2868) (Thu, 29 Nov 2018 16:48:10 +0000) <Fazz>
* 0403c3cf - Inbound/outbound peer connection count (#2866) (Thu, 29 Nov 2018 15:49:19 +0000) <Fazz>
* a981244c - (R/C) tip height, refactoring, IBD status (#2856) (Thu, 29 Nov 2018 13:45:13 +0000) <Fazz>
* a1f70708 - Fix two tests (#2859) (Thu, 29 Nov 2018 10:32:25 +0000) <Francois de la Rouviere>
* 4c308677 - Add default connection params to SC networks (#2858) (Thu, 29 Nov 2018 17:45:09 +1100) <Rowan de Haas>
* 914e2549 - [ProvenHeaders] Set PHBS tip to ChainTip if its ahead (#2853) (Wed, 28 Nov 2018 18:56:18 +0000) <Francois de la Rouviere>
******************
Based on info received from @maciejzaleski I'm now trying another setup to reproduce the bug:
NETWORK --> C# A (fully synced) --> C# B --> QT
C# B and the QT node are syncing from scratch. C# B has IBD always disabled.
Expected outcome: QT node gets stuck and can't sync fully.
This one is fixed by #2893
@fassadlr why reopened?
@noescape00 we can't close it until the testers have retested it... It is currently in the Re-Test column :)
Fair point 👍
Retest failed, logs: https://stratisplatform-my.sharepoint.com/:u:/p/maciej_zaleski/Ee6crzATL1BLm0ZMeZV6IywBVsRW-qLpnmh9FIRlnMiO2w?e=OUXp7V
Code version
9535e61e (06 Dec @ 15:32); Patches: Kevin
*** Build date: 06/12/2018 12:34:06.68
*** Recent commits
* 9535e61e - (HEAD -> master, origin/master, origin/HEAD) syncing speedup (#2901) (Thu, 6 Dec 2018 15:32:25 +0300) <noescape0>
* 63b750a3 - Fix DoS vector (#2904) (Thu, 6 Dec 2018 12:53:36 +0300) <noescape0>
* 3719dab9 - Set the transaction fee to the network defined minimum fee (#2907) (Thu, 6 Dec 2018 08:05:27 +0000) <Jeremy Bokobza>
* e2af0c02 - (origin/sc/v0.13.0-beta) Add version to network folder name (#2915) (Thu, 6 Dec 2018 15:11:18 +1100) <Rowan de Haas>
* f4a8ec92 - Increment magic and nonce (#2914) (Thu, 6 Dec 2018 15:05:31 +1100) <Rowan de Haas>
* 0c313825 - Increment SC version (#2911) (Thu, 6 Dec 2018 12:20:01 +1100) <Rowan de Haas>
* 0f93ca37 - RuntimeObserver -> Stratis.SmartContracts.RuntimeObserver (#2912) (Thu, 6 Dec 2018 12:13:24 +1100) <Jordan Andrews>
* 5c8c08f9 - Powershell NuGet fix + version updates (#2900) (Thu, 6 Dec 2018 10:41:10 +1100) <Jordan Andrews>
* a6a8d2b7 - Added custom path for KeyTool (#2902) (Wed, 5 Dec 2018 17:05:43 +0000) <Jeremy Bokobza>
* 528337cc - Missing inputs error was not reporting correctly the error (#2906) (Wed, 5 Dec 2018 16:53:23 +0000) <Dan Gershony>
******************
You could probably reproduce that by syncing StratisX, shutting it down for a time that is shorter than the time we have in the IBD check method (around 100 blocks), and then starting StratisX again so that it can resync.
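The reproduction hint above depends on how long the node stays offline relative to the IBD threshold. The sketch below illustrates that relationship under loudly stated assumptions: the IBD check is modelled as a simple tip-age comparison, the threshold is taken as 100 blocks per the comment above, and the 64-second block interval is a placeholder value. None of the names come from the actual node code.

```python
from datetime import datetime, timedelta, timezone

# Assumptions for illustration only, not taken from the node's source:
TARGET_SPACING = timedelta(seconds=64)  # hypothetical block interval
MAX_TIP_AGE = 100 * TARGET_SPACING      # "around 100 blocks" from the comment


def is_initial_block_download(tip_time, now, max_tip_age=MAX_TIP_AGE):
    """Age-based IBD check: in IBD when the tip is older than max_tip_age."""
    return now - tip_time > max_tip_age


def restart_reenters_ibd(downtime, max_tip_age=MAX_TIP_AGE):
    """Model a fully synced node that is stopped for `downtime`, then restarted.

    The tip time is assumed to equal the moment the node was stopped.
    """
    stopped_at = datetime(2018, 12, 6, 12, 0, tzinfo=timezone.utc)
    restarted_at = stopped_at + downtime
    return is_initial_block_download(stopped_at, restarted_at, max_tip_age)


print(restart_reenters_ibd(timedelta(minutes=30)))  # False: below the threshold
print(restart_reenters_ibd(timedelta(hours=3)))     # True: above the threshold
```

With these placeholder values the threshold is 6400 seconds, so a 30-minute shutdown falls in the window described above, where the restarted node has to catch up while the age-based check does not report IBD.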
The issue has been resolved
This test is based on the latest network topology: https://github.com/maciejzaleski/InternalTestnet/blob/master/Documentation/FullNode/InternalTestnet-NetworkDesign.draw.io.svg
Setup
Logs: https://stratisplatform-my.sharepoint.com/:u:/p/maciej_zaleski/EcMzTK1Y_dVCtwTpkxGH96EB0wn0YJJWj_bK-huvz8Khfg?e=de7g7L
Network View