input-output-hk / cardano-sl

Cryptographic currency implementing Ouroboros PoS protocol
Apache License 2.0
3.76k stars 632 forks source link

demo.sh no longer works on master #1869

Open arybczak opened 7 years ago

arybczak commented 7 years ago

This is the result of running demo on master (36f4b82152df971fef531c4ccccd7e65f8e2e8e9):

unknown@electronics cardano-sl $ /home/unknown/Programowanie/haskell/cardano-sl/.stack-work/install/x86_64-linux/lts-9.1/8.0.2/bin/cardano-node-simple --db-path /home/unknown/Programowanie/haskell/cardano-sl/scripts/../run/node-db0   --rebuild-db   --genesis-secret 0 --listen 127.0.0.1:3000  --json-log=/home/unknown/Programowanie/haskell/cardano-sl/scripts/../logs/2017-11-06_153541/node0.json  --logs-prefix /home/unknown/Programowanie/haskell/cardano-sl/scripts/../logs/2017-11-06_153541 --log-config /home/unknown/Programowanie/haskell/cardano-sl/scripts/../logs/2017-11-06_153541/conf/node0.log.yaml          --system-start 1509978956 --metrics +RTS -T -RTS --ekg-server 127.0.0.1:8000 --node-id node0 --topology ./run/topology0.yaml --no-ntp
[*production*:INFO:ThreadId 4] [2017-11-06 14:36:25.01 UTC] Asserts are ON
[*production*:INFO:ThreadId 4] [2017-11-06 14:36:25.01 UTC] ConfigurationOptions {cfoFilePath = "lib/configuration.yaml", cfoKey = "default", cfoSystemStart = Just 1509978956000000, cfoSeed = Nothing}
[*production*:INFO:ThreadId 4] [2017-11-06 14:36:25.47 UTC] withConfiguration using custom system start time 1509978956000000
[*production*:INFO:ThreadId 4] [2017-11-06 14:36:25.88 UTC] System start time is 1509978956000000
[*production*:INFO:ThreadId 4] [2017-11-06 14:36:25.88 UTC] Current time is 1509978985888150
[*production*:INFO:ThreadId 4] [2017-11-06 14:36:25.88 UTC] Wallet is disabled, because software is built w/o it
[*production*:INFO:ThreadId 4] [2017-11-06 14:36:25.88 UTC] Using configs and genesis:
ConfigurationOptions {cfoFilePath = "lib/configuration.yaml", cfoKey = "default", cfoSystemStart = Just 1509978956000000, cfoSeed = Nothing}
[*production*:DEBUG:ThreadId 4] [2017-11-06 14:36:25.95 UTC] Loading SSC global state
[*production*:INFO:ThreadId 4] [2017-11-06 14:36:25.95 UTC] Loaded SSC state:   certificates from: [df4901e8, f83e9e02, c96ac1e1, 81713ef4]

[*production*:WARNING:ThreadId 4] [2017-11-06 14:36:25.95 UTC] notifyReady failed to notify systemd.
[node:INFO:ThreadId 4] [2017-11-06 14:36:25.95 UTC] Our verInfo: VerInfo { magic=34fa031, blockVersion=0.0.0, inSpecs={96: Conv 43, 49: Conv 5c, 97: Conv 3d, 34: Conv 5e, 98: Conv 37, 67: Conv 60, 83: Conv 0, 4: Conv 5, 5: Conv 4, 37: Conv 5e, 6: Conv 7, 55: Conv 62, 73: Conv 5f, 43: Conv 5d, 92: Conv 31, 61: Conv 61, 93: Conv 2b, 94: Conv 25, 95: Conv 49}, outSpecs={49: Conv 5c, 67: Conv 60, 83: Conv 0, 4: Conv 5, 5: Conv 4, 37: Conv 5e, 6: Conv 7, 55: Conv 62, 73: Conv 5f, 43: Conv 5d, 61: Conv 61}}
[node:DEBUG:ThreadId 4] [2017-11-06 14:36:25.95 UTC] startNode, we are NodeId 127.0.0.1:3000:0
[node.ntp-check:WARNING:ThreadId 4] [2017-11-06 14:36:25.96 UTC] Failed to create sockets, retrying in 5 sec... (reason: Network.Socket.socket: unsupported operation (Address family not supported by protocol))
[node:ERROR:ThreadId 4] [2017-11-06 14:36:30.96 UTC] node stopped with exception Network.Socket.socket: unsupported operation (Address family not supported by protocol)
[node:ERROR:ThreadId 4] [2017-11-06 14:36:30.97 UTC] node stopped with exception Network.Socket.socket: unsupported operation (Address family not supported by protocol)
[node:DEBUG:ThreadId 21] [2017-11-06 14:36:30.97 UTC] waiting for [] outbound bidirectional handlers
[node:DEBUG:ThreadId 21] [2017-11-06 14:36:30.97 UTC] waiting for [] outbound inbound
[node:ERROR:ThreadId 4] [2017-11-06 14:36:30.97 UTC] exception while stopping node thread killed
[node:ERROR:ThreadId 4] [2017-11-06 14:36:30.97 UTC] exception while stopping node thread killed
cardano-node-simple: thread killed

It works fine on cardano-sl-1.0 branch and master @ d4551608f725ff59f8ccd847f440444ccd6dab95 (2017-10-12), so something broke along the way.

ab45e114754a06da88ee39e8f9d3dbdd59e28643 is the first bad commit
commit ab45e114754a06da88ee39e8f9d3dbdd59e28643
Author: Vladislav Zavialov <vlad.z.4096@gmail.com>
Date:   Fri Oct 13 21:39:45 2017 +0300

    [CSL-1769] Use log-warper 1.3.3

:040000 040000 65061c4b2005b3e3dc1420e47066ca0383fa844d f494179a42a0ae525f4845bb87d3ddbb4d1cc130 M      core
:040000 040000 4c34af00213adeffd507544d0e98da8430968a62 8ba0600b16a8714c7c4850401b87f9c0d1cf72f5 M      explorer
:040000 040000 783a8e9087ba15588d2b71eba2cb9ba2f504ce7b 1ebce72998ef283103953fc8df89e18c7fb2ef5d M      godtossing
:040000 040000 863a5dbc73f0a414e7a79b7f407f9744c66285ba f48277413dc150f41df1c64368ee8c0ee4d99860 M      lib
:040000 040000 167b059c576fb6bcc6d6e0aa829fd71c0ac59763 9ecf0a1e8767aeb7b758bb75901349b255395036 M      node
:040000 040000 2435002fd7b345f47a8e362b568fd5f8b3e72009 00296a99130e8c553794880b0390d36e5955ced3 M      pkgs
:100644 100644 2f0342f485155f07cd3004efccb2fdf16896ba12 4cd303049c40334287adcbd1d6f3d0830d8fa52e M      stack.yaml
:040000 040000 b9543bbc54721119dfee90fe1de4fca4fd9279e7 94897a8c14c51aba98ee1c19f8a1ab30a40c5b3f M      tools
:040000 040000 59e25e2258513787d879dbc1672219b1f604dee1 0382d32e70ce26256a08b70c8277eb7deefceb25 M      update
:040000 040000 d1080c2411ee96e209ffea08d4346a1be57dda68 22f5e4c5cb49cc21b80f23eb7cdf25304fa3ddc1 M      wallet

If you look closer at this commit, you'll see that it updates snapshot of time-warp-nt in stack.yaml, so I presume whatever causes this originates from there.

ab45e114754a06da88ee39e8f9d3dbdd59e28643~1 is fine, with ab45e114754a06da88ee39e8f9d3dbdd59e28643 I get this:

unknown@electronics cardano-sl $ /home/unknown/Programowanie/haskell/cardano-sl/.stack-work/install/x86_64-linux/lts-9.1/8.0.2/bin/cardano-node-simple --db-path /home/unknown/Programowanie/haskell/cardano-sl/scripts/../run/node-db0   --rebuild-db   --genesis-secret 0 --listen 127.0.0.1:3000  --json-log=/home/unknown/Programowanie/haskell/cardano-sl/scripts/../logs/2017-11-06_155448/node0.json  --logs-prefix /home/unknown/Programowanie/haskell/cardano-sl/scripts/../logs/2017-11-06_155448 --log-config /home/unknown/Programowanie/haskell/cardano-sl/scripts/../logs/2017-11-06_155448/conf/node0.log.yaml            --system-start 1509980103 --metrics +RTS -T -RTS --ekg-server 127.0.0.1:8000 --node-id node0 --topology ./run/topology0.yaml --no-ntp
Asserts are ON
[*production*:WARNING:ThreadId 4] [2017-11-06 14:54:50.43 UTC] notifyReady failed to notify systemd.
[node:WARNING:ThreadId 145] [2017-11-06 14:55:07.30 UTC] node0: sending MsgAnnounceBlockHeader OriginSender to NodeId 127.0.0.1:3004:0 failed with TransportError ConnectNotFound "Network.Socket.connect: <socket: 41>: does not exist (Connection refused)" :: SomeException
[node.worker.ssc:WARNING:ThreadId 28] [2017-11-06 14:55:10.00 UTC] We have rejected our message, reason: some nodes have already sent their commitments: [5f53e01e]
[node.worker.block:WARNING:ThreadId 193] [2017-11-06 14:55:14.01 UTC] node0: could not choose suitable alternative from [NodeId 127.0.0.1:3004:0]
[node.worker.block:ERROR:ThreadId 193] [2017-11-06 14:55:14.01 UTC] node0: enqueue instruction EnqueueAll {enqNodeType = NodeRelay, enqMaxAhead = MaxAhead 0, enqPrecedence = PHighest} failed to enqueue message MsgAnnounceBlockHeader OriginSender to forwarding sets [[NodeId 127.0.0.1:3004:0]]
[node.worker.block:ERROR:ThreadId 193] [2017-11-06 14:55:14.02 UTC] node0: enqueue instruction EnqueueAll {enqNodeType = NodeRelay, enqMaxAhead = MaxAhead 0, enqPrecedence = PHighest} failed to enqueue message MsgAnnounceBlockHeader OriginSender to forwarding sets [[NodeId 127.0.0.1:3004:0]]
[node:WARNING:ThreadId 282] [2017-11-06 14:55:21.28 UTC] node0: could not choose suitable alternative from [NodeId 127.0.0.1:3004:0]
[node:ERROR:ThreadId 282] [2017-11-06 14:55:21.28 UTC] node0: enqueue instruction EnqueueAll {enqNodeType = NodeRelay, enqMaxAhead = MaxAhead 0, enqPrecedence = PHighest} failed to enqueue message MsgAnnounceBlockHeader OriginSender to forwarding sets [[NodeId 127.0.0.1:3004:0]]
[node:ERROR:ThreadId 282] [2017-11-06 14:55:21.29 UTC] node0: enqueue instruction EnqueueAll {enqNodeType = NodeRelay, enqMaxAhead = MaxAhead 0, enqPrecedence = PHighest} failed to enqueue message MsgAnnounceBlockHeader OriginSender to forwarding sets [[NodeId 127.0.0.1:3004:0]]
[node:WARNING:ThreadId 329] [2017-11-06 14:55:28.28 UTC] node0: could not choose suitable alternative from [NodeId 127.0.0.1:3004:0]
[node:ERROR:ThreadId 329] [2017-11-06 14:55:28.28 UTC] node0: enqueue instruction EnqueueAll {enqNodeType = NodeRelay, enqMaxAhead = MaxAhead 0, enqPrecedence = PHighest} failed to enqueue message MsgAnnounceBlockHeader OriginSender to forwarding sets [[NodeId 127.0.0.1:3004:0]]
[node:ERROR:ThreadId 329] [2017-11-06 14:55:28.28 UTC] node0: enqueue instruction EnqueueAll {enqNodeType = NodeRelay, enqMaxAhead = MaxAhead 0, enqPrecedence = PHighest} failed to enqueue message MsgAnnounceBlockHeader OriginSender to forwarding sets [[NodeId 127.0.0.1:3004:0]]
[node:WARNING:ThreadId 376] [2017-11-06 14:55:35.28 UTC] node0: could not choose suitable alternative from [NodeId 127.0.0.1:3004:0]
[node:ERROR:ThreadId 376] [2017-11-06 14:55:35.28 UTC] node0: enqueue instruction EnqueueAll {enqNodeType = NodeRelay, enqMaxAhead = MaxAhead 0, enqPrecedence = PHighest} failed to enqueue message MsgAnnounceBlockHeader OriginSender to forwarding sets [[NodeId 127.0.0.1:3004:0]]
[node:ERROR:ThreadId 376] [2017-11-06 14:55:35.28 UTC] node0: enqueue instruction EnqueueAll {enqNodeType = NodeRelay, enqMaxAhead = MaxAhead 0, enqPrecedence = PHighest} failed to enqueue message MsgAnnounceBlockHeader OriginSender to forwarding sets [[NodeId 127.0.0.1:3004:0]]
[node.worker.block:ERROR:ThreadId 42] [2017-11-06 14:55:45.00 UTC] Reporting critical misbehavior with reason "Poor chain quality for the last 'k' (2) blocks, less than 0.889: 0.667"
[node.worker.block:ERROR:ThreadId 42] [2017-11-06 14:55:45.01 UTC] Reporting critical misbehavior with reason "Poor chain quality for the last 'k' (2) blocks, less than 0.889: 0.667"
^C[node:ERROR:ThreadId 4] [2017-11-06 14:55:50.07 UTC] node stopped with exception user interrupt
[node.worker.block:ERROR:ThreadId 44] [2017-11-06 14:55:50.07 UTC] retrievalWorker mainLoopE: error caught thread killed
[node.worker.block:ERROR:ThreadId 44] [2017-11-06 14:55:50.07 UTC] retrievalWorker mainLoopE: error caught thread killed
cardano-node-simple: thread killed

Note that this is a different error - when I was bisecting I was either getting this or the one I posted above.

avieth commented 7 years ago

[node.ntp-check:WARNING:ThreadId 4] [2017-11-06 14:36:25.96 UTC] Failed to create sockets, retrying in 5 sec... (reason: Network.Socket.socket: unsupported operation (Address family not supported by protocol))

If the NTP client fails (withNtpCheck, used by runNode' in Pos.Launcher.Scenario) then everything stops. This doesn't happen on my system. It tries to bind IPv4 and IPv6 sockets. I remember there was discussion about this last week in the slack.

The "retrying in 5 sec" message is misleading.

neongreen commented 7 years ago

@arybczak offtopic, but: any particular reason why you filed this issue here instead of YouTrack?

arybczak commented 7 years ago

@neongreen No. Should I move it there?

arybczak commented 7 years ago

It tries to bind IPv4 and IPv6 sockets. I remember there was discussion about this last week in the slack.

Right, I have ipv6 disabled in userspace (not kernel), so it might be the reason. But why did it stop working only recently?

avieth commented 7 years ago

But why did it stop working only recently?

AFAIK the ntp check was added only recently.

arybczak commented 7 years ago

I recompiled apps with ipv6 support and the error is still there.

EDIT: Alright, turns out I didn't have ipv6 support in the kernel. After enabling it it works. The thing is that I see no reason for it to fail if ipv6 is not supported, something is not right there.

neongreen commented 7 years ago

No. Should I move it there?

Preferably yes.

gromakovsky commented 7 years ago

And preferably call it e. g. Node doesn't work without ipv6 support, since now there are more details about the issue.

chshersh commented 6 years ago

What is the ID of this issue on YouTrack? Can we close this issue on GitHub?