Closed deepfire closed 4 years ago
After I tried to make cardano-byron-proxy serve a static chain (with no new block announcements), @avieth suggested the following:
Byron proxy just plays the cardano-sl game: it can't download a chain unless it has the header hash of the tip. If you want it to download from a Byron peer that has a static chain, it can be done without much difficulty. Either:
- you know the hash of the tip that you want, and you can patch byron-proxy to request it in particular or
- patch cardano-sl to announce its tip header periodically even if it does not change
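The second option above (announce the tip periodically even if it does not change) could be sketched roughly as below. This is only an illustration of the timing logic in Python, not cardano-sl code: `AnnounceState`, `should_announce`, `record_announcement` and the `interval` parameter are all hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AnnounceState:
    """What was last announced and when (seconds on any monotonic clock)."""
    last_tip: Optional[str] = None
    last_time: Optional[float] = None


def should_announce(state: AnnounceState, tip: str, now: float,
                    interval: float) -> bool:
    """Announce when the tip changed, or when `interval` seconds have
    passed since the last announcement even though the tip is the same."""
    if state.last_tip is None or state.last_time is None:
        return True  # nothing announced yet
    if tip != state.last_tip:
        return True  # the usual "new block" case
    # Static chain: re-announce the unchanged tip once the interval elapsed.
    return now - state.last_time >= interval


def record_announcement(state: AnnounceState, tip: str, now: float) -> None:
    """Remember what was just announced."""
    state.last_tip = tip
    state.last_time = now
```

With something like this in the announcing node, a proxy that only learns about a chain through tip announcements would eventually hear about a static tip as well.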
So, the latest status is -- as per latest developments in https://github.com/input-output-hk/iohk-ops/tree/serge/cardano-cluster :
So the next piece of work is trying to figure out how to make the legacy cluster start in OBFT, without.
There was a discussion on how to simplify #2 -- the mixed cluster -- to try to avoid the issue of the cardano-sl cluster not starting when multiple nodes share a single localhost address.
The idea was to use an existing mainnet cluster as the source of blocks (which are necessary for the proxy to function, as per above).
Sadly, this breaks on two points (and a half):
There is a simpler option to try for cardano-sl potentially being stuck due to all nodes sharing localhost -- we can employ VDE[1] to give distinct nodes distinct, routable IP addresses.
--
The VDE route almost worked.. except the routing itself became interesting -- the kernel was choosing the same route for all packets, since all tapX interfaces are local! ..
..and this leads to the same dreaded error as in the previous attempt with different loopback addresses -- network-transport-tcp sees a mismatch between the stated and the actual address, and fails: https://github.com/input-output-hk/network-transport-tcp/blob/2634e5e32178bb0456d800d133f8664321daa2ef/src/Network/Transport/TCP.hs#L1621
Duh! Should have expected that..
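To make the failure mode concrete, here is a rough Python sketch of that kind of check: the connecting peer states its own address, and the receiver compares the host part against what the socket actually reports. It assumes the stated address has the shape `host:port:endpoint-id`; the function names and the example addresses are hypothetical, and this is not the actual network-transport-tcp code.

```python
def parse_endpoint_address(addr: str):
    """Split a 'host:port:endpoint-id' string into its three parts.

    Splitting from the right keeps hosts that contain ':' intact.
    """
    host, port, endpoint_id = addr.rsplit(":", 2)
    return host, int(port), int(endpoint_id)


def peer_host_matches(stated_address: str, socket_peer_host: str) -> bool:
    """True if the host the peer claims equals the host the socket sees."""
    stated_host, _port, _endpoint_id = parse_endpoint_address(stated_address)
    return stated_host == socket_peer_host


# A node that states 10.0.0.2 but whose packets leave via another
# interface (so the socket sees 10.0.0.1) fails the check.
print(peer_host_matches("10.0.0.2:3000:0", "10.0.0.1"))
```

When the kernel picks the same route for every packet, the source address seen by the receiver is the same for all nodes, so at most one of them can pass a check like this.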
So I'm currently playing with source routing policies, which would make the kernel choose different interfaces depending on the source address: https://www.tldp.org/HOWTO/Adv-Routing-HOWTO/lartc.rpdb.simple.html
UPDATE: I'm getting different source addresses now, however the problem now is, the mapping between TAP interfaces and the source addresses seems random :joy:
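For reference, the source-routing policy from the linked HOWTO boils down to one routing table per source address. Below is a small Python sketch that only generates the corresponding `ip rule` / `ip route` command lines for an explicit interface-to-address mapping. The device names, addresses and table numbers are made up for illustration, and whether this is sufficient when all interfaces live on one host is not established here.

```python
def source_routing_commands(interfaces, first_table=100):
    """Build `ip` command lines for per-source-address routing.

    interfaces: list of (device, source_address) pairs.
    Each pair gets its own routing table, starting at `first_table`.
    """
    commands = []
    for offset, (device, address) in enumerate(interfaces):
        table = first_table + offset
        # Packets with this source address consult the dedicated table...
        commands.append(f"ip rule add from {address} table {table}")
        # ...whose only route sends everything out of the paired device.
        commands.append(f"ip route add default dev {device} table {table}")
    return commands


for line in source_routing_commands([("tap0", "10.1.0.1"),
                                     ("tap1", "10.1.0.2")]):
    print(line)
```

Writing the mapping down explicitly at least makes it visible which TAP interface is supposed to carry which source address.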
network-transport-tcp address check did the trick -- the nodes agreed to connect/talk to each other. However, that didn't resolve the problem with the cardano-sl nodes not making blocks.
So I started looking into switching the legacy nodes into OBFT mode right from the start (they currently start in Ouroboros Classic mode).
Found that the OBFT era is determined by the unlockStakeEpoch field of BlockVersionData: https://github.com/input-output-hk/cardano-sl/blob/master/chain/src/Pos/Chain/Update/BlockVersionData.hs#L148
Regenerated genesis with unlockStakeEpoch being equal to the magic OBFT value -- and no MPC messages appear in cardano-sl's logs anymore, which suggests the change was effective.
No blocks, though..
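A quick sanity check on a regenerated genesis could look like the Python sketch below. It assumes the genesis JSON carries the field as `blockVersionData.unlockStakeEpoch` (possibly encoded as a string), and it takes the magic OBFT value as a parameter, since the real constant lives in the BlockVersionData.hs file linked above. The value `42` in the example is a placeholder, not the real flag.

```python
def genesis_is_obft(genesis: dict, obft_flag_value: int) -> bool:
    """True if the genesis data selects the OBFT era.

    Assumes the (hypothetical here) layout
    genesis["blockVersionData"]["unlockStakeEpoch"], with the value
    given either as an integer or as a decimal string.
    """
    raw = genesis["blockVersionData"]["unlockStakeEpoch"]
    return int(raw) == obft_flag_value


# Hand-made example data; 42 stands in for the real magic value.
example_genesis = {"blockVersionData": {"unlockStakeEpoch": "42"}}
print(genesis_is_obft(example_genesis, obft_flag_value=42))
```

In practice the dict would come from `json.load` on the generated genesis file, and the flag value from the cardano-sl source.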
Ok, I've gone with the supposedly well-oiled AWS setup of cardano-sl; however, it somehow manages to fare even worse than a cluster confined to a multi-node-on-single-machine setup (although, yes, there are other differences -- because the single-machine cluster required systemd service instancing and a lot of fiddling in general).
The error cardano-sl gives at cluster startup is (with some initial context):
Oct 16 16:27:22 c-b-1 3n073v230xhgd46jpz4zf1n32xcqzc4c-unit-script-cardano-node-legacy-start[2961]: [cardano-sl.node:Info:ThreadId 132] [2019-10-16 16:27:22.26 UTC] Application: cardano-sl:1, last known block version 0.2.0, systemTag: linux64
Oct 16 16:27:22 c-b-1 3n073v230xhgd46jpz4zf1n32xcqzc4c-unit-script-cardano-node-legacy-start[2961]: [cardano-sl.node:Info:ThreadId 132] [2019-10-16 16:27:22.26 UTC] Genesis stakeholders (7 addresses, dust threshold 7 coin(s)): GenesisWStakeholders: {33111eddbb08270d: 1, 540fb9f1c0415491: 1, 6132662df7ccd698: 1, 773d6255ced70494: 1, 8dba875898ab11ac: 1, f7dedd2205451763: 1, f825bd9e9df8670d: 1}
Oct 16 16:27:22 c-b-1 3n073v230xhgd46jpz4zf1n32xcqzc4c-unit-script-cardano-node-legacy-start[2961]: [cardano-sl.node:Info:ThreadId 132] [2019-10-16 16:27:22.26 UTC] GenesisDelegation (stakeholder ids): [773d6255ced70494 -> d5f8ce7d1937176c, 33111eddbb08270d -> aa84a9d0f69f2493, 8dba875898ab11ac -> ac68bdca1fae8f14, f7dedd2205451763 -> fcb3a4f1b35e5868, 540fb9f1c0415491 -> 98ca509664413dbf, 6132662df7ccd698 -> f050f7380f318dd4, f825bd9e9df8670d -> f3b7b1477a80fda3]
Oct 16 16:27:22 c-b-1 3n073v230xhgd46jpz4zf1n32xcqzc4c-unit-script-cardano-node-legacy-start[2961]: [cardano-sl.node:Info:ThreadId 132] [2019-10-16 16:27:22.26 UTC] First genesis block hash: 1a28c5b6d7b98239, genesis seed is 76617361206f7061736120736b6f766f726f64612047677572646120626f726f64612070726f766f6461
Oct 16 16:27:22 c-b-1 3n073v230xhgd46jpz4zf1n32xcqzc4c-unit-script-cardano-node-legacy-start[2961]: [cardano-sl.node:Info:ThreadId 132] [2019-10-16 16:27:22.26 UTC] Current tip header: GenesisBlockHeader:
Oct 16 16:27:22 c-b-1 3n073v230xhgd46jpz4zf1n32xcqzc4c-unit-script-cardano-node-legacy-start[2961]: hash: 1a28c5b6d7b982396995008f856640cc68fbaf923ddbde42ac232b69d972863c
Oct 16 16:27:22 c-b-1 3n073v230xhgd46jpz4zf1n32xcqzc4c-unit-script-cardano-node-legacy-start[2961]: previous block: 41a0739cb8cf98a176a990f8a90b2ca616e5413e2377d6c84841c46b5b6026b0
Oct 16 16:27:22 c-b-1 3n073v230xhgd46jpz4zf1n32xcqzc4c-unit-script-cardano-node-legacy-start[2961]: epoch: #0
Oct 16 16:27:22 c-b-1 3n073v230xhgd46jpz4zf1n32xcqzc4c-unit-script-cardano-node-legacy-start[2961]: difficulty: 0
Oct 16 16:27:22 c-b-1 3n073v230xhgd46jpz4zf1n32xcqzc4c-unit-script-cardano-node-legacy-start[2961]: [cardano-sl.node:Info:ThreadId 132] [2019-10-16 16:27:22.26 UTC] Waiting 303 seconds for system start
...
Oct 16 16:32:26 c-b-1 3n073v230xhgd46jpz4zf1n32xcqzc4c-unit-script-cardano-node-legacy-start[2961]: [cardano-sl.node.slotting:Notice:ThreadId 149] [2019-10-16 16:32:26.00 UTC] New slot has just started: 0th slot of 0th epoch
Oct 16 16:32:26 c-b-1 3n073v230xhgd46jpz4zf1n32xcqzc4c-unit-script-cardano-node-legacy-start[2961]: [cardano-sl.node.slotting:Debug:ThreadId 149] [2019-10-16 16:32:26.00 UTC] Waiting for 19993571mcs before new slot
Oct 16 16:32:26 c-b-1 3n073v230xhgd46jpz4zf1n32xcqzc4c-unit-script-cardano-node-legacy-start[2961]: [cardano-sl.node:Debug:ThreadId 142] [2019-10-16 16:32:26.00 UTC] Our tip header: GenesisBlockHeader:
Oct 16 16:32:26 c-b-1 3n073v230xhgd46jpz4zf1n32xcqzc4c-unit-script-cardano-node-legacy-start[2961]: hash: 1a28c5b6d7b982396995008f856640cc68fbaf923ddbde42ac232b69d972863c
Oct 16 16:32:26 c-b-1 3n073v230xhgd46jpz4zf1n32xcqzc4c-unit-script-cardano-node-legacy-start[2961]: previous block: 41a0739cb8cf98a176a990f8a90b2ca616e5413e2377d6c84841c46b5b6026b0
Oct 16 16:32:26 c-b-1 3n073v230xhgd46jpz4zf1n32xcqzc4c-unit-script-cardano-node-legacy-start[2961]: epoch: #0
Oct 16 16:32:26 c-b-1 3n073v230xhgd46jpz4zf1n32xcqzc4c-unit-script-cardano-node-legacy-start[2961]: difficulty: 0
Oct 16 16:32:26 c-b-1 3n073v230xhgd46jpz4zf1n32xcqzc4c-unit-script-cardano-node-legacy-start[2961]: [cardano-sl.node:Info:ThreadId 142] [2019-10-16 16:32:26.00 UTC] Difference between current slot and tip slot is: 0
Oct 16 16:32:26 c-b-1 3n073v230xhgd46jpz4zf1n32xcqzc4c-unit-script-cardano-node-legacy-start[2961]: [cardano-sl.node:Debug:ThreadId 138] [2019-10-16 16:32:26.00 UTC] There are no new confirmed update proposals for our application
Oct 16 16:32:26 c-b-1 3n073v230xhgd46jpz4zf1n32xcqzc4c-unit-script-cardano-node-legacy-start[2961]: [cardano-sl.MonadPseudoRandom:Error:ThreadId 148] [2019-10-16 16:32:26.00 UTC] rollbackSsc: most genesis block is passed to rollback
Oct 16 16:32:51 c-b-1 3n073v230xhgd46jpz4zf1n32xcqzc4c-unit-script-cardano-node-legacy-start[2961]: [cardano-sl.consolidate:Error:ThreadId 119] [2019-10-16 16:32:51.27 UTC] DBMalformed "Can't retrieve genesis block, maybe db is not initialized?"
There is a lead, of course..
For the sake of completeness -- the way genesis is generated is via https://github.com/input-output-hk/cardano-sl/blob/master/scripts/prepare-genesis/default.nix
@deepfire can we close this?
@Jimbo4350, I don't think so -- not all of the bullet items are done.
will be moved to cardano-benchmarking
Context
We want cluster-based integration tests for the node.
Current scope (to be extended)
- cardano-sl cluster startup
- cardano-byron-proxy test:
Implementation
NixOS tests that can run a cluster in a VM are a good foundation for many of those.
This basic functionality was merged in https://github.com/input-output-hk/cardano-node/pull/177