Closed andrejpodzimek closed 2 years ago
Simple questions:
cardano-node
and use this feature?
P2P-related information found on https://roadmap.cardano.org/.
Timeline:
P2P testnet updates
This week, the team completed the second milestone of the P2P deployment, delivering an engineering testnet, which allows for automatic peer selection in the network. During this stage, the team tested and implemented different user configurations, established interoperability between legacy and P2P nodes, and produced a video that reflects automated peer selection.
The team had a call with SPOs, where they explained P2P project goals, P2P system design, and the concept of hot, cold, and warm peers. They also introduced the goals of the third milestone in P2P deployment (semi-public testnet), explaining that there will be a switch in the node to enable selection of either the new P2P mode or the existing (non-P2P) one. During the semi-public testnet delivery, the team will be inviting a small group of SPOs to help test system functionality.
This week, the team worked on a P2P and non-P2P diffusion API, fixed some issues, and worked on server tests and scheduling within io-sim.
The team fixed the stateTVar signature, worked on simultaneous TCP connections opened by the handshake protocol, and rebased p2p-master branches.
June 11, 2021
This week, the team worked on the P2P master branch, updated the cardano-ping protocol in line with keep-alive protocol changes, and worked on clean connection shutdown properties. They added a missing API to io-sim-classes and also worked on cardano-cli with the Alonzo team.
June 18, 2021
This week, the team continued working on P2P testnet functionality, including switching between P2P and non-P2P networks, diagnosis of deadlock events in the connection manager, enhanced connection shutdown properties, and strict TVar interface.
June 25, 2021
This week, the team developed a reviewable version of the P2P switch feature. They also merged a clean connection shutdown PR (connection-manager part), rebased the p2p-master branch on top of the multiplexer clean connection shutdown (and tested the two in combination), and worked on different schedules of io-sim.
July 2, 2021:
This week, the team worked on the P2P switch feature, which is now in review, improved some logging properties, and cleaned up the P2P master branch in the cardano-node GitHub repository.
July 9, 2021:
This week, the team completed the integration of the P2P switch feature, which allows SPOs to run a node either in P2P mode or with statically configured peers. They worked on error notifications when supplying a wrong topology file, improved logging JSON instances, and made improvements to the cardano-node p2p-master branch.
July 16, 2021:
This week, the team upgraded the P2P switch, worked on network tracers, fixed some tests, and merged the server simulation. They are now in the process of running simulation tests.
July 23, 2021:
This week, the team fixed some tests in the cardano-node repository, made logging improvements, and made changes to the P2P master branch.
July 30, 2021:
The team also restructured the P2P to non-P2P switch for better clarity and easier maintenance of the data diffusion API.
August 20, 2021:
This week, the team worked on the P2P master branch to implement support for node v.1.28.0.
August 27, 2021:
This week, the team continued working on network simulations, resolved some Cardano node issues, and rebased the P2P master branch on top of the Cardano node v.1.28.0. They are now in the process of testing the P2P suite.
This is the last update I could find. After gathering this information, I still don't know whether the feature is included in 1.28 or 1.29, on testnet or mainnet, or what the name of the P2P config switch will be, but this is progress. Let's read the code in the p2p-master branch and find out.
Finally, there is also this issue discussed on the forum about the current topology update via api.clio.one and the problems it causes for Kubernetes.
As of today, the p2p-master branch does not seem to be merged into a stable cardano-node release yet:
This branch is 14 commits ahead, 257 commits behind master.
This README file contains information on the new topology file format we can expect, but I guess it may still change in the future if P2P testing is still ongoing. I did not find a reference to the EnableP2P config switch in the documentation. Missing in the docs?
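To make the format change more concrete, here is a minimal sketch of how a legacy topology file could be mapped onto a P2P-style one. The legacy shape (a Producers list with addr, port and valency) is the one non-P2P nodes read today. The P2P field names (localRoots, publicRoots, accessPoints, advertise, valency, useLedgerAfterSlot) follow the layout documented for later cardano-node releases and are an assumption here, since the README on p2p-master may still differ. The helper name legacy_to_p2p and the relay hostnames are hypothetical, and treating EnableP2P as a boolean config key is also an assumption.

```python
import json


def legacy_to_p2p(legacy: dict, use_ledger_after_slot: int = 0) -> dict:
    """Map a legacy {"Producers": [...]} topology to an ASSUMED P2P-style layout.

    All legacy producers are placed in a single local root group. This is a
    simplification for illustration, not a recommendation.
    """
    access_points = [
        {"address": producer["addr"], "port": producer["port"]}
        for producer in legacy.get("Producers", [])
    ]
    return {
        "localRoots": [
            {
                "accessPoints": access_points,
                "advertise": False,
                # Assumption: ask the node to keep a connection to every listed peer.
                "valency": len(access_points),
            }
        ],
        "publicRoots": [],
        # Passed through unchanged; the exact semantics are not covered here.
        "useLedgerAfterSlot": use_ledger_after_slot,
    }


# Hypothetical legacy topology with two relays.
legacy_topology = {
    "Producers": [
        {"addr": "relay1.example.org", "port": 3001, "valency": 1},
        {"addr": "relay2.example.org", "port": 3001, "valency": 1},
    ]
}

# Assumption: the switch named in this thread is a boolean in the node config.
node_config_patch = {"EnableP2P": True}

print(json.dumps(legacy_to_p2p(legacy_topology), indent=2))
print(json.dumps(node_config_patch))
```

If the final format differs, only the dictionary built in legacy_to_p2p should need to change.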
Hey, I just released Helm Charts to run cardano node containers 🐳 in Kubernetes.
It solves the peer-to-peer topology update problem by repeating this process every 24 hours:
Everything in this implementation is fully autonomous 🚗, and fully decentralised since it runs locally in the cluster. The topology update process is fair and transparent. Topology is updated and discovered automatically from the blockchain data itself and nothing else!
Link to the code on Github
Link to the peer to peer extension: Github
This is available in the (not yet supported) P2P version, and it will not be backported to non-P2P nodes.
External
Area: Other. Any other topic (Delegation, Ranking, ...).
Describe the feature you'd like
While running topologyUpdater.sh hourly, it is recommended to restart the relay node each time to pick up possible topology changes. This is a big problem, because the node startup takes more than 10 minutes (from systemctl restart to the point when the socket appears and communicates), despite the fact that my SSDs can serve around 5 GB/s. (The CPU is the bottleneck during initialization.)
This would lead to downtime roughly 1/6 of the time. In a stake pool configuration this is bad, because one could miss block minting or the like. It should be safe to just run topologyUpdater.sh hourly (as recommended) and tell cardano-node to reload the topology each time, ideally without downtime (as long as the topology doesn’t change dramatically).
Describe alternatives you've considered
I considered two relay nodes on the same machine and on two different ports for redundancy, so that one is always up, even when one is being reloaded. Unfortunately, this would come with storage and computational overhead and there doesn’t seem to be a guarantee that block minting wouldn’t be missed with one of the relays down. (If there is such a guarantee, then this needs to be documented, i.e., how exactly having multiple hosts and ports could help.)
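The 1/6 figure can be checked with a short calculation: a restart of about 10 minutes once per 60-minute update interval. The sketch below also includes a possible stopgap that is not part of the request itself: restart only when the topology file actually changed. The function names are hypothetical, and the restart duration is rounded down to the 10 minutes quoted above.

```python
import json
from fractions import Fraction


def downtime_fraction(restart_minutes: int, interval_minutes: int) -> Fraction:
    """Share of time the relay is down if it restarts once per interval."""
    if interval_minutes <= 0:
        raise ValueError("interval_minutes must be positive")
    # A restart that takes longer than the interval means the node is never up.
    return min(Fraction(restart_minutes, interval_minutes), Fraction(1))


def topology_changed(old_text: str, new_text: str) -> bool:
    """Compare two topology files by parsed JSON content.

    Whitespace and object key order are ignored; the order of list
    entries still counts as a difference.
    """
    return json.loads(old_text) != json.loads(new_text)


# Hourly restarts with a 10-minute startup, as described above.
print(downtime_fraction(10, 60))       # 1/6
# The same startup cost with a 24-hour update cycle.
print(downtime_fraction(10, 24 * 60))  # 1/144

old = '{"Producers": [{"addr": "relay1.example.org", "port": 3001, "valency": 1}]}'
new = '{ "Producers": [ {"port": 3001, "addr": "relay1.example.org", "valency": 1} ] }'
print(topology_changed(old, new))      # False: same content, different formatting
```

Skipping restarts when nothing changed reduces the downtime but does not remove it, so it is no substitute for reloading the topology in a running node.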