Closed nemo83 closed 1 year ago
ALTZ pool - same issue. Additional data: the metric cardano_node_metrics_connectedPeers_int reflects the loss of connection, but netstat and gLiveView show symmetric outgoing and incoming connections, including the 1.35.1 relay. The BP node log shows a different message for the 1.34.1 relay than for the 1.35.1 one: {"host":"ip-xxx-x","pid":"16xxx","loc":null,"at":"2022-07-22T17:34:00.81Z","ns":["cardano.node.BlockFetchDecision"],"sev":"Info","env":"1.34.1:a357c","data":{"kind":"PeersFetch","peers":[{"peer":{"remote":{"port":"xxxx","addr":"<relay 1.35.1 node IP>"},"local":{"port":"xxxxx","addr":"<BP 1.34.1 node IP>"}},"kind":"FetchDecision declined","declined":"FetchDeclineChainNotPlausible"},{"peer":{"remote":{"port":"xxxx","addr":"<relay 1.34.1 node IP>"},"local":{"port":"xxxxx","addr":"<BP 1.34.1 node IP>"}},"kind":"FetchDecision results","length":"1"}]},"msg":"","thread":"xxx","app":[]} I don't know if it is related to this issue.
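For anyone who wants to scan a BP log for the same pattern, here is a minimal Python sketch that pulls the declined peers out of one JSON log line. It assumes the log is written one JSON object per line with the same field layout as the message quoted above (data.kind, data.peers[].peer.remote.addr, declined). The function name declined_peers and the addresses and ports in the sample are placeholders made up for illustration, not values from a real node.

```python
import json


def declined_peers(log_line):
    """Return (remote address, decline reason) for each declined peer in one log line."""
    entry = json.loads(log_line)
    data = entry.get("data") or {}
    # Only BlockFetchDecision "PeersFetch" messages carry a per-peer decision list.
    if data.get("kind") != "PeersFetch":
        return []
    found = []
    for peer in data.get("peers", []):
        if "declined" in peer:
            found.append((peer["peer"]["remote"]["addr"], peer["declined"]))
    return found


# Sample line shaped like the message above; addresses and ports are placeholders.
sample = json.dumps({
    "ns": ["cardano.node.BlockFetchDecision"],
    "data": {
        "kind": "PeersFetch",
        "peers": [
            {
                "peer": {
                    "remote": {"port": "3001", "addr": "192.0.2.35"},
                    "local": {"port": "40000", "addr": "192.0.2.1"},
                },
                "kind": "FetchDecision declined",
                "declined": "FetchDeclineChainNotPlausible",
            },
            {
                "peer": {
                    "remote": {"port": "3001", "addr": "192.0.2.34"},
                    "local": {"port": "40001", "addr": "192.0.2.1"},
                },
                "kind": "FetchDecision results",
                "length": "1",
            },
        ],
    },
})

print(declined_peers(sample))
```

Running it over every line of the log and counting the results per remote address should show whether the declines cluster on the 1.35.1 relay.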
same issue here
I have all these issues fixed finally. But yes I have seen them...
What I had to do was make sure every relay and the BP is behind a separate public IP, and I also had to play around with NAT settings.
I also enabled firewall rules between my relays, so my own relays never connect to each other.
Oh, sorry... this was between 1.34.1 and 1.35.1 ... I had this happening between 1.35.1 relays...
Anyway, it has been recommended to interconnect our own relays. I don't understand why we should have to avoid connecting our 1.35.x relays to each other.
Well, in my test I had a choice: have interconnected relays, or have relays missing from the block producer. So I decided my block producer is more important ;-)
Slightly off-topic question, but how can I enable cardano_node_metrics_connectedPeers_int? I don't have such a metric in my Prometheus metrics.
Hey, good question. You need to update your config.json; one of these flags needs to be set to true:
{
"AlonzoGenesisFile": "mainnet-alonzo-genesis.json",
"AlonzoGenesisHash": "7e94a15f55d1e82d10f09203fa1d40f8eede58fd8066542cf6566008068ed874",
"ApplicationName": "cardano-sl",
"ApplicationVersion": 1,
"ByronGenesisFile": "mainnet-byron-genesis.json",
"ByronGenesisHash": "5f20df933584822601f9e3f8c024eb5eb252fe8cefb24d1317dc3d432e940ebb",
"LastKnownBlockVersion-Alt": 0,
"LastKnownBlockVersion-Major": 3,
"LastKnownBlockVersion-Minor": 0,
"MaxKnownMajorProtocolVersion": 2,
"Protocol": "Cardano",
"RequiresNetworkMagic": "RequiresNoMagic",
"ShelleyGenesisFile": "mainnet-shelley-genesis.json",
"ShelleyGenesisHash": "1a3be38bcbb7911969283716ad7aa550250226b76a61fc51cc9a9a35d9276d81",
"TraceAcceptPolicy": false,
"TraceBlockFetchClient": false,
"TraceBlockFetchDecisions": true,
"TraceBlockFetchProtocol": false,
"TraceBlockFetchProtocolSerialised": false,
"TraceBlockFetchServer": false,
"TraceChainDb": false,
"TraceChainSyncBlockServer": false,
"TraceChainSyncClient": false,
"TraceChainSyncHeaderServer": false,
"TraceChainSyncProtocol": false,
"TraceConnectionManager": false,
"TraceDNSResolver": false,
"TraceDNSSubscription": false,
"TraceDiffusionInitialization": false,
"TraceErrorPolicy": false,
"TraceForge": true,
"TraceHandshake": false,
"TraceInboundGovernor": false,
"TraceIpSubscription": false,
"TraceLedgerPeers": false,
"TraceLocalChainSyncProtocol": false,
"TraceLocalErrorPolicy": false,
"TraceLocalHandshake": false,
"TraceLocalRootPeers": false,
"TraceLocalTxSubmissionProtocol": false,
"TraceLocalTxSubmissionServer": false,
"TraceMempool": false,
"TraceMux": false,
"TracePeerSelection": false,
"TracePeerSelectionActions": false,
"TracePublicRootPeers": false,
"TraceServer": false,
"TraceTxInbound": false,
"TraceTxOutbound": false,
"TraceTxSubmissionProtocol": false,
"TracingVerbosity": "NormalVerbosity",
"TurnOnLogMetrics": false,
"TurnOnLogging": true,
"MaxConcurrencyBulkSync": 2,
"MaxConcurrencyDeadline": 3,
"defaultBackends": [
"KatipBK"
],
"defaultScribes": [
[
"StdoutSK",
"stdout"
]
],
"hasEKG": 12788,
"hasPrometheus": [
"0.0.0.0",
12798
],
"minSeverity": "Info",
"options": {
"mapBackends": {
"cardano.node.metrics": [
"EKGViewBK"
],
"cardano.node.resources": [
"EKGViewBK"
]
},
"mapSubtrace": {
"cardano.node.metrics": {
"subtrace": "Neutral"
}
}
},
"rotation": {
"rpKeepFilesNum": 10,
"rpLogLimitBytes": 5000000,
"rpMaxAgeHours": 24
},
"setupBackends": [
"KatipBK"
],
"setupScribes": [
{
"scFormat": "ScText",
"scKind": "StdoutSK",
"scName": "stdout",
"scRotation": null
}
]
}
This is the config from one of my relays that exposes that metric. I can't remember exactly which flag it is, but I seem to remember it being "TraceBlockFetchDecisions": true. Anyway, diff this against your config and you'll easily find the one or two flags to change.
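If a plain text diff is noisy because the keys are in a different order, a small Python sketch like the one below compares the two configs key by key. It only looks at top-level keys, which is where the Trace* flags live in the config above. The helper name diff_config is made up for this example, and the two small dicts stand in for the result of json.load on your own config.json and the reference one.

```python
def diff_config(mine, reference):
    """Map each top-level key whose value differs to a (mine, reference) pair."""
    missing = "<missing>"
    diffs = {}
    for key in sorted(set(mine) | set(reference)):
        a = mine.get(key, missing)
        b = reference.get(key, missing)
        if a != b:
            diffs[key] = (a, b)
    return diffs


# Stand-ins for json.load(open(...)) on the two config files.
mine = {"TraceBlockFetchDecisions": False, "TurnOnLogMetrics": False}
reference = {
    "TraceBlockFetchDecisions": True,
    "TurnOnLogMetrics": False,
    "TraceForge": True,
}

for key, (a, b) in diff_config(mine, reference).items():
    print(f"{key}: mine={a} reference={b}")
```

Flags that show up only on one side are reported with the value <missing>, so a flag absent from your config is easy to spot.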
This should be fixed in 1.35.3 release. Is it ok to close it?
Doing it now. Thanks
External
Area Stake pool: network connectivity issues between BP and Relay running different versions of the node, 1.35.1 vs 1.34.1
Summary Hello, this is Giovanni and I operate EASY1 Stakepool.
I've recently replaced one of the three relays I run with the latest 1.35.1.
In the past few days I've noticed the BP repeatedly losing its connection to the relay and failing to re-establish it. The only way to connect the two again is to restart the relay.
I can't tell whether there is a similar issue relay-to-relay, but if there is, it could affect network connectivity among stake pools upgrading to 1.35.x while others are still on the previous version, and dramatically affect block propagation.
I've found at least two more SPOs with the same issue.
Steps to reproduce Hard to reproduce; it just happens randomly.
Expected behavior BP and relay shouldn't lose connection to each other.
System info (please complete the following information): BP running 1.34.1 (official docker image) Relay running 1.35.1 (official docker image)