Closed lexnv closed 2 weeks ago
The CI pipeline was cancelled due to the failure of one of the required jobs. Job name: test-linux-stable 3/3 Logs: https://gitlab.parity.io/parity/mirrors/polkadot-sdk/-/jobs/7276240
I've slightly changed two things since the last review:
We got around 4.7k warnings from hickory:
WARN tokio-runtime-worker hickory_proto::xfer::dns_exchange: failed to associate send_message response to the sender
The fix is similar to https://github.com/paritytech/substrate/pull/12253 (disable logging for this crate).
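Disabling logging for one crate typically means adding a per-target directive such as hickory_proto=off to the log filter. The sketch below is a minimal, std-only illustration of how such a directive list can be matched against a log target. The `is_suppressed` helper and the `path=level` syntax are assumptions modelled on env_logger-style filters; this is not the code used in polkadot-sdk.

```rust
/// Returns true when `target` is silenced by an `=off` directive.
///
/// `directives` is a comma-separated list such as
/// "hickory_proto=off,sync=debug" (hypothetical example values).
/// A directive applies when the target equals its module path or is
/// nested below it (`path::...`).
fn is_suppressed(directives: &str, target: &str) -> bool {
    // Remember the most specific (longest) matching directive so that a
    // narrower directive can override a broader one.
    let mut best: Option<(usize, bool)> = None;

    for directive in directives.split(',').map(str::trim).filter(|d| !d.is_empty()) {
        // This sketch only understands `path=level`; bare entries are skipped.
        let Some((path, level)) = directive.split_once('=') else {
            continue;
        };

        let matches = target == path
            || target
                .strip_prefix(path)
                .map_or(false, |rest| rest.starts_with("::"));

        if matches && best.map_or(true, |(len, _)| path.len() >= len) {
            best = Some((path.len(), level.eq_ignore_ascii_case("off")));
        }
    }

    best.map_or(false, |(_, off)| off)
}
```

With the directive "hickory_proto=off", the target hickory_proto::xfer::dns_exchange from the warning above is suppressed, while targets such as litep2p::ipfs::identify are left untouched.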
Count | Level | Triage report |
---|---|---|
232 | warn | 🥩 ran out of peers to request justif #. from num_cache=. num_live=. err=. |
4 | warn | Report .: . to .. Reason: .. Banned, disconnecting. ( Peer disconnected with inflight after backoffs. Banned, disconnecting. ) |
2 | warn | ❌ Error while dialing .: . |
1 | warn | 💔 Error importing block .: . ( Parent block of 0xd7a9…f573 has no associated weight ) |
1 | warn | Report .: . to .. Reason: .. Banned, disconnecting. ( Same block request multiple times. Banned, disconnecting. ) |
Other warnings:
"2024-09-06 15:39:22.641 WARN tokio-runtime-worker litep2p::ipfs::identify: inbound identify substream opened for peer who doesn't exist peer=PeerId(\"12D3KooWRHaoLvJuJptSUgsc1bzXsKToRUR6qS2KW1MVgnJqLpKx\") protocol=/ipfs/id/1.0.0",
"2024-09-07 20:01:07.952 WARN tokio-runtime-worker hickory_proto::xfer::dns_exchange: failed to associate send_message response to the sender",
"2024-09-07 21:43:54.157 WARN tokio-runtime-worker litep2p::transport-manager: unknown connection opened as secondary connection, discarding peer=PeerId(\"12D3KooWGTnNXimfyieaZAeyRDvZLQpFF7Nr9a8bS3oN4yMPQExZ\") connection_id=ConnectionId(2347697) address=\"/ip4/212.224.112.221/tcp/49054/ws/p2p/12D3KooWGTnNXimfyieaZAeyRDvZLQpFF7Nr9a8bS3oN4yMPQExZ\" dial_record=AddressRecord { score: 100, address: \"/ip4/212.224.112.221/tcp/30333/p2p/12D3KooWGTnNXimfyieaZAeyRDvZLQpFF7Nr9a8bS3oN4yMPQExZ\", connection_id: Some(ConnectionId(2347695)) }",
This has resurfaced the "litep2p::transport-manager: unknown connection opened as secondary connection" warning: https://github.com/paritytech/litep2p/issues/172. I have created a new issue for this: https://github.com/paritytech/litep2p/issues/242
Count | Level | Triage report |
---|---|---|
683 | warn | Notification block pinning limit reached. Unpinning block with hash = .* |
11 | warn | Report .: . to .. Reason: .. Banned, disconnecting. ( Not requested block data. Banned, disconnecting. ) |
4 | warn | Report .: . to .. Reason: .. Banned, disconnecting. ( Unsupported protocol. Banned, disconnecting. ) |
2 | warn | Can't listen on . because: . |
1 | warn | Re-finalized block #. (.) in the canonical chain, current best finalized is #.* |
1 | warn | Report .: . to .. Reason: .. Banned, disconnecting. ( Same block request multiple times. Banned, disconnecting. ) |
1 | warn | ❌ Error while dialing .: . |
1 | warn | 💔 Error importing block .: . ( Parent block of 0xd7a9…f573 has no associated weight ) |
Other warnings:
- "2024-09-06 14:13:20.673 WARN tokio-runtime-worker sc_network::service: 💔 The bootnode you want to connect to at `/dns/ksm14.rotko.net/tcp/33224/p2p/12D3KooWAa5THTw8HPfnhEei23HdL8P9McBXdozG2oTtMMksjZkK` provided a different peer ID `12D3KooWDTWSFqWQNqHdrAc2srGsqzK7GMw9RAjFTfUjcka5FEJN` than the one you expect `12D3KooWAa5THTw8HPfnhEei23HdL8P9McBXdozG2oTtMMksjZkK`. ",
- "2024-09-06 22:58:25.843 ERROR tokio-runtime-worker sc_utils::mpsc: The number of unprocessed messages in channel `mpsc-notification-to-protocol-2-beefy` exceeded 100000.",
Manual triaging (until sub-triage-logs gains access to loki):
WARN tokio-runtime-worker babe: 👶 Epoch(s) skipped: from 33226 to 33241
2024-09-07 06:16:30.004 WARN tokio-runtime-worker parachain::dispute-coordinator: error=Runtime(RuntimeRequest(NotSupported { runtime_api_name: "candidate_events" }))
2024-09-07 06:16:30.084 WARN tokio-runtime-worker parachain::runtime-api: cannot query the runtime API version: Api called for an unknown Block: Header was not found in the database: 0x5aaa2a515394a2f9da57ab3ea792808f93822dcdcac76fd9de173776bd9d31ca api="candidate_events"
2024-09-07 06:16:59.828 WARN tokio-runtime-worker parachain::runtime-api: cannot query the runtime API version: Api called for an unknown Block: Header was not found in the database: 0x5aaa2a515394a2f9da57ab3ea792808f93822dcdcac76fd9de173776bd9d31ca api="candidate_events"
Warnings appeared after the versi-net was scaled down from 100 to 20 validators on Saturday, at roughly Sat Sep 7 06:09:42. They continued for around 1h.
This was the first time we introduced scaling in our versi-net testing. I will continue to keep an eye on this and check how libp2p behaves in comparison.
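Manual triage of raw lines like the ones above can be approximated by grouping them on level and target. The following is a minimal sketch, assuming each line has the shape `[date time] LEVEL thread target: message` as in the excerpts; `parse_line` and `count_by_target` are hypothetical helpers and not part of sub-triage-logs.

```rust
use std::collections::BTreeMap;

/// Splits a log line into (level, target),
/// e.g. ("WARN", "parachain::runtime-api").
/// Returns None when the line does not have the assumed shape.
fn parse_line(line: &str) -> Option<(&str, &str)> {
    let mut tokens = line.split_whitespace();
    // Skip the optional date and time tokens until a known level appears.
    let level = tokens.find(|t| matches!(*t, "ERROR" | "WARN" | "INFO" | "DEBUG" | "TRACE"))?;
    // The token after the level is assumed to be the thread name.
    let _thread = tokens.next()?;
    // The target is the next token and ends with a colon.
    let target = tokens.next()?.strip_suffix(':')?;
    Some((level, target))
}

/// Counts how many lines were emitted per (level, target) pair.
/// Lines that cannot be parsed are ignored.
fn count_by_target<'a>(lines: &[&'a str]) -> BTreeMap<(&'a str, &'a str), usize> {
    let mut counts = BTreeMap::new();
    for &line in lines {
        if let Some(key) = parse_line(line) {
            *counts.entry(key).or_insert(0) += 1;
        }
    }
    counts
}
```

Grouping on the target alone is coarser than the pattern-based triage tables, but it is enough to spot which subsystem started emitting warnings after an event such as the scale-down.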
This pull request has been mentioned on Polkadot Forum. There might be relevant details there:
https://forum.polkadot.network/t/litep2p-network-backend-updates/9973/1
This release introduces several new features, improvements, and fixes to the litep2p library. Key updates include enhanced error handling, configurable connection limits, and a new API for managing public addresses.
For a detailed set of changes, see the litep2p changelog.
This PR makes use of:
Warp sync time improvement
Measuring warp sync time is a bit inaccurate since the network is not deterministic and we might end up using faster peers (peers with more resources to handle our requests). However, I did not see warp sync times of 16 minutes; instead, they have roughly stabilized between 8 and 10 minutes.
For measuring warp-sync time, I've used sub-triage-logs.
Litep2p
Libp2p
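A duration such as the warp sync time can be derived from the timestamps of two log lines. The sketch below is a std-only illustration, assuming timestamps of the form `YYYY-MM-DD HH:MM:SS.mmm` as in the log excerpts of this thread. Which log lines mark the start and the end of warp sync is left to the caller, and the helper names are hypothetical; this is independent of how sub-triage-logs computes its numbers.

```rust
/// Days since 1970-01-01 for a Gregorian calendar date.
fn days_from_civil(y: i64, m: i64, d: i64) -> i64 {
    // Treat January and February as months 11 and 12 of the previous year.
    let y = if m <= 2 { y - 1 } else { y };
    let shifted = if y >= 0 { y } else { y - 399 };
    let era = shifted / 400;
    let yoe = y - era * 400; // year of era, 0..=399
    let mp = if m > 2 { m - 3 } else { m + 9 }; // March = 0
    let doy = (153 * mp + 2) / 5 + d - 1; // day of (shifted) year
    let doe = yoe * 365 + yoe / 4 - yoe / 100 + doy; // day of era
    era * 146_097 + doe - 719_468
}

/// Parses "YYYY-MM-DD HH:MM:SS.mmm" into milliseconds since the Unix epoch.
/// Assumes exactly three fractional digits when a fraction is present.
fn parse_millis(ts: &str) -> Option<i64> {
    let (date, time) = ts.trim().split_once(' ')?;
    let mut d = date.split('-').map(|p| p.parse::<i64>());
    let (y, mo, da) = (d.next()?.ok()?, d.next()?.ok()?, d.next()?.ok()?);
    let (hms, frac) = time.split_once('.').unwrap_or((time, "0"));
    let mut t = hms.split(':').map(|p| p.parse::<i64>());
    let (h, mi, s) = (t.next()?.ok()?, t.next()?.ok()?, t.next()?.ok()?);
    let ms = frac.parse::<i64>().ok()?;
    let minutes = (days_from_civil(y, mo, da) * 24 + h) * 60 + mi;
    Some(minutes * 60_000 + s * 1_000 + ms)
}

/// Elapsed seconds between two timestamps, or None if either fails to parse.
fn elapsed_secs(start: &str, end: &str) -> Option<f64> {
    Some((parse_millis(end)? - parse_millis(start)?) as f64 / 1000.0)
}
```

Because the date is part of the calculation, a sync that crosses midnight is still measured correctly. Timestamps are treated as being in the same time zone.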
Closes: https://github.com/paritytech/polkadot-sdk/issues/4986
Low peer count
After exposing the litep2p::public_addresses interface, we can report confirmed external addresses to litep2p. This should mitigate or at least improve https://github.com/paritytech/polkadot-sdk/issues/4925. Will keep the issue around to confirm this.
Improved metrics
We are one step closer to exposing metrics similar to libp2p's: https://github.com/paritytech/polkadot-sdk/issues/4681.
cc @paritytech/networking
Next Steps