libp2p / punchr

🥊 Components to measure Direct Connection Upgrade through Relay (DCUtR) performance.
Apache License 2.0

rust-client: Stalls after several hours #62

Open mxinden opened 1 year ago

mxinden commented 1 year ago

My rust-client, running on a Raspberry Pi B+, stalls after >10h.

The last log lines:

Dec 01 10:18:46 raspberrypi rust-client[21355]: [2022-12-01T09:18:46Z INFO  rust_client] Listening on "/ip6/xxx/udp/35616/quic"
Dec 01 10:18:46 raspberrypi rust-client[21355]: [2022-12-01T09:18:46Z INFO  rust_client] Listening on "/ip6/xxx/tcp/36013"

Before I debug this further, anyone else seeing this on their deployment?

//CC @elenaf9 @thomas-eizinger @jxs

TippyFlitsUK commented 1 year ago

I'm getting rust client crashes with the following:

[2022-12-04T19:48:10Z INFO  rust_client] dial to QmbLHAnMoJPWSCR5Zhtx6BHJX9KiKNN6tpvbUcqanj75Nb failed: Transport([("/ip6/2604:1380:4602:5c00::3/tcp/4001/p2p/QmbLHAnMoJPWSCR5Zhtx6BHJX9KiKNN6tpvbUcqanj75Nb", Other(Custom { kind: Other, error: Transport(B(A(A(B(Os { code: 110, kind: TimedOut, message: "Connection timed out" }))))) }))])
Error: UnknownProtocolId(465)

jxs commented 1 year ago

Mine hasn't stalled after a couple of days, but I do get some crashes with UnknownProtocolId:

INFO  rust_client] Info { public_key: Rsa(PublicKey(PKCS1): 30821a282110d7ab1ee4135d7cecd99635b146155d24f5c8dddfa5d9bc513a92ff37ad797a6135f149c8124c6e2e3fff416d891eaf733f554d5f3bda8c6ac1f92c7ece4fb5abe72ba413ecf298080c1dc7e6b422b385a577f4b191da528c95111c7fed7baac2aaff729b40c248b9eedf5d7691e4bd31a4f79e8312a236c1e902254143e254f9ded1d087254082812b267aaeb699e5f9337ecd86853b5367bd765cc3bf67b8c3cdedd3e799aa23a26fecdf1cc3ee2ae87df33a83ff862f17f931996a3e7811ce044f6db71ec6ecc65306f6051fefd5cdaa65b229a3b8d535c234df8f5ac1237a045c26da5b414c85357962576dba787fa49d962ca446f323101), protocol_version: "ipfs/0.1.0", agent_version: "kubo/0.16.0-rc1/d4ac65f", listen_addrs: ["/ip4/139.178.91.71/tcp/4001", "/ip6/2604:1380:45e3:6e00::1/tcp/4001", "/ip4/139.178.91.71/udp/4001/quic", "/ip6/2604:1380:45e3:6e00::1/udp/4001/quic", "/ip6/64:ff9b::8bb2:5b47/udp/4001/quic"], protocols: ["/p2p/id/delta/1.0.0", "/ipfs/id/1.0.0", "/ipfs/id/push/1.0.0", "/ipfs/ping/1.0.0", "/libp2p/circuit/relay/0.1.0", "/libp2p/circuit/relay/0.2.0/stop", "/ipfs/kad/1.0.0", "/ipfs/lan/kad/1.0.0", "/libp2p/autonat/1.0.0", "/ipfs/bitswap/1.2.0", "/ipfs/bitswap/1.1.0", "/ipfs/bitswap/1.0.0", "/ipfs/bitswap", "/x/", "/libp2p/dcutr", "/libp2p/circuit/relay/0.2.0/hop"], observed_addr: "/ip4/94.62.54.44/tcp/45145" }
Error: UnknownProtocolId(465)

mxinden commented 1 year ago

Error: UnknownProtocolId(465) should be from rust-multiaddr:

https://github.com/multiformats/rust-multiaddr/blob/master/src/errors.rs#L15

More specifically it is the webtransport protocol:

https://github.com/multiformats/multiaddr/blob/master/protocols.csv#L28

The Rust client doesn't support webtransport in the first place, thus fine to ignore the error. Added tracking issue here: https://github.com/multiformats/rust-multiaddr/issues/68

elenaf9 commented 1 year ago

> I'm getting rust client crashes with the following:
>
> [2022-12-04T19:48:10Z INFO  rust_client] dial to QmbLHAnMoJPWSCR5Zhtx6BHJX9KiKNN6tpvbUcqanj75Nb failed: Transport([("/ip6/2604:1380:4602:5c00::3/tcp/4001/p2p/QmbLHAnMoJPWSCR5Zhtx6BHJX9KiKNN6tpvbUcqanj75Nb", Other(Custom { kind: Other, error: Transport(B(A(A(B(Os { code: 110, kind: TimedOut, message: "Connection timed out" }))))) }))])
> Error: UnknownProtocolId(465)

Thanks for reporting @TippyFlitsUK!

> The Rust client doesn't support webtransport in the first place, thus fine to ignore the error. Added tracking issue here: https://github.com/multiformats/rust-multiaddr/issues/68

My bad; right now the rust-client aborts the whole program if parsing a Multiaddr fails. I opened a PR to skip such an address instead: #64.
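
To illustrate the difference between aborting on the first unparseable address and skipping it, here is a minimal, self-contained sketch. It is not the code from the PR and does not use the real rust-multiaddr API: `ParseError`, `check_protocol`, `parse_address`, `parse_all_strict` and `parse_all_lenient` are hypothetical names, and an address is modeled simply as a list of multiaddr protocol codes (465 being webtransport, as noted above).

```rust
/// Hypothetical stand-in for the error seen in the logs above.
#[derive(Debug, PartialEq)]
pub enum ParseError {
    UnknownProtocolId(u32),
}

/// Toy lookup of the protocol codes this sketch pretends to support:
/// ip4 = 4, tcp = 6, ip6 = 41, udp = 273, p2p = 421, quic = 460.
/// webtransport (465) is deliberately missing.
pub fn check_protocol(code: u32) -> Result<u32, ParseError> {
    const KNOWN: [u32; 6] = [4, 6, 41, 273, 421, 460];
    if KNOWN.contains(&code) {
        Ok(code)
    } else {
        Err(ParseError::UnknownProtocolId(code))
    }
}

/// Parse one address, modeled here as a sequence of protocol codes.
pub fn parse_address(codes: &[u32]) -> Result<Vec<u32>, ParseError> {
    codes.iter().map(|&c| check_protocol(c)).collect()
}

/// Strict variant: the first bad address fails the whole batch,
/// which mirrors the abort behaviour described above.
pub fn parse_all_strict(addrs: &[Vec<u32>]) -> Result<Vec<Vec<u32>>, ParseError> {
    addrs.iter().map(|a| parse_address(a)).collect()
}

/// Lenient variant: log and skip addresses that fail to parse,
/// keeping every address that parses cleanly.
pub fn parse_all_lenient(addrs: &[Vec<u32>]) -> Vec<Vec<u32>> {
    addrs
        .iter()
        .filter_map(|a| match parse_address(a) {
            Ok(parsed) => Some(parsed),
            Err(e) => {
                eprintln!("skipping address: {:?}", e);
                None
            }
        })
        .collect()
}
```

With the lenient variant, a peer that advertises one webtransport address next to ordinary TCP and QUIC addresses still contributes its usable addresses instead of taking the client down.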

thomaseizinger commented 1 year ago

Note that if it is running within a systemd service or something similar, it should automatically be restarted.

elenaf9 commented 1 year ago

> My rust-client, running on a Raspberry Pi B+, stalls after >10h.
>
> The last log lines:
>
> Dec 01 10:18:46 raspberrypi rust-client[21355]: [2022-12-01T09:18:46Z INFO  rust_client] Listening on "/ip6/xxx/udp/35616/quic"
> Dec 01 10:18:46 raspberrypi rust-client[21355]: [2022-12-01T09:18:46Z INFO  rust_client] Listening on "/ip6/xxx/tcp/36013"
>
> Before I debug this further, anyone else seeing this on their deployment?
>
> //CC @elenaf9 @thomas-eizinger @jxs

@mxinden Did you do any further debugging on this? I don't see the client itself stalling (I'm using a Raspberry Pi 4 Model B). However, I pipe the output into a file, and I noticed that the output stalls after some time: the client still runs and I can view the results on the dashboard, but the latest entry in the file is more than a day old.
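
For anyone who wants to check their own deployment for the same symptom, the "latest entry is more than a day old" test can be sketched as a comparison of the log file's modification time against a threshold. This is only an illustration under assumptions: `is_stale` and `log_file_is_stale` are hypothetical helpers, not part of the rust-client, and the modification time is used as a proxy for the time of the last log entry.

```rust
use std::time::{Duration, SystemTime};

/// Returns true if more than `max_age` has passed between `last_write` and `now`.
/// If `last_write` lies in the future (for example after a clock adjustment),
/// the log is treated as fresh.
pub fn is_stale(last_write: SystemTime, now: SystemTime, max_age: Duration) -> bool {
    match now.duration_since(last_write) {
        Ok(age) => age > max_age,
        Err(_) => false,
    }
}

/// Checks the modification time of the file at `path` (a path chosen by the
/// caller, e.g. wherever the client output is piped) against the current time.
pub fn log_file_is_stale(path: &str, max_age: Duration) -> std::io::Result<bool> {
    let modified = std::fs::metadata(path)?.modified()?;
    Ok(is_stale(modified, SystemTime::now(), max_age))
}
```

Calling `log_file_is_stale` with a one-day `max_age` from a cron job or a small monitoring binary would flag the situation described here, where the process is still alive but no longer writes output.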

mxinden commented 1 year ago

Still facing the issue. Haven't done any further debugging. Glad I seem to be the only one.

jxs commented 1 year ago

Sorry Max, after seeing your comment I went to check again and noticed that mine also seems to be stalled; there have been no logs for a couple of days:

[2022-12-10T14:03:30Z INFO  rust_client] Received { peer_id: PeerId("12D3KooWGmAkyVtRv6VKQer5JNZorty9sJQifJq6Xa7Nj7tSTzos"), info: Info { public_key: Ed25519(PublicKey(compressed): 67302a3910bc9199ca4d305774c5b6b3fb66833f1490b31620de387048a2), protocol_version: "ipfs/0.1.0", agent_version: "go-ipfs/0.12.2/0e8b121", listen_addrs: ["/ip4/157.90.132.176/tcp/4005", "/ip4/157.90.132.176/tcp/63548", "/ip4/157.90.132.176/udp/14648/quic", "/ip4/157.90.132.176/udp/4005/quic"], protocols: ["/p2p/id/delta/1.0.0", "/ipfs/id/1.0.0", "/ipfs/id/push/1.0.0", "/ipfs/ping/1.0.0", "/libp2p/circuit/relay/0.1.0", "/libp2p/circuit/relay/0.2.0/stop", "/ipfs/lan/kad/1.0.0", "/libp2p/autonat/1.0.0", "/ipfs/bitswap/1.2.0", "/ipfs/bitswap/1.1.0", "/ipfs/bitswap/1.0.0", "/ipfs/bitswap", "/ipfs/kad/1.0.0", "/libp2p/circuit/relay/0.2.0/hop", "/x/"], observed_addr: "/ip4/94.62.54.44/tcp/40871" } }
[2022-12-10T14:03:34Z INFO  rust_client] Outgoing connection error to Some(PeerId("12D3KooWGmAkyVtRv6VKQer5JNZorty9sJQifJq6Xa7Nj7tSTzos")): Transport([("/ip4/157.90.132.176/udp/4005/quic/p2p/12D3KooWGmAkyVtRv6VKQer5JNZorty9sJQifJq6Xa7Nj7tSTzos", Other(Custom { kind: Other, error: Transport(A(HandshakeTimedOut)) }))])
[2022-12-10T14:03:36Z INFO  rust_client] InboundCircuitReqFailed { relay_peer_id: PeerId("12D3KooWGmAkyVtRv6VKQer5JNZorty9sJQifJq6Xa7Nj7tSTzos"), error: Upgrade(Select(Failed)) }
[2022-12-10T14:03:40Z INFO  rust_client] Connection to PeerId("12D3KooWGmAkyVtRv6VKQer5JNZorty9sJQifJq6Xa7Nj7tSTzos") via Dialer { address: "/ip4/157.90.132.176/tcp/4005/p2p/12D3KooWGmAkyVtRv6VKQer5JNZorty9sJQifJq6Xa7Nj7tSTzos", role_override: Dialer } closed
[2022-12-14T00:32:47Z INFO  rust_client] Listening on "/ip6/fe80::9c8e:6cff:fef6:22d9/udp/32938/quic"
[2022-12-14T00:32:47Z INFO  rust_client] Listening on "/ip6/fe80::9c8e:6cff:fef6:22d9/tcp/41119"

thomaseizinger commented 1 year ago

Mine is running without issues. Linux x64, started through a systemd service.

thomaseizinger commented 1 year ago

> Mine is running without issues. Linux x64, started through a systemd service.

That said, mine is running on my desktop, so it is shut down and restarted roughly daily whenever I turn my computer off and on. That may be why I am not experiencing this issue.