libp2p / rust-libp2p

The Rust Implementation of the libp2p networking stack.
https://libp2p.io
MIT License

`libp2p_stream` stream deadlocks if using yamux 0.13 #5410

Open indietyp opened 1 month ago

indietyp commented 1 month ago

Summary

It seems that the yamux multiplexer deadlocks when two streams try to change the window size at the same time. The problem occurs with yamux 0.13, and with yamux 0.12 when WindowUpdateMode::OnRead is used. Switching the multiplexer to mplex, configuring yamux 0.12 with WindowUpdateMode::OnReceive, or switching to QUIC avoids the problem. However, due to constraints in the system I am working on, none of these workarounds is desirable.
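
For reference, here is a minimal sketch of the WindowUpdateMode::OnReceive workaround mentioned above. It assumes a libp2p release in which libp2p_yamux still exposes the (now-deprecated) window-update setter; the function name is purely illustrative.

fn relaxed_yamux_config() -> libp2p::yamux::Config {
    let mut config = libp2p::yamux::Config::default();
    // OnReceive acknowledges received data immediately instead of waiting
    // for the application to read it, which avoids the stall at the cost
    // of potentially unbounded buffering.
    #[allow(deprecated)]
    config.set_window_update_mode(libp2p::yamux::WindowUpdateMode::on_receive());
    // This config would be supplied wherever yamux::Config::default is
    // normally handed to the swarm/transport builder.
    config
}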

The implementation where this behavior has been observed uses libp2p_stream and splits the resulting stream into a read half and a write half. The behavior occurs both when splitting the AsyncRead/AsyncWrite halves directly and when splitting into separate Stream/Sink halves after running the libp2p::Stream through a tokio-util codec.

The problem shows up when we write a large-ish amount of data (about 700 KiB) into the AsyncWrite half: the write's .await never completes and the task is stuck forever.
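
For illustration, a minimal sketch of the failing pattern (the function name and exact payload size are illustrative, not the real application code):

use futures::{AsyncReadExt, AsyncWriteExt};

async fn send_large_payload(stream: libp2p::Stream) -> std::io::Result<()> {
    // Split the stream into an AsyncRead half and an AsyncWrite half.
    let (_read_half, mut write_half) = stream.split();

    // Write roughly 700 KiB; with yamux 0.13 this await never completes
    // once the connection deadlocks.
    let payload = vec![0u8; 700 * 1024];
    write_half.write_all(&payload).await?;
    write_half.flush().await?;

    Ok(())
}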

The Stream/Sink split is done using the following snippet of code:

// Adapt the libp2p Stream (futures::io traits) to tokio's AsyncRead/AsyncWrite.
let stream = stream.compat();
// Buffer reads and writes around the raw stream.
let stream = BufStream::new(stream);
// Frame the byte stream with the application codec.
let stream = Framed::new(stream, ServerCodec::new());

// Split the framed transport into independent Sink (write) and Stream (read) halves.
let (sink, stream) = stream.split();
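
The sink half is then driven roughly like the following sketch (kept generic here, since ServerCodec and its frame type are application-specific):

use futures::{Sink, SinkExt};

async fn send_frame<S, T>(sink: &mut S, frame: T) -> Result<(), S::Error>
where
    S: Sink<T> + Unpin,
{
    // For a large (~700 KiB) frame, this await is where the task gets
    // stuck once the yamux deadlock occurs.
    sink.send(frame).await
}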

Expected behavior

Using the yamux multiplexer does not deadlock.

Actual behavior

Using the yamux multiplexer deadlocks the connection.

Relevant log output

2024-05-21T08:30:40.231871Z  INFO libp2p_swarm: local_peer_id=12D3KooWPAAZgEWWEHyscRFRmP1ZEPidbeCsJULxcKBZHx8pfNNR
2024-05-21T08:30:40.231935Z DEBUG libp2p_core::transport::choice: Failed to listen on address using libp2p_core::transport::dummy::DummyTransport<(libp2p_identity::peer_id::PeerId, libp2p_core::muxing::boxed::StreamMuxerBox)> address=/ip4/127.0.0.1/tcp/0
2024-05-21T08:30:40.231963Z DEBUG libp2p_tcp: listening on address address=127.0.0.1:0
2024-05-21T08:30:40.232097Z DEBUG Swarm::poll: libp2p_swarm: New listener address listener=ListenerId(1) address=/ip4/127.0.0.1/tcp/61693
2024-05-21T08:30:40.232117Z DEBUG harpc_service::transport::task: received swarm event event=NewListenAddr { listener_id: ListenerId(1), address: "/ip4/127.0.0.1/tcp/61693" }
2024-05-21T08:30:40.232127Z  INFO harpc_service::transport::task: listening on address address=/ip4/127.0.0.1/tcp/61693
2024-05-21T08:30:40.232260Z  INFO libp2p_swarm: local_peer_id=12D3KooWQoGVRr1WrH91Vk6gc2w5ujHPppB5GBnFpHANG3Gg9nSc
2024-05-21T08:30:40.232289Z  INFO harpc_service::transport::task: dialing peer address=/ip4/127.0.0.1/tcp/61693
2024-05-21T08:30:40.232302Z DEBUG libp2p_core::transport::choice: Failed to dial address using libp2p_core::transport::dummy::DummyTransport<(libp2p_identity::peer_id::PeerId, libp2p_core::muxing::boxed::StreamMuxerBox)> address=/ip4/127.0.0.1/tcp/61693
2024-05-21T08:30:40.232306Z DEBUG libp2p_tcp: dialing address address=127.0.0.1:61693
2024-05-21T08:30:40.232618Z DEBUG Transport::dial{address=/ip4/127.0.0.1/tcp/61693}: multistream_select::dialer_select: Dialer: Proposed protocol: /noise
2024-05-21T08:30:40.232625Z DEBUG Transport::dial{address=/ip4/127.0.0.1/tcp/61693}: multistream_select::dialer_select: Dialer: Expecting proposed protocol: /noise
2024-05-21T08:30:40.232631Z DEBUG Swarm::poll: libp2p_tcp: Incoming connection from remote at local remote_address=/ip4/127.0.0.1/tcp/61694 local_address=/ip4/127.0.0.1/tcp/61693
2024-05-21T08:30:40.232659Z DEBUG harpc_service::transport::task: received swarm event event=IncomingConnection { connection_id: ConnectionId(2), local_addr: "/ip4/127.0.0.1/tcp/61693", send_back_addr: "/ip4/127.0.0.1/tcp/61694" }
2024-05-21T08:30:40.232793Z DEBUG new_incoming_connection{remote_addr=/ip4/127.0.0.1/tcp/61694 id=2}: multistream_select::listener_select: Listener: confirming protocol: /noise
2024-05-21T08:30:40.232809Z DEBUG new_incoming_connection{remote_addr=/ip4/127.0.0.1/tcp/61694 id=2}: multistream_select::listener_select: Listener: sent confirmed protocol: /noise
2024-05-21T08:30:40.232999Z DEBUG Transport::dial{address=/ip4/127.0.0.1/tcp/61693}: multistream_select::negotiated: Negotiated: Received confirmation for protocol: /noise
2024-05-21T08:30:40.233202Z DEBUG Transport::dial{address=/ip4/127.0.0.1/tcp/61693}: multistream_select::dialer_select: Dialer: Proposed protocol: /yamux/1.0.0
2024-05-21T08:30:40.233206Z DEBUG Transport::dial{address=/ip4/127.0.0.1/tcp/61693}: multistream_select::dialer_select: Dialer: Expecting proposed protocol: /yamux/1.0.0
2024-05-21T08:30:40.233213Z DEBUG Transport::dial{address=/ip4/127.0.0.1/tcp/61693}: yamux::connection: new connection: 7decdc85 (Client)
2024-05-21T08:30:40.233233Z DEBUG Swarm::poll: libp2p_stream::shared: Creating new channel pair peer=12D3KooWPAAZgEWWEHyscRFRmP1ZEPidbeCsJULxcKBZHx8pfNNR connection=1
2024-05-21T08:30:40.233267Z DEBUG Swarm::poll: libp2p_swarm: Connection established peer=12D3KooWPAAZgEWWEHyscRFRmP1ZEPidbeCsJULxcKBZHx8pfNNR endpoint=Dialer { address: "/ip4/127.0.0.1/tcp/61693", role_override: Dialer } total_peers=1
2024-05-21T08:30:40.233280Z DEBUG harpc_service::transport::task: received swarm event event=ConnectionEstablished { peer_id: PeerId("12D3KooWPAAZgEWWEHyscRFRmP1ZEPidbeCsJULxcKBZHx8pfNNR"), connection_id: ConnectionId(1), endpoint: Dialer { address: "/ip4/127.0.0.1/tcp/61693", role_override: Dialer }, num_established: 1, concurrent_dial_errors: Some([]), established_in: 882.5µs }
2024-05-21T08:30:40.233305Z DEBUG new_established_connection{remote_addr=/ip4/127.0.0.1/tcp/61693 id=1 peer=12D3KooWPAAZgEWWEHyscRFRmP1ZEPidbeCsJULxcKBZHx8pfNNR}:Connection::poll: yamux::connection::rtt: sending ping 2874317839
2024-05-21T08:30:40.233327Z DEBUG new_established_connection{remote_addr=/ip4/127.0.0.1/tcp/61693 id=1 peer=12D3KooWPAAZgEWWEHyscRFRmP1ZEPidbeCsJULxcKBZHx8pfNNR}:Connection::poll: yamux::connection: 7decdc85: new outbound (Stream 7decdc85/1) of (Connection 7decdc85 Client (streams 0))
2024-05-21T08:30:40.233336Z DEBUG new_established_connection{remote_addr=/ip4/127.0.0.1/tcp/61693 id=1 peer=12D3KooWPAAZgEWWEHyscRFRmP1ZEPidbeCsJULxcKBZHx8pfNNR}:Connection::poll: multistream_select::dialer_select: Dialer: Proposed protocol: /ipfs/ping/1.0.0
2024-05-21T08:30:40.233346Z DEBUG new_established_connection{remote_addr=/ip4/127.0.0.1/tcp/61693 id=1 peer=12D3KooWPAAZgEWWEHyscRFRmP1ZEPidbeCsJULxcKBZHx8pfNNR}:Connection::poll: yamux::connection: 7decdc85: new outbound (Stream 7decdc85/3) of (Connection 7decdc85 Client (streams 1))
2024-05-21T08:30:40.233351Z DEBUG new_established_connection{remote_addr=/ip4/127.0.0.1/tcp/61693 id=1 peer=12D3KooWPAAZgEWWEHyscRFRmP1ZEPidbeCsJULxcKBZHx8pfNNR}:Connection::poll: multistream_select::dialer_select: Dialer: Proposed protocol: /ipfs/id/1.0.0
2024-05-21T08:30:40.233426Z DEBUG new_incoming_connection{remote_addr=/ip4/127.0.0.1/tcp/61694 id=2}: multistream_select::listener_select: Listener: confirming protocol: /yamux/1.0.0
2024-05-21T08:30:40.233443Z DEBUG new_incoming_connection{remote_addr=/ip4/127.0.0.1/tcp/61694 id=2}: multistream_select::listener_select: Listener: sent confirmed protocol: /yamux/1.0.0
2024-05-21T08:30:40.233450Z DEBUG new_incoming_connection{remote_addr=/ip4/127.0.0.1/tcp/61694 id=2}: yamux::connection: new connection: 796f5a01 (Server)
2024-05-21T08:30:40.233471Z DEBUG Swarm::poll: libp2p_stream::shared: Creating new channel pair peer=12D3KooWQoGVRr1WrH91Vk6gc2w5ujHPppB5GBnFpHANG3Gg9nSc connection=2
2024-05-21T08:30:40.233483Z DEBUG new_established_connection{remote_addr=/ip4/127.0.0.1/tcp/61693 id=1 peer=12D3KooWPAAZgEWWEHyscRFRmP1ZEPidbeCsJULxcKBZHx8pfNNR}:Connection::poll: multistream_select::negotiated: Negotiated: Received confirmation for protocol: /yamux/1.0.0
2024-05-21T08:30:40.233542Z DEBUG Swarm::poll: libp2p_swarm: Connection established peer=12D3KooWQoGVRr1WrH91Vk6gc2w5ujHPppB5GBnFpHANG3Gg9nSc endpoint=Listener { local_addr: "/ip4/127.0.0.1/tcp/61693", send_back_addr: "/ip4/127.0.0.1/tcp/61694" } total_peers=1
2024-05-21T08:30:40.233556Z DEBUG harpc_service::transport::task: received swarm event event=ConnectionEstablished { peer_id: PeerId("12D3KooWQoGVRr1WrH91Vk6gc2w5ujHPppB5GBnFpHANG3Gg9nSc"), connection_id: ConnectionId(2), endpoint: Listener { local_addr: "/ip4/127.0.0.1/tcp/61693", send_back_addr: "/ip4/127.0.0.1/tcp/61694" }, num_established: 1, concurrent_dial_errors: None, established_in: 810.375µs }
2024-05-21T08:30:40.233577Z DEBUG new_established_connection{remote_addr=/ip4/127.0.0.1/tcp/61694 id=2 peer=12D3KooWQoGVRr1WrH91Vk6gc2w5ujHPppB5GBnFpHANG3Gg9nSc}:Connection::poll: yamux::connection::rtt: sending ping 3820378387
2024-05-21T08:30:40.233600Z DEBUG new_established_connection{remote_addr=/ip4/127.0.0.1/tcp/61694 id=2 peer=12D3KooWQoGVRr1WrH91Vk6gc2w5ujHPppB5GBnFpHANG3Gg9nSc}:Connection::poll: yamux::connection: 796f5a01: new outbound (Stream 796f5a01/2) of (Connection 796f5a01 Server (streams 1))
2024-05-21T08:30:40.233610Z DEBUG new_established_connection{remote_addr=/ip4/127.0.0.1/tcp/61694 id=2 peer=12D3KooWQoGVRr1WrH91Vk6gc2w5ujHPppB5GBnFpHANG3Gg9nSc}:Connection::poll: multistream_select::dialer_select: Dialer: Proposed protocol: /ipfs/ping/1.0.0
2024-05-21T08:30:40.233619Z DEBUG new_established_connection{remote_addr=/ip4/127.0.0.1/tcp/61694 id=2 peer=12D3KooWQoGVRr1WrH91Vk6gc2w5ujHPppB5GBnFpHANG3Gg9nSc}:Connection::poll: yamux::connection: 796f5a01: new outbound (Stream 796f5a01/4) of (Connection 796f5a01 Server (streams 3))
2024-05-21T08:30:40.233624Z DEBUG new_established_connection{remote_addr=/ip4/127.0.0.1/tcp/61694 id=2 peer=12D3KooWQoGVRr1WrH91Vk6gc2w5ujHPppB5GBnFpHANG3Gg9nSc}:Connection::poll: multistream_select::dialer_select: Dialer: Proposed protocol: /ipfs/id/1.0.0
2024-05-21T08:30:40.233638Z DEBUG new_established_connection{remote_addr=/ip4/127.0.0.1/tcp/61694 id=2 peer=12D3KooWQoGVRr1WrH91Vk6gc2w5ujHPppB5GBnFpHANG3Gg9nSc}:Connection::poll: multistream_select::listener_select: Listener: confirming protocol: /ipfs/ping/1.0.0 
2024-05-21T08:30:40.233642Z DEBUG new_established_connection{remote_addr=/ip4/127.0.0.1/tcp/61694 id=2 peer=12D3KooWQoGVRr1WrH91Vk6gc2w5ujHPppB5GBnFpHANG3Gg9nSc}:Connection::poll: multistream_select::listener_select: Listener: sent confirmed protocol: /ipfs/ping/1.0.0
2024-05-21T08:30:40.233652Z DEBUG new_established_connection{remote_addr=/ip4/127.0.0.1/tcp/61693 id=1 peer=12D3KooWPAAZgEWWEHyscRFRmP1ZEPidbeCsJULxcKBZHx8pfNNR}:Connection::poll: yamux::connection::rtt: received pong 2874317839, estimated round-trip-time 346.375µs
2024-05-21T08:30:40.233667Z DEBUG new_established_connection{remote_addr=/ip4/127.0.0.1/tcp/61694 id=2 peer=12D3KooWQoGVRr1WrH91Vk6gc2w5ujHPppB5GBnFpHANG3Gg9nSc}:Connection::poll: multistream_select::listener_select: Listener: confirming protocol: /ipfs/id/1.0.0   
2024-05-21T08:30:40.233670Z DEBUG new_established_connection{remote_addr=/ip4/127.0.0.1/tcp/61693 id=1 peer=12D3KooWPAAZgEWWEHyscRFRmP1ZEPidbeCsJULxcKBZHx8pfNNR}:Connection::poll: multistream_select::listener_select: Listener: confirming protocol: /ipfs/ping/1.0.0 
2024-05-21T08:30:40.233683Z DEBUG new_established_connection{remote_addr=/ip4/127.0.0.1/tcp/61693 id=1 peer=12D3KooWPAAZgEWWEHyscRFRmP1ZEPidbeCsJULxcKBZHx8pfNNR}:Connection::poll: multistream_select::listener_select: Listener: sent confirmed protocol: /ipfs/ping/1.0.0
2024-05-21T08:30:40.233674Z DEBUG new_established_connection{remote_addr=/ip4/127.0.0.1/tcp/61694 id=2 peer=12D3KooWQoGVRr1WrH91Vk6gc2w5ujHPppB5GBnFpHANG3Gg9nSc}:Connection::poll: multistream_select::listener_select: Listener: sent confirmed protocol: /ipfs/id/1.0.0
2024-05-21T08:30:40.233723Z DEBUG new_established_connection{remote_addr=/ip4/127.0.0.1/tcp/61693 id=1 peer=12D3KooWPAAZgEWWEHyscRFRmP1ZEPidbeCsJULxcKBZHx8pfNNR}:Connection::poll: multistream_select::listener_select: Listener: confirming protocol: /ipfs/id/1.0.0   
2024-05-21T08:30:40.233728Z DEBUG new_established_connection{remote_addr=/ip4/127.0.0.1/tcp/61693 id=1 peer=12D3KooWPAAZgEWWEHyscRFRmP1ZEPidbeCsJULxcKBZHx8pfNNR}:Connection::poll: multistream_select::listener_select: Listener: sent confirmed protocol: /ipfs/id/1.0.0
2024-05-21T08:30:40.233747Z DEBUG new_established_connection{remote_addr=/ip4/127.0.0.1/tcp/61694 id=2 peer=12D3KooWQoGVRr1WrH91Vk6gc2w5ujHPppB5GBnFpHANG3Gg9nSc}:Connection::poll: yamux::connection::rtt: received pong 3820378387, estimated round-trip-time 169.708µs
2024-05-21T08:30:40.233766Z DEBUG harpc_service::transport::task: received swarm event event=Behaviour(TransportBehaviourEvent: Sent { peer_id: PeerId("12D3KooWQoGVRr1WrH91Vk6gc2w5ujHPppB5GBnFpHANG3Gg9nSc") })
2024-05-21T08:30:40.233780Z DEBUG new_established_connection{remote_addr=/ip4/127.0.0.1/tcp/61694 id=2 peer=12D3KooWQoGVRr1WrH91Vk6gc2w5ujHPppB5GBnFpHANG3Gg9nSc}:Connection::poll: multistream_select::dialer_select: Dialer: Received confirmation for protocol: /ipfs/ping/1.0.0
2024-05-21T08:30:40.233801Z DEBUG harpc_service::transport::task: received swarm event event=Behaviour(TransportBehaviourEvent: Sent { peer_id: PeerId("12D3KooWPAAZgEWWEHyscRFRmP1ZEPidbeCsJULxcKBZHx8pfNNR") })
2024-05-21T08:30:40.233808Z DEBUG new_established_connection{remote_addr=/ip4/127.0.0.1/tcp/61693 id=1 peer=12D3KooWPAAZgEWWEHyscRFRmP1ZEPidbeCsJULxcKBZHx8pfNNR}:Connection::poll: multistream_select::dialer_select: Dialer: Received confirmation for protocol: /ipfs/ping/1.0.0
2024-05-21T08:30:40.233822Z DEBUG new_established_connection{remote_addr=/ip4/127.0.0.1/tcp/61693 id=1 peer=12D3KooWPAAZgEWWEHyscRFRmP1ZEPidbeCsJULxcKBZHx8pfNNR}:Connection::poll: multistream_select::dialer_select: Dialer: Received confirmation for protocol: /ipfs/id/1.0.0
2024-05-21T08:30:40.233838Z DEBUG new_established_connection{remote_addr=/ip4/127.0.0.1/tcp/61694 id=2 peer=12D3KooWQoGVRr1WrH91Vk6gc2w5ujHPppB5GBnFpHANG3Gg9nSc}:Connection::poll: multistream_select::dialer_select: Dialer: Received confirmation for protocol: /ipfs/id/1.0.0
2024-05-21T08:30:40.233875Z DEBUG harpc_service::transport::task: received swarm event event=Behaviour(TransportBehaviourEvent: Received { peer_id: PeerId("12D3KooWQoGVRr1WrH91Vk6gc2w5ujHPppB5GBnFpHANG3Gg9nSc"), info: Info { public_key: PublicKey { publickey: Ed25519(PublicKey(compressed): de974d995bb01e9877b71ded6db4192831eb9da24eada9d2e62f8d9ae71931) }, protocol_version: "/harpc/1.0.0", agent_version: "rust-libp2p/0.44.2", listen_addrs: [], protocols: ["/ipfs/id/1.0.0", "/ipfs/ping/1.0.0", "/ipfs/id/push/1.0.0"], observed_addr: "/ip4/127.0.0.1/tcp/61693" } })
2024-05-21T08:30:40.233901Z DEBUG harpc_service::transport::task: received swarm event event=NewExternalAddrCandidate { address: "/ip4/127.0.0.1/tcp/61693" }
2024-05-21T08:30:40.233877Z DEBUG harpc_service::transport::task: received swarm event event=Behaviour(TransportBehaviourEvent: Received { peer_id: PeerId("12D3KooWPAAZgEWWEHyscRFRmP1ZEPidbeCsJULxcKBZHx8pfNNR"), info: Info { public_key: PublicKey { publickey: Ed25519(PublicKey(compressed): c63a6cc32d4e23a74fe264e19267161dffaae6b24217677dc37d65dccbce) }, protocol_version: "/harpc/1.0.0", agent_version: "rust-libp2p/0.44.2", listen_addrs: ["/ip4/127.0.0.1/tcp/61693"], protocols: ["/ipfs/id/1.0.0", "/harpc/1.0.0", "/ipfs/id/push/1.0.0", "/ipfs/ping/1.0.0"], observed_addr: "/ip4/127.0.0.1/tcp/61694" } })
2024-05-21T08:30:40.233916Z DEBUG harpc_service::transport::task: received swarm event event=NewExternalAddrOfPeer { peer_id: PeerId("12D3KooWPAAZgEWWEHyscRFRmP1ZEPidbeCsJULxcKBZHx8pfNNR"), address: "/ip4/127.0.0.1/tcp/61693" }
2024-05-21T08:30:40.233927Z  INFO harpc_service::transport::task: discovered external address of peer peer_id=12D3KooWPAAZgEWWEHyscRFRmP1ZEPidbeCsJULxcKBZHx8pfNNR address=/ip4/127.0.0.1/tcp/61693
2024-05-21T08:30:40.233939Z DEBUG harpc_service::transport::task: received swarm event event=NewExternalAddrCandidate { address: "/ip4/127.0.0.1/tcp/61694" }
2024-05-21T08:30:40.233955Z DEBUG new_established_connection{remote_addr=/ip4/127.0.0.1/tcp/61694 id=2 peer=12D3KooWQoGVRr1WrH91Vk6gc2w5ujHPppB5GBnFpHANG3Gg9nSc}:Connection::poll: libp2p_ping::handler: ping succeeded rtt=156.292µs
2024-05-21T08:30:40.233963Z DEBUG new_established_connection{remote_addr=/ip4/127.0.0.1/tcp/61693 id=1 peer=12D3KooWPAAZgEWWEHyscRFRmP1ZEPidbeCsJULxcKBZHx8pfNNR}:Connection::poll: libp2p_ping::handler: ping succeeded rtt=141.458µs
2024-05-21T08:30:40.233972Z DEBUG harpc_service::transport::task: received swarm event event=Behaviour(TransportBehaviourEvent: Event { peer: PeerId("12D3KooWPAAZgEWWEHyscRFRmP1ZEPidbeCsJULxcKBZHx8pfNNR"), connection: ConnectionId(1), result: Ok(141.458µs) })
2024-05-21T08:30:40.233956Z DEBUG libp2p_stream::control: Requesting new stream peer=12D3KooWPAAZgEWWEHyscRFRmP1ZEPidbeCsJULxcKBZHx8pfNNR
2024-05-21T08:30:40.233988Z DEBUG libp2p_stream::shared: Returning sender to existing connection
2024-05-21T08:30:40.233973Z DEBUG harpc_service::transport::task: received swarm event event=Behaviour(TransportBehaviourEvent: Event { peer: PeerId("12D3KooWQoGVRr1WrH91Vk6gc2w5ujHPppB5GBnFpHANG3Gg9nSc"), connection: ConnectionId(2), result: Ok(156.292µs) })
2024-05-21T08:30:40.234011Z DEBUG new_established_connection{remote_addr=/ip4/127.0.0.1/tcp/61693 id=1 peer=12D3KooWPAAZgEWWEHyscRFRmP1ZEPidbeCsJULxcKBZHx8pfNNR}:Connection::poll: yamux::connection: 7decdc85: new outbound (Stream 7decdc85/5) of (Connection 7decdc85 Client (streams 2))
2024-05-21T08:30:40.234019Z DEBUG new_established_connection{remote_addr=/ip4/127.0.0.1/tcp/61693 id=1 peer=12D3KooWPAAZgEWWEHyscRFRmP1ZEPidbeCsJULxcKBZHx8pfNNR}:Connection::poll: multistream_select::dialer_select: Dialer: Proposed protocol: /harpc/1.0.0
2024-05-21T08:30:40.234085Z DEBUG new_established_connection{remote_addr=/ip4/127.0.0.1/tcp/61694 id=2 peer=12D3KooWQoGVRr1WrH91Vk6gc2w5ujHPppB5GBnFpHANG3Gg9nSc}:Connection::poll: multistream_select::listener_select: Listener: confirming protocol: /harpc/1.0.0
2024-05-21T08:30:40.234091Z DEBUG new_established_connection{remote_addr=/ip4/127.0.0.1/tcp/61694 id=2 peer=12D3KooWQoGVRr1WrH91Vk6gc2w5ujHPppB5GBnFpHANG3Gg9nSc}:Connection::poll: multistream_select::listener_select: Listener: sent confirmed protocol: /harpc/1.0.0 
2024-05-21T08:30:40.234159Z DEBUG new_established_connection{remote_addr=/ip4/127.0.0.1/tcp/61693 id=1 peer=12D3KooWPAAZgEWWEHyscRFRmP1ZEPidbeCsJULxcKBZHx8pfNNR}:Connection::poll: multistream_select::dialer_select: Dialer: Received confirmation for protocol: /harpc/1.0.0
2024-05-21T08:30:40.235088Z DEBUG harpc_service::session::gc: running garbage collector
2024-05-21T08:30:40.235094Z DEBUG harpc_service::session::gc: running garbage collector
2024-05-21T08:30:40.235739Z DEBUG yamux::connection::stream::flow_control: old window_max: 0.25 mb, new window_max: 0.5 mb
2024-05-21T08:30:40.238609Z DEBUG yamux::connection::stream::flow_control: old window_max: 0.25 mb, new window_max: 0.5 mb

After this, the connection gets stuck and no further progress is made on it, although libp2p_ping and yamux::connection::rtt appear to keep working and making progress.

Possible Solution

No response

Version

libp2p = { version = "0.53.2", features = ["metrics", "macros", "tcp", "noise", "yamux", "ping", "tokio", "identify", "quic"] }
libp2p-stream = "0.1.0-alpha"

Would you like to work on fixing this bug?

No

dariusc93 commented 1 month ago

Hey! Thanks for the bug report. I noticed this last night when transmitting payloads of more than 400 KiB (though 512 KiB or more can trigger it), with or without splitting the stream. I haven't looked into it further myself, but it does appear to happen only with libp2p-stream and not with other behaviours. You did mention it happening with yamux 0.13, or on 0.12 with WindowUpdateMode::OnRead, so it might be a combination of both?

/CC @mxinden

indietyp commented 1 month ago

I think I retraced the issue yesterday. It seems to have something to do with https://docs.rs/libp2p-yamux/latest/libp2p_yamux/struct.WindowUpdateMode.html#method.on_read (see the warning there). The possibility of this deadlock was presumably removed in https://github.com/libp2p/rust-yamux/pull/177, but maybe that wasn't the complete story? It might also be related to something else entirely.

dariusc93 commented 1 month ago

> I think I retraced the issue yesterday. It seems to have something to do with https://docs.rs/libp2p-yamux/latest/libp2p_yamux/struct.WindowUpdateMode.html#method.on_read (see the warning there). The possibility of this deadlock was presumably removed in libp2p/rust-yamux#177, but maybe that wasn't the complete story? It might also be related to something else entirely.

Hmm, very possible. I don't think I've ever experienced a deadlock with yamux, so it's hard to say. I know that for WindowUpdateMode::on_read there was a change prior to https://github.com/libp2p/rust-yamux/pull/177 that was supposed to make the deadlock warning irrelevant. I may try to reproduce it outside of libp2p-stream.

diegomrsantos commented 2 weeks ago

@dariusc93 could you please take a look at https://github.com/libp2p/rust-yamux/pull/193 ?