libp2p / py-libp2p

The Python implementation of the libp2p networking stack 🐍 [under development]
https://libp2p.io
Other
481 stars 105 forks source link

Infinite exception loop after some time #319

Open moshemalawach opened 5 years ago

moshemalawach commented 5 years ago

When I run the app for a while, with a lot of connect, disconnect and stream closed, I run into an infinite exception loop there:

alephtest_1     |   File "/usr/local/lib/python3.7/asyncio/streams.py", line 339, in drain
alephtest_1     |     raise exc
alephtest_1     |   File "/app/py-libp2p/libp2p/stream_muxer/mplex/mplex.py", line 172, in write_to_stream
alephtest_1     |     await self.secured_conn.write(_bytes)
alephtest_1     |   File "/app/py-libp2p/libp2p/security/insecure/transport.py", line 38, in write
alephtest_1     |     await self.conn.write(data)
alephtest_1     |   File "/app/py-libp2p/libp2p/network/connection/raw_connection.py", line 39, in write
alephtest_1     |     await self.writer.drain()
alephtest_1     |   File "/usr/local/lib/python3.7/asyncio/streams.py", line 339, in drain
alephtest_1     |     raise exc
alephtest_1     |   File "/app/py-libp2p/libp2p/stream_muxer/mplex/mplex.py", line 172, in write_to_stream
alephtest_1     |     await self.secured_conn.write(_bytes)
alephtest_1     |   File "/app/py-libp2p/libp2p/security/insecure/transport.py", line 38, in write
alephtest_1     |     await self.conn.write(data)
alephtest_1     |   File "/app/py-libp2p/libp2p/network/connection/raw_connection.py", line 39, in write
alephtest_1     |     await self.writer.drain()
alephtest_1     |   File "/usr/local/lib/python3.7/asyncio/streams.py", line 339, in drain
alephtest_1     |     raise exc
alephtest_1     |   File "/app/py-libp2p/libp2p/stream_muxer/mplex/mplex.py", line 172, in write_to_stream
alephtest_1     |     await self.secured_conn.write(_bytes)
alephtest_1     |   File "/app/py-libp2p/libp2p/security/insecure/transport.py", line 38, in write
alephtest_1     |     await self.conn.write(data)
alephtest_1     |   File "/app/py-libp2p/libp2p/network/connection/raw_connection.py", line 39, in write
alephtest_1     |     await self.writer.drain()
alephtest_1     |   File "/usr/local/lib/python3.7/asyncio/streams.py", line 339, in drain
alephtest_1     |     raise exc
mhchia commented 5 years ago

@moshemalawach Thanks for the report! However, so far I couldn't tell the cause of this issue. Could you provide what commit are you on? Also, that will be great if there is a code snippet which can reproduce the issue.

moshemalawach commented 5 years ago

I am on the last of master: ec015b5a00d13522ff763fb636c3882a9008af67

It's a bit hard, as it looks like a race condition between my protocol and the gossipsub one. I have found a way to have it stop by doing a stream.reset() anytime I get a StreamError on a stream.

But I now have this issue very often (so could be linked. redacted ip from it):


2019-10-01 10:12:27 [ERROR] P2P.peers: Can't publish alive message
Traceback (most recent call last):
  File "/home/jon/virtualenvs/lp2p/py-libp2p/libp2p/network/stream/net_stream.py", line 59, in write
    return await self.muxed_stream.write(data)
  File "/home/jon/virtualenvs/lp2p/py-libp2p/libp2p/stream_muxer/mplex/mplex_stream.py", line 145, in write
    raise MplexStreamClosed(f"cannot write to closed stream: data={data}")
libp2p.stream_muxer.mplex.exceptions.MplexStreamClosed: cannot write to closed stream: data=b'\xd0\x01\x12\xcd\x01\n"\x12 \xa7J\xb1\xf0\xcaj\x84@\xe1\xa27\xe2\x0bv\x15t;\xe2\x13!\x0e\x9c\x8d\x13{\xb0[\x06\x1bA\xfcY\x12\x95\x01{"address":"/ip4/*.*.*.*/tcp/4010/p2p/QmZbgrb4z1aqKVsi32haE6JbQDqq49mEbW7VKdFCPQDagG","interests":null,"version":"0.0.post0.dev5+g8f7c4e8.dirty"}\x1a\x08\x15\xc9E\x0b\xe5\xfc\xb3\xc7"\x05ALIVE'

Traceback (most recent call last):
  File "/home/jon/virtualenvs/nuls/pyaleph/src/aleph/services/p2p/peers.py", line 26, in publish_host
    await psub.publish(topic, msg)
  File "/home/jon/virtualenvs/lp2p/py-libp2p/libp2p/pubsub/pubsub.py", line 410, in publish
    await self.push_msg(self.host.get_id(), msg)
  File "/home/jon/virtualenvs/lp2p/py-libp2p/libp2p/pubsub/pubsub.py", line 482, in push_msg
    await self.router.publish(msg_forwarder, msg)
  File "/home/jon/virtualenvs/lp2p/py-libp2p/libp2p/pubsub/floodsub.py", line 89, in publish
    await stream.write(encode_varint_prefixed(rpc_msg.SerializeToString()))
  File "/home/jon/virtualenvs/lp2p/py-libp2p/libp2p/network/stream/net_stream.py", line 61, in write
    raise StreamClosed from error
libp2p.network.stream.exceptions.StreamClosed
moshemalawach commented 5 years ago

Could be linked to #306

mhchia commented 5 years ago

But I now have this issue very often (so could be linked. redacted ip from it):

From the logs, I can only tell the stream is closed already, which means either this stream has been closed(our side has closed the write end) or reset(stream.reset is called by either us or the other side). I will investigate this issue, probably from #306 which you suggested. 😃