Open rklaehn opened 5 years ago
Any recommendations on how to debug this? Add logging? Ask the DHT directly? We can do this the next time we see this....
This looks like https://github.com/libp2p/go-libp2p-pubsub/issues/128.
Actually, does disconnecting and reconnecting not work?
Regardless, next time you see this, please take a goroutine dump:
curl 'http://127.0.0.1:5001/debug/pprof/goroutine?debug=2'
That way we can check for any obvious deadlocks.
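For anyone collecting the same dump, here is a minimal sketch of saving it and picking out the goroutines that have been blocked for a minute or longer, which is usually the first place to look for a deadlock. The function names `dump_goroutines` and `long_blocked` are made up for illustration; the only assumption about the daemon is the API address already used in the `curl` command above.

```shell
#!/bin/sh
# Hypothetical helpers, not part of ipfs.

# Save a full goroutine dump from the local daemon to the file named in $1.
dump_goroutines() {
  curl -s 'http://127.0.0.1:5001/debug/pprof/goroutine?debug=2' -o "$1"
}

# Print the header lines of goroutines whose wait time is reported in minutes.
# In a debug=2 dump, such headers look like:
#   goroutine 42 [chan receive, 12 minutes]:
long_blocked() {
  grep -E '^goroutine [0-9]+ \[.*[0-9]+ minutes.*]:' "$1"
}

# Usage (needs a running daemon):
#   dump_goroutines goroutine.txt
#   long_blocked goroutine.txt
```

A long wait is not proof of a deadlock, since many goroutines legitimately sit idle, but the filtered list is much shorter to read than the full dump.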
We were finally able to get this info. goroutine.txt
IPFS version is 0.4.19, x86 32-bit on Android.
I don't see any obvious deadlocks. Can you run ipfs swarm peers --streams
(and tell me which peer IDs you expect to participate in pubsub)?
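To narrow that output down, a rough sketch of a filter that lists only the peers with an open pubsub stream might look like the following. It assumes that `ipfs swarm peers --streams` prints each peer address on an unindented line followed by one indented line per open stream protocol, and that pubsub streams use protocol IDs starting with `/meshsub/` or `/floodsub/`; check both against your own output before relying on it. The function name `pubsub_stream_peers` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads `ipfs swarm peers --streams` output on stdin and
# prints the address of every peer that has at least one pubsub stream open.
# ASSUMPTION: address lines are unindented, stream protocol lines are indented.
pubsub_stream_peers() {
  awk '
    /^[^ \t]/ { addr = $0; next }           # unindented line: a peer address
    /^[ \t]+\/(meshsub|floodsub)\// {       # indented line: a pubsub protocol
      if (addr != "" && !(addr in seen)) { seen[addr] = 1; print addr }
    }
  '
}

# Usage (needs a running daemon):
#   ipfs swarm peers --streams | pubsub_stream_peers
```

If a peer you expect to be in the topic is connected but missing from this list, the connection exists while the pubsub stream does not.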
@rklaehn are you still seeing this problem?
I am seeing this on the latest RC of 0.12.
The problem is exactly as described. One node is subscribed to the topic, but the other node does not have it in its peer list for a long time (1-2 minutes). Then the node appears in the peer list, gets sent a few messages, and drops from the peer list again.
It's so unreliable that it's not really usable.
@vans163
- how are you subscribing to the topic? (ipfs in commandline, or via JS-based client in a web browser?)
- how many topics do you use?
# node1
ipfs pubsub sub foo1
# node2
ipfs pubsub peers foo1
ipfs pubsub pub foo1 hi
# worked
# wait a few minutes
ipfs pubsub peers foo1
# no peer in the list, so we manually connect to the node
ipfs swarm connect /ipfs/12D3KooWPsiXD2DNPAePYHEDoABuoovDtnRCwW8Jfjk3eab496gi
# works now
ipfs pubsub peers foo1
It would receive messages for 1-2 minutes, then drop the peer from the peer list. I am using --profile server, and both peers are on public-facing IPs (no hole punching, router, or firewall involved) in datacenters across the globe.
It's just randomly getting dropped.
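Until the root cause is found, the manual reconnect above can be automated. The following is only a workaround sketch, not a fix: it checks whether an expected peer shows up in `ipfs pubsub peers` and re-runs `ipfs swarm connect` when it does not. The function names `peer_listed` and `ensure_peer` and the 30-second interval in the usage note are hypothetical; the topic and peer address are the ones from the commands above.

```shell
#!/bin/sh
# Hypothetical helpers, not part of ipfs.

# Succeeds if the peer ID in $1 appears as a whole line on stdin.
peer_listed() {
  grep -qxF "$1"
}

# ensure_peer TOPIC PEER_ID PEER_ADDR
# Reconnects to PEER_ADDR when PEER_ID is missing from the topic's peer list.
ensure_peer() {
  if ! ipfs pubsub peers "$1" | peer_listed "$2"; then
    ipfs swarm connect "$3"
  fi
}

# Usage (needs a running daemon), checking every 30 seconds:
#   PEER=12D3KooWPsiXD2DNPAePYHEDoABuoovDtnRCwW8Jfjk3eab496gi
#   while true; do ensure_peer foo1 "$PEER" "/ipfs/$PEER"; sleep 30; done
```

Matching the whole line with `grep -x` avoids treating one peer ID as present just because it is a substring of another.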
Version information:
Type: bug
Description:
We have a number of nodes communicating via pubsub. They are a mixture of 0.4.17 and 0.4.18. Sometimes a node goes into a state where pubsub stops working. On node A there is plenty of pubsub traffic on topic <topic>. On node B, which is a peer to A, pubsub on the same topic is silent. ipfs pubsub peers is empty and remains empty even when trying ipfs pubsub sub --discover <topic>. The only thing that gets out of this state is to restart the ipfs daemon.