libp2p / go-libp2p-pubsub

The PubSub implementation for go-libp2p
https://github.com/libp2p/specs/tree/master/pubsub
Other
328 stars 186 forks source link

NewStream failure right after startup #546

Closed cpacia closed 4 months ago

cpacia commented 1 year ago

This took me forever debug, but here's the issue. If you start up two nodes and immediately join a topic they will fail discover each other in the pubsub. Here's why:

Node1

Node2

Nodes 1 and 2 never connect :(

Fortunately I was able to come up with a fix that doesn't require any changes in libp2p. Basically detect the protocolupdate event and force it think it received a new connection:

        protocolUpdatedSub, err := host.EventBus().Subscribe(new(event.EvtPeerProtocolsUpdated))
    if err != nil {
        return nil, err
    }
    go func(sub event.Subscription) {
        for {
            select {
            case evt, ok := <-sub.Out():
                if !ok {
                    return
                }
                var updated bool
                for _, proto := range evt.(event.EvtPeerProtocolsUpdated).Added {
                    if proto == cfg.params.ProtocolPrefix+pubsub.GossipSubID_v11 {
                        updated = true
                        break
                    }
                }
                if updated {
                    for _, c := range host.Network().ConnsToPeer(evt.(event.EvtPeerProtocolsUpdated).Peer) {
                        (*pubsub.PubSubNotif)(ps).Connected(host.Network(), c)
                    }
                }
            }
        }
    }(protocolUpdatedSub)

But it would be nice if this was built in as it's definitely not expected behavior.

cpacia commented 1 year ago

Looks like there is an open PR attempting to address this issue.

master255 commented 4 months ago

Thank you for the tip! I ended up fixing this problem by adding a line: ps, err := pubsub.NewGossipSub(ctx, h) right after creating the host and before initializing the DHT. This registers compatible protocols. Now everything works right away.

@vyzo @Stebalien But I also think programmers should be able to start NewGossipSub (and its processes) after peers connect to each other. Now I start NewGossipSub even if the user doesn't need it at all. This should be fixed!

Maybe we shouldn't start a NewGossipSub loop without a subscription to at least one topic. Perhaps this has already been done?

vyzo commented 4 months ago

I thought we had fixed that.

Stebalien commented 4 months ago

https://github.com/libp2p/go-libp2p-pubsub/pull/564

I haven't tested this yet but it should work (assuming you're using the latest libp2p).

master255 commented 4 months ago

Waiting for the release so I can test it out

vyzo commented 4 months ago

Can you test it out before the release?

master255 commented 4 months ago

How? I can change the version of the library. In order to test before release, I need to create my own fork.

vyzo commented 4 months ago

no, you can just use @master as the version tag.

master255 commented 4 months ago

ok

master255 commented 4 months ago

Bug: Node 1: NewGossipSub Join Subscribe

Node 2: NewGossipSub Join Subscribe

Users are available, messages are being sent. All is good. Then: Both nodes: sub.Cancel() topic.Close()

Then: Node 1: NewGossipSub Join Subscribe

Node 2: NewGossipSub Join Subscribe

Node 1 - users are not present and do not appear after a time. At the same time messages are forwarded both ways. Node 2 - There is a user. Messages are forwarded. All is good.

Perhaps we need a method that will close PubSub? Otherwise right now - once started, there is no way to stop it?

Another way is to stop the PubSub loop and its heartbeat if there are no topics. I think this is a more acceptable way. As long as it doesn't create a load while there are no topics. Perhaps that's how it works already?