libp2p / go-libp2p-pubsub

The PubSub implementation for go-libp2p
https://github.com/libp2p/specs/tree/master/pubsub
Other
317 stars 182 forks source link

support waiting until a gossipsub topic has enough peers #454

Closed mvdan closed 2 years ago

mvdan commented 2 years ago

Right now, most tests that build a gossipsub mesh use sleeps to wait for the mesh to build:

https://github.com/libp2p/go-libp2p-pubsub/blob/7ef0669764b94ed11e8b5b0128ee408942ebe5eb/gossipsub_test.go#L62-L63

The indexer-reference-provider codebase has the same problem:

https://github.com/filecoin-project/indexer-reference-provider/blob/3ec1151ca682a437c6e38727aa00b1087e5a0276/core/engine/engine_test.go#L186-L188

It seems like it would be enough to wait for enough peers to be connected before publishing advertisements. Right now, the following options are available:

1) Sleep. This is what all the tests do, but one has to use a very long sleep to reduce the chances of flakes. Even with multi-second sleeps, the CI over at indexer-reference-provider still ran into failures at times.

2) Poll PubSubRouter.EnoughPeers until it returns true. Would work, but just like any other form of polling, it's a tradeoff between slowness and overhead.

3) Poll Topic.ListPeers until its length reaches a minimum. Same polling problems as 1.

4) Use WithReadiness when publishing. Unfortunately, this option only works when WithDiscovery is used, and the indexer doesn't use that. The option seems to be a no-op in those cases.

I think we should support some way to do this without sleeping nor polling, nor without requiring WithDiscovery. Here are two proposed API additions:

5) Modify WithReadiness so that it works in all configurations. Without WithDiscovery, it would simply make Topic.Publish block until enough peers are connected (or until Publish's ctx is cancelled).

6) Add another method, such as func (*Topic) WaitEnoughPeers(context.Context, int) bool, which would block until enough peers are connected or ctx is cancelled.

5 seems preferrable to 6, for the sake of making existing APIs more consistently useful and not adding redundant APIs. However, I'm not familiar enough with WithReadiness to tell if it's the best approach internally.

vyzo commented 2 years ago

Thanks for the analysis.

I think that the best option is to generalize with readiness (5). The implementation would be pretty much the same in (5) and (6) internally.

On Thu, Sep 23, 2021, 15:07 Daniel Martí @.***> wrote:

Right now, most tests that build a gossipsub mesh use sleeps to wait for the mesh to build:

https://github.com/libp2p/go-libp2p-pubsub/blob/7ef0669764b94ed11e8b5b0128ee408942ebe5eb/gossipsub_test.go#L62-L63

The indexer-reference-provider codebase has the same problem:

https://github.com/filecoin-project/indexer-reference-provider/blob/3ec1151ca682a437c6e38727aa00b1087e5a0276/core/engine/engine_test.go#L186-L188

It seems like it would be enough to wait for enough peers to be connected before publishing advertisements. Right now, the following options are available:

1.

Sleep. This is what all the tests do, but one has to use a very long sleep to reduce the chances of flakes. Even with multi-second sleeps, the CI over at indexer-reference-provider still ran into failures at times. 2.

Poll PubSubRouter.EnoughPeers until it returns true. Would work, but just like any other form of polling, it's a tradeoff between slowness and overhead. 3.

Poll Topic.ListPeers until its length reaches a minimum. Same polling problems as 1. 4.

Use WithReadiness when publishing. Unfortunately, this option only works when WithDiscovery is used, and the indexer doesn't use that. The option seems to be a no-op in those cases.

I think we should support some way to do this without sleeping nor polling, nor without requiring WithDiscovery. Here are two proposed API additions:

1.

Modify WithReadiness so that it works in all configurations. Without WithDiscovery, it would simply make Topic.Publish block until enough peers are connected (or until Publish's ctx is cancelled). 2.

Add another method, such as func (*Topic) WaitEnoughPeers(context.Context, int) bool, which would block until enough peers are connected or ctx is cancelled.

5 seems preferrable to 6, for the sake of making existing APIs more consistently useful and not adding redundant APIs. However, I'm not familiar enough with WithReadiness to tell if it's the best approach internally.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/libp2p/go-libp2p-pubsub/issues/454, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAI4SRJ42DB4D2IZLOW2UTUDMKBPANCNFSM5ETQZHTQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

mvdan commented 2 years ago

Sounds good, thanks. I'll update the existing PR to do 5 and let you know.

mvdan commented 2 years ago

I'm not sure how I can implement 5 in a way that doesn't poll.

With func WithReadiness(ready RouterReady) PubOpt, I need to supply a func(rt PubSubRouter, topic string) (bool, error). There's MinTopicSize, but that uses EnoughPeers, so that would get us back to polling at intervals, I think.

I see Topic has its own events like PeerJoin and PeerLeave, but if all I have is PubSubRouter and the topic string, it's not clear to me how to then make use of the Topic events.

I might just implement this with polling PubSubRouter.EnoughPeers for now, to get the API right, but it's certainly not the right internal implementation. Any hints appreciated, because I'm not sure how to do that.

vyzo commented 2 years ago

Well it's an internal detail, so an initial implementation that polls is acceptable.

On Fri, Sep 24, 2021, 17:13 Daniel Martí @.***> wrote:

I'm not sure how I can implement 5 in a way that doesn't poll.

With func WithReadiness(ready RouterReady) PubOpt, I need to supply a func(rt PubSubRouter, topic string) (bool, error). There's MinTopicSize, but that uses EnoughPeers, so that would get us back to polling at intervals, I think.

I see Topic has its own events like PeerJoin and PeerLeave, but if all I have is PubSubRouter and the topic string, it's not clear to me how to then make use of the Topic events.

I might just implement this with polling PubSubRouter.EnoughPeers for now, to get the API right, but it's certainly not the right internal implementation. Any hints appreciated, because I'm not sure how to do that.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/libp2p/go-libp2p-pubsub/issues/454#issuecomment-926659074, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAI4SS7FNFLB7I5D4NRRWLUDSBQRANCNFSM5ETQZHTQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

vyzo commented 2 years ago

note that you can get the topic handle from the topic string by querying the pubsub object; check the implementation of Publish.

On Fri, Sep 24, 2021, 17:17 Dimitris Vyzovitis @.***> wrote:

Well it's an internal detail, so an initial implementation that polls is acceptable.

On Fri, Sep 24, 2021, 17:13 Daniel Martí @.***> wrote:

I'm not sure how I can implement 5 in a way that doesn't poll.

With func WithReadiness(ready RouterReady) PubOpt, I need to supply a func(rt PubSubRouter, topic string) (bool, error). There's MinTopicSize, but that uses EnoughPeers, so that would get us back to polling at intervals, I think.

I see Topic has its own events like PeerJoin and PeerLeave, but if all I have is PubSubRouter and the topic string, it's not clear to me how to then make use of the Topic events.

I might just implement this with polling PubSubRouter.EnoughPeers for now, to get the API right, but it's certainly not the right internal implementation. Any hints appreciated, because I'm not sure how to do that.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/libp2p/go-libp2p-pubsub/issues/454#issuecomment-926659074, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAI4SS7FNFLB7I5D4NRRWLUDSBQRANCNFSM5ETQZHTQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

mvdan commented 2 years ago

Hmm, I can't really see how I can obtain a *Topic from a PubSubRouter. *PubSub does have all the topics, but RouterReady doesn't have a *PubSub.

I've pushed a polling version of MinTopicSize at https://github.com/libp2p/go-libp2p-pubsub/pull/452, and switched a few tests over to it. Notice the TODO added to one of the tests, because I think I'm getting confused there.

vyzo commented 2 years ago

doesn't the option itself have visibility? You should be able to thread it in.

On Fri, Sep 24, 2021, 17:53 Daniel Martí @.***> wrote:

Hmm, I can't really see how I can obtain a Topic from a PubSubRouter. PubSub does have all the topics, but RouterReady doesn't have a *PubSub.

I've pushed a polling version of MinTopicSize at #452 https://github.com/libp2p/go-libp2p-pubsub/pull/452, and switched a few tests over to it. Notice the TODO added to one of the tests, because I think I'm getting confused there.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/libp2p/go-libp2p-pubsub/issues/454#issuecomment-926691356, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAI4SXPFJYLBU6ELIT7PA3UDSGIFANCNFSM5ETQZHTQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

vyzo commented 2 years ago

also note that there are some thread safety issues. I am not sure whether EnoughPeers is safe to call outside the event loop, so you might need the pubsub object to use the eval channel to jump into the event loop to call it. I'll have to double check that.

On Fri, Sep 24, 2021, 17:57 Dimitris Vyzovitis @.***> wrote:

doesn't the option itself have visibility? You should be able to thread it in.

On Fri, Sep 24, 2021, 17:53 Daniel Martí @.***> wrote:

Hmm, I can't really see how I can obtain a Topic from a PubSubRouter. PubSub does have all the topics, but RouterReady doesn't have a *PubSub.

I've pushed a polling version of MinTopicSize at #452 https://github.com/libp2p/go-libp2p-pubsub/pull/452, and switched a few tests over to it. Notice the TODO added to one of the tests, because I think I'm getting confused there.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/libp2p/go-libp2p-pubsub/issues/454#issuecomment-926691356, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAI4SXPFJYLBU6ELIT7PA3UDSGIFANCNFSM5ETQZHTQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

vyzo commented 2 years ago

Yeah, it's definitely not thread-safe.

So the way to go here would be to extend the signature of the option to receive the pubsub object as well, so that you can use it.

Then, if we end up polling (again, perfectly acceptable for initial implementation) use this contraption to call EnoughPeers in a safe context:

res := make(chan bool, 1)
ps.eval <- func() { res <- ps.rt.EnoughPeers() }