waku-org / go-waku

Go implementation of Waku v2 protocol
https://waku.org/
Other
117 stars 42 forks source link

bug: subscription on non supported shard has too many logs #1240

Open kaichaosun opened 3 hours ago

kaichaosun commented 3 hours ago

Describe the bug

I have created a community on shard 100, I see following logs flux in the log file,

DEBUG[10-15|08:00:32.757|github.com/waku-org/go-waku/waku/v2/peermanager/peer_selection.go:147]                      discovering peers by pubsubTopic         pubsubTopics=[/waku/2/rs/16/100]
DEBUG[10-15|08:00:32.757|github.com/waku-org/go-waku/waku/v2/peermanager/peer_selection.go:158]                      could not retrieve random peer from slot 
ERROR[10-15|08:00:32.757|github.com/waku-org/go-waku/waku/v2/protocol/filter/client.go:352]                          peer selection returned err              error="no suitable peers found"
DEBUG[10-15|08:00:32.757|github.com/waku-org/go-waku/waku/v2/api/filter/filter.go:177]                               failed to resubscribe for filter         apisub-id=a9ce44a8-a0b8-4dc7-ad2e-049a94c95b3c content-filter="{ pubsubTopic: /waku/2/rs/16/100, contentTopics: [ /waku/1/0x080c8cce/rfc26/waku/1/0x4aca1e0f/rfc26/waku/1/0x4ab79b45/rfc26/waku/1/0xf10d7f81/rfc26/waku/1/0x8e0036ca/rfc26/waku/1/0xab10d2a9/rfc26 ] }" error="no suitable peers found"
DEBUG[10-15|08:00:32.757|github.com/waku-org/go-waku/waku/v2/api/filter/filter.go:143]                               subscription status                      apisub-id=a9ce44a8-a0b8-4dc7-ad2e-049a94c95b3c content-filter="{ pubsubTopic: /waku/2/rs/16/100, contentTopics: [ /waku/1/0xab10d2a9/rfc26/waku/1/0x080c8cce/rfc26/waku/1/0x4aca1e0f/rfc26/waku/1/0x4ab79b45/rfc26/waku/1/0xf10d7f81/rfc26/waku/1/0x8e0036ca/rfc26 ] }" sub-count=0 content-filter="{ pubsubTopic: /waku/2/rs/16/100, contentTopics: [ /waku/1/0x4aca1e0f/rfc26/waku/1/0x4ab79b45/rfc26/waku/1/0xf10d7f81/rfc26/waku/1/0x8e0036ca/rfc26/waku/1/0xab10d2a9/rfc26/waku/1/0x080c8cce/rfc26 ] }"
DEBUG[10-15|08:00:32.757|github.com/waku-org/go-waku/waku/v2/api/filter/filter.go:167]                               subscribing again                        apisub-id=a9ce44a8-a0b8-4dc7-ad2e-049a94c95b3c content-filter="{ pubsubTopic: /waku/2/rs/16/100, contentTopics: [ /waku/1/0xab10d2a9/rfc26/waku/1/0x080c8cce/rfc26/waku/1/0x4aca1e0f/rfc26/waku/1/0x4ab79b45/rfc26/waku/1/0xf10d7f81/rfc26/waku/1/0x8e0036ca/rfc26 ] }" num-peers=2
DEBUG[10-15|08:00:32.757|github.com/waku-org/go-waku/waku/v2/protocol/filter/client.go:339]                          handleFilterSubscribeOptions             peerCount=2 excludePeersLen=0
DEBUG[10-15|08:00:32.757|github.com/waku-org/go-waku/waku/v2/peermanager/peer_selection.go:193]                      Select Peers                             selectionCriteria="{\"selectionType\":0,\"protocolId\":\"/vac/waku/filter-subscribe/2.0.0-beta1\",\"pubsubTopics\":[\"/waku/2/rs/16/100\"],\"specificPeers\":null,\"maxPeerCount\":2,\"excludePeers\":null}" excludedPeers=
DEBUG[10-15|08:00:32.757|github.com/waku-org/go-waku/waku/v2/peermanager/peer_selection.go:147]                      discovering peers by pubsubTopic         pubsubTopics=[/waku/2/rs/16/100]
DEBUG[10-15|08:00:32.757|github.com/waku-org/go-waku/waku/v2/peermanager/peer_selection.go:158]                      could not retrieve random peer from slot 
ERROR[10-15|08:00:32.757|github.com/waku-org/go-waku/waku/v2/protocol/filter/client.go:352]                          peer selection returned err              error="no suitable peers found"
DEBUG[10-15|08:00:32.757|github.com/waku-org/go-waku/waku/v2/api/filter/filter.go:177]                               failed to resubscribe for filter         apisub-id=a9ce44a8-a0b8-4dc7-ad2e-049a94c95b3c content-filter="{ pubsubTopic: /waku/2/rs/16/100, contentTopics: [ /waku/1/0xab10d2a9/rfc26/waku/1/0x080c8cce/rfc26/waku/1/0x4aca1e0f/rfc26/waku/1/0x4ab79b45/rfc26/waku/1/0xf10d7f81/rfc26/waku/1/0x8e0036ca/rfc26 ] }" error="no suitable peers found"
DEBUG[10-15|08:00:32.757|github.com/waku-org/go-waku/waku/v2/api/filter/filter.go:143]                               subscription status                      apisub-id=a9ce44a8-a0b8-4dc7-ad2e-049a94c95b3c content-filter="{ pubsubTopic: /waku/2/rs/16/100, contentTopics: [ /waku/1/0xab10d2a9/rfc26/waku/1/0x080c8cce/rfc26/waku/1/0x4aca1e0f/rfc26/waku/1/0x4ab79b45/rfc26/waku/1/0xf10d7f81/rfc26/waku/1/0x8e0036ca/rfc26 ] }" sub-count=0 content-filter="{ pubsubTopic: /waku/2/rs/16/100, contentTopics: [ /waku/1/0xab10d2a9/rfc26/waku/1/0x080c8cce/rfc26/waku/1/0x4aca1e0f/rfc26/waku/1/0x4ab79b45/rfc26/waku/1/0xf10d7f81/rfc26/waku/1/0x8e0036ca/rfc26 ] }"
DEBUG[10-15|08:00:32.757|github.com/waku-org/go-waku/waku/v2/api/filter/filter.go:167]                               subscribing again                        apisub-id=a9ce44a8-a0b8-4dc7-ad2e-049a94c95b3c content-filter="{ pubsubTopic: /waku/2/rs/16/100, contentTopics: [ /waku/1/0x4ab79b45/rfc26/waku/1/0xf10d7f81/rfc26/waku/1/0x8e0036ca/rfc26/waku/1/0xab10d2a9/rfc26/waku/1/0x080c8cce/rfc26/waku/1/0x4aca1e0f/rfc26 ] }" num-peers=2
DEBUG[10-15|08:00:32.757|github.com/waku-org/go-waku/waku/v2/protocol/filter/client.go:339]                          handleFilterSubscribeOptions             peerCount=2 excludePeersLen=0
DEBUG[10-15|08:00:32.757|github.com/waku-org/go-waku/waku/v2/peermanager/peer_selection.go:193]                      Select Peers                             selectionCriteria="{\"selectionType\":0,\"protocolId\":\"/vac/waku/filter-subscribe/2.0.0-beta1\",\"pubsubTopics\":[\"/waku/2/rs/16/100\"],\"specificPeers\":null,\"maxPeerCount\":2,\"excludePeers\":null}" excludedPeers=
DEBUG[10-15|08:00:32.758|github.com/waku-org/go-waku/waku/v2/peermanager/peer_selection.go:147]                      discovering peers by pubsubTopic         pubsubTopics=[/waku/2/rs/16/100]
DEBUG[10-15|08:00:32.758|github.com/waku-org/go-waku/waku/v2/peermanager/peer_selection.go:158]                      could not retrieve random peer from slot 
ERROR[10-15|08:00:32.758|github.com/waku-org/go-waku/waku/v2/protocol/filter/client.go:352]                          peer selection returned err              error="no suitable peers found"
DEBUG[10-15|08:00:32.758|github.com/waku-org/go-waku/waku/v2/api/filter/filter.go:177]                               failed to resubscribe for filter         apisub-id=a9ce44a8-a0b8-4dc7-ad2e-049a94c95b3c content-filter="{ pubsubTopic: /waku/2/rs/16/100, contentTopics: [ /waku/1/0xab10d2a9/rfc26/waku/1/0x080c8cce/rfc26/waku/1/0x4aca1e0f/rfc26/waku/1/0x4ab79b45/rfc26/waku/1/0xf10d7f81/rfc26/waku/1/0x8e0036ca/rfc26 ] }" error="no suitable peers found"
DEBUG[10-15|08:00:32.758|github.com/waku-org/go-waku/waku/v2/api/filter/filter.go:143]                               subscription status                      apisub-id=a9ce44a8-a0b8-4dc7-ad2e-049a94c95b3c content-filter="{ pubsubTopic: /waku/2/rs/16/100, contentTopics: [ /waku/1/0xab10d2a9/rfc26/waku/1/0x080c8cce/rfc26/waku/1/0x4aca1e0f/rfc26/waku/1/0x4ab79b45/rfc26/waku/1/0xf10d7f81/rfc26/waku/1/0x8e0036ca/rfc26 ] }" sub-count=0 content-filter="{ pubsubTopic: /waku/2/rs/16/100, contentTopics: [ /waku/1/0xab10d2a9/rfc26/waku/1/0x080c8cce/rfc26/waku/1/0x4aca1e0f/rfc26/waku/1/0x4ab79b45/rfc26/waku/1/0xf10d7f81/rfc26/waku/1/0x8e0036ca/rfc26 ] }"
DEBUG[10-15|08:00:32.758|github.com/waku-org/go-waku/waku/v2/api/filter/filter.go:167]                               subscribing again                        apisub-id=a9ce44a8-a0b8-4dc7-ad2e-049a94c95b3c content-filter="{ pubsubTopic: /waku/2/rs/16/100, contentTopics: [ /waku/1/0xf10d7f81/rfc26/waku/1/0x8e0036ca/rfc26/waku/1/0xab10d2a9/rfc26/waku/1/0x080c8cce/rfc26/waku/1/0x4aca1e0f/rfc26/waku/1/0x4ab79b45/rfc26 ] }" num-peers=2
DEBUG[10-15|08:00:32.758|github.com/waku-org/go-waku/waku/v2/protocol/filter/client.go:339]                          handleFilterSubscribeOptions             peerCount=2 excludePeersLen=0
DEBUG[10-15|08:00:32.758|github.com/waku-org/go-waku/waku/v2/peermanager/peer_selection.go:193]                      Select Peers                             selectionCriteria="{\"selectionType\":0,\"protocolId\":\"/vac/waku/filter-subscribe/2.0.0-beta1\",\"pubsubTopics\":[\"/waku/2/rs/16/100\"],\"specificPeers\":null,\"maxPeerCount\":2,\"excludePeers\":null}" excludedPeers=

There seems some unnecessary loops for this scenario. Using light mode.

cc @chaitanyaprem

chaitanyaprem commented 3 hours ago

There seems some unnecessary loops for this scenario.

yep, which is why i had added a simple backoff in case of such errors. it should try 3 times and then wait for 5 secs before trying again. looks like you have no peers with this shard and hence causing this. The idea of sharded communities as i understand is that all shards would be supported by fleet nodes to begin with which means shards would be configured in fleet nodes before being used in the app.

We can't stop retrying for a specific error in a p2p network since we don't know if the condition may change in some time. but any suggestions are welcome.

kaichaosun commented 2 hours ago

it should try 3 times and then wait for 5 secs before trying again.

By looking into the log, the times is more than 3, the log file is basically not usable in this case.

chaitanyaprem commented 2 hours ago

it should try 3 times and then wait for 5 secs before trying again.

By looking into the log, the times is more than 3, the log file is basically not usable in this case.

hope you are using latest version of status-go because the fix has been merged only last week. can you also give an idea of how may time do you see the retry? also note that this can happen if a good number of subscriptions are made for same shard as well, because there may be multiple filter batches that me doing this.

kaichaosun commented 1 hour ago

@chaitanyaprem I probably used an old version.

Here is the log, https://drive.google.com/file/d/1oG7sGVKfVIDt23TaBPvd2mzm6eZH0K6p/view?usp=sharing Nearly all the content is about resubscribe.

chaitanyaprem commented 1 hour ago

@chaitanyaprem I probably used an old version.

Here is the log, https://drive.google.com/file/d/1oG7sGVKfVIDt23TaBPvd2mzm6eZH0K6p/view?usp=sharing Nearly all the content is about resubscribe.

Yep, better use the latest..I remember this was already reported from mobile team.