status-im / status-desktop

Status Desktop client made in Nim & QML
https://status.app
Mozilla Public License 2.0
276 stars 76 forks source link

Ping-pong loop if HandleCommunityRequestToJoinResponse returns an error #15466

Open mprakhov opened 2 weeks ago

mprakhov commented 2 weeks ago

Bug Report

Description

Precondition: Community with minted owner token. Member A is a TokenMaster

  1. Member A sends the request to join to the control node (message 1)
  2. The control node receives a request to join from member A (message 1) handles it and sends a response to member A (message 2)
  3. Member A while handling the request to join response (message 2) returned an error (it could happen, if CommunityDescription update will be received before request to join response)

If this sequence is met, we will have a ping pong loop with steps 2, 3

Repro steps are unclear so far. Was not able to reproduce it again, but have 2 accounts with stable reproduce The interesting part is that for step 3 the control node doesn't need to be online. I have a loop of receiving community request to join response Step 2 requires member A node to be online

Steps to reproduce

Unknown

See a description

Additional Information

mprakhov commented 2 weeks ago

@qfrank could it be somehow related to the https://github.com/status-im/status-go/pull/4969 PR?

mprakhov commented 2 weeks ago

Callstack for step 3

github.com/status-im/status-go/protocol.(*Messenger).handleCommunityRequestToJoinResponseProtobuf
    /Users/mprakhov/Work/status/status-desktop/vendor/status-go/protocol/messenger_handlers.go:1143
github.com/status-im/status-go/protocol.(*Messenger).dispatchToHandler
    /Users/mprakhov/Work/status/status-desktop/vendor/status-go/protocol/messenger_handlers.go:155
github.com/status-im/status-go/protocol.(*Messenger).handleRetrievedMessages
    /Users/mprakhov/Work/status/status-desktop/vendor/status-go/protocol/messenger.go:3969
github.com/status-im/status-go/protocol.(*Messenger).RetrieveAll
    /Users/mprakhov/Work/status/status-desktop/vendor/status-go/protocol/messenger.go:3411
github.com/status-im/status-go/protocol.(*Messenger).ProcessAllMessages
    /Users/mprakhov/Work/status/status-desktop/vendor/status-go/protocol/messenger.go:3432
github.com/status-im/status-go/protocol.(*Messenger).StartRetrieveMessagesLoop.func1
    /Users/mprakhov/Work/status/status-desktop/vendor/status-go/protocol/messenger.go:3423
mprakhov commented 2 weeks ago

callstack for step 2

github.com/status-im/status-go/protocol.(*Messenger).handleCommunityRequestToJoinProtobuf
    /Users/mprakhov/Work/status/status-desktop/vendor/status-go/protocol/messenger_handlers.go:710
github.com/status-im/status-go/protocol.(*Messenger).dispatchToHandler
    /Users/mprakhov/Work/status/status-desktop/vendor/status-go/protocol/messenger_handlers.go:92
github.com/status-im/status-go/protocol.(*Messenger).handleRetrievedMessages
    /Users/mprakhov/Work/status/status-desktop/vendor/status-go/protocol/messenger.go:3969
github.com/status-im/status-go/protocol.(*Messenger).RetrieveAll
    /Users/mprakhov/Work/status/status-desktop/vendor/status-go/protocol/messenger.go:3411
github.com/status-im/status-go/protocol.(*Messenger).ProcessAllMessages
    /Users/mprakhov/Work/status/status-desktop/vendor/status-go/protocol/messenger.go:3432
github.com/status-im/status-go/protocol.(*Messenger).StartRetrieveMessagesLoop.func1
    /Users/mprakhov/Work/status/status-desktop/vendor/status-go/protocol/messenger.go:3423
runtime.goexit
mprakhov commented 2 weeks ago

I found out that TestMessageResend test from https://github.com/status-im/status-go/pull/4969 is failing after fixing the bug in test utils https://github.com/status-im/status-go/pull/5383/commits/324b9b349f1ee4d9d49e4515185d2a18cae36c0

mprakhov commented 2 weeks ago

logs

geth_token_master.log

geth_control_node.log

mprakhov commented 2 weeks ago

@richard-ramos could you please help with analyzing this bug... or at least give some hints where and what to look at

richard-ramos commented 2 weeks ago

By ping pong loop, you mean that the request to join in step 2 is sent again??

mprakhov commented 2 weeks ago

By ping pong loop, you mean that the request to join in step 2 is sent again??

Yes, but only when member A is online. Step 3 is sending on a node startup (after 15-30 seconds), even if control node is offline

igor-sirotin commented 2 weeks ago

Looks like I'm facing something similar.

Not sure what exactly caused the issues but here's what I was doing.

  1. Alice: create a non-token gated community (with request to join)
  2. Bob: join community
  3. Charlie: join community

After this I was playing around with Bob/Charlie to join and leave the community a few times.

At some point I noted this loop. This bunch of messages appear each second in the control node logs:

DBG 2024-07-08 17:49:46.759+01:00 [threadpool task thread] initiating task   topics="task-threadpool" tid=15324060 file=threadpool.nim:56 messageType=AsyncGetRevealedAccountsArg:ObjectType threadid=15324060 task="{\"$type\":\"AsyncGetRevealedAccountsArg:ObjectType\",\"communityId\":\"0x0210af0e662d4229b4700cd68412da0bc6fe6636a0e7ea63230c32cc111019bd1e\",\"memberPubkey\":\"\",\"vptr\":4504893696,\"slot\":\"onAsyncGetRevealedAccountsForAllMembersCompleted\"}"
DBG 2024-07-08 17:49:46.759+01:00 [threadpool task thread] initiating task   topics="task-threadpool" tid=15324056 file=threadpool.nim:56 messageType=GetCommunityTokensDetailsArg:ObjectType threadid=15324056 task="{\"$type\":\"GetCommunityTokensDetailsArg:ObjectType\",\"communityId\":\"0x0210af0e662d4229b4700cd68412da0bc6fe6636a0e7ea63230c32cc111019bd1e\",\"vptr\":4504897920,\"slot\":\"onCommunityTokensDetailsLoaded\"}"
DBG 2024-07-08 17:49:46.759+01:00 NewBE_callPrivateRPC                       topics="rpc" tid=15324060 file=core.nim:27 rpc_method=wakuext_getRevealedAccountsForAllMembers
DBG 2024-07-08 17:49:46.759+01:00 NewBE_callPrivateRPC                       topics="rpc" tid=15324056 file=core.nim:27 rpc_method=wakuext_getCommunityTokens
DBG 2024-07-08 17:49:46.761+01:00 setCommunityInfo                           tid=15324005 file=status_community_link_preview.nim:155 communityId=0x0210af0e662d4229b4700cd68412da0bc6fe6636a0e7ea63230c32cc111019bd1e communityName=bubego-open
DBG 2024-07-08 17:49:46.763+01:00 [threadpool task thread] initiating task   topics="task-threadpool" tid=15324058 file=threadpool.nim:56 messageType=GetCommunityTokensDetailsArg:ObjectType threadid=15324058 task="{\"$type\":\"GetCommunityTokensDetailsArg:ObjectType\",\"communityId\":\"0x0210af0e662d4229b4700cd68412da0bc6fe6636a0e7ea63230c32cc111019bd1e\",\"vptr\":4504897920,\"slot\":\"onCommunityTokensDetailsLoaded\"}"
DBG 2024-07-08 17:49:46.763+01:00 [threadpool task thread] initiating task   topics="task-threadpool" tid=15324057 file=threadpool.nim:56 messageType=AsyncGetRevealedAccountsArg:ObjectType threadid=15324057 task="{\"$type\":\"AsyncGetRevealedAccountsArg:ObjectType\",\"communityId\":\"0x0210af0e662d4229b4700cd68412da0bc6fe6636a0e7ea63230c32cc111019bd1e\",\"memberPubkey\":\"\",\"vptr\":4504893696,\"slot\":\"onAsyncGetRevealedAccountsForAllMembersCompleted\"}"
DBG 2024-07-08 17:49:46.763+01:00 NewBE_callPrivateRPC                       topics="rpc" tid=15324058 file=core.nim:27 rpc_method=wakuext_getCommunityTokens
DBG 2024-07-08 17:49:46.763+01:00 NewBE_callPrivateRPC                       topics="rpc" tid=15324057 file=core.nim:27 rpc_method=wakuext_getRevealedAccountsForAllMembers
DBG 2024-07-08 17:49:46.764+01:00 setCommunityInfo                           tid=15324005 file=status_community_link_preview.nim:155 communityId=0x0210af0e662d4229b4700cd68412da0bc6fe6636a0e7ea63230c32cc111019bd1e communityName=bubego-open
ERR 2024-07-08 17:49:46.764+01:00 Unknown request                            topics="community-service" tid=15324005 file=service.nim:793 msg="Community request not found: 0xe8bd42112f8a625cee72b25cea766f08c3e9ebb788305d249ddfff1e780d7d20"
DBG 2024-07-08 17:49:46.765+01:00 [threadpool task thread] initiating task   topics="task-threadpool" tid=15324063 file=threadpool.nim:56 messageType=AsyncGetRevealedAccountsArg:ObjectType threadid=15324063 task="{\"$type\":\"AsyncGetRevealedAccountsArg:ObjectType\",\"communityId\":\"0x0210af0e662d4229b4700cd68412da0bc6fe6636a0e7ea63230c32cc111019bd1e\",\"memberPubkey\":\"\",\"vptr\":4504893696,\"slot\":\"onAsyncGetRevealedAccountsForAllMembersCompleted\"}"
DBG 2024-07-08 17:49:46.764+01:00 [threadpool task thread] initiating task   topics="task-threadpool" tid=15324055 file=threadpool.nim:56 messageType=GetCommunityTokensDetailsArg:ObjectType threadid=15324055 task="{\"$type\":\"GetCommunityTokensDetailsArg:ObjectType\",\"communityId\":\"0x0210af0e662d4229b4700cd68412da0bc6fe6636a0e7ea63230c32cc111019bd1e\",\"vptr\":4504897920,\"slot\":\"onCommunityTokensDetailsLoaded\"}"
DBG 2024-07-08 17:49:46.765+01:00 NewBE_callPrivateRPC                       topics="rpc" tid=15324055 file=core.nim:27 rpc_method=wakuext_getCommunityTokens
DBG 2024-07-08 17:49:46.765+01:00 NewBE_callPrivateRPC                       topics="rpc" tid=15324063 file=core.nim:27 rpc_method=wakuext_getRevealedAccountsForAllMembers
WRN 2024-07-08 17:49:46.765+01:00 trying to get chat data for an unexisting chat id topics="chat-service" tid=15324005 file=service.nim:357 chatId=
DBG 2024-07-08 17:49:46.765+01:00 NewBE_callPrivateRPC                       topics="rpc" tid=15324005 file=core.nim:27 rpc_method=wakuext_hasUnseenActivityCenterNotifications
DBG 2024-07-08 17:49:46.766+01:00 NewBE_callPrivateRPC                       topics="rpc" tid=15324005 file=core.nim:27 rpc_method=wakuext_activityCenterNotificationsCount
DBG 2024-07-08 17:49:46.766+01:00 NewBE_callPrivateRPC                       topics="rpc" tid=15324005 file=core.nim:27 rpc_method=wakuext_activityCenterNotificationsCount
DBG 2024-07-08 17:49:46.766+01:00 NewBE_callPrivateRPC                       topics="rpc" tid=15324005 file=core.nim:27 rpc_method=wakuext_hasUnseenActivityCenterNotifications
DBG 2024-07-08 17:49:46.767+01:00 NewBE_callPrivateRPC                       topics="rpc" tid=15324005 file=core.nim:27 rpc_method=wakuext_activityCenterNotificationsCount

It repeats each second:

DBG 2024-07-08 17:49:47.754+01:00 [threadpool task thread] initiating task   topics="task-threadpool" tid=15324062 file=threadpool.nim:56 messageType=GetCommunityTokensDetailsArg:ObjectType threadid=15324062 task="{\"$type\":\"GetCommunityTokensDetailsArg:ObjectType\",\"communityId\":\"0x0210af0e662d4229b4700cd68412da0bc6fe6636a0e7ea63230c32cc111019bd1e\",\"vptr\":4504897920,\"slot\":\"onCommunityTokensDetailsLoaded\"}"
DBG 2024-07-08 17:49:47.754+01:00 [threadpool task thread] initiating task   topics="task-threadpool" tid=15324059 file=threadpool.nim:56 messageType=AsyncGetRevealedAccountsArg:ObjectType threadid=15324059 task="{\"$type\":\"AsyncGetRevealedAccountsArg:ObjectType\",\"communityId\":\"0x0210af0e662d4229b4700cd68412da0bc6fe6636a0e7ea63230c32cc111019bd1e\",\"memberPubkey\":\"\",\"vptr\":4504893696,\"slot\":\"onAsyncGetRevealedAccountsForAllMembersCompleted\"}"
DBG 2024-07-08 17:49:47.754+01:00 NewBE_callPrivateRPC                       topics="rpc" tid=15324062 file=core.nim:27 rpc_method=wakuext_getCommunityTokens
DBG 2024-07-08 17:49:47.754+01:00 NewBE_callPrivateRPC                       topics="rpc" tid=15324059 file=core.nim:27 rpc_method=wakuext_getRevealedAccountsForAllMembers
DBG 2024-07-08 17:49:47.756+01:00 setCommunityInfo                           tid=15324005 file=status_community_link_preview.nim:155 communityId=0x0210af0e662d4229b4700cd68412da0bc6fe6636a0e7ea63230c32cc111019bd1e communityName=bubego-open
DBG 2024-07-08 17:49:47.760+01:00 [threadpool task thread] initiating task   topics="task-threadpool" tid=15324061 file=threadpool.nim:56 messageType=GetCommunityTokensDetailsArg:ObjectType threadid=15324061 task="{\"$type\":\"GetCommunityTokensDetailsArg:ObjectType\",\"communityId\":\"0x0210af0e662d4229b4700cd68412da0bc6fe6636a0e7ea63230c32cc111019bd1e\",\"vptr\":4504897920,\"slot\":\"onCommunityTokensDetailsLoaded\"}"
DBG 2024-07-08 17:49:47.760+01:00 [threadpool task thread] initiating task   topics="task-threadpool" tid=15324056 file=threadpool.nim:56 messageType=AsyncGetRevealedAccountsArg:ObjectType threadid=15324056 task="{\"$type\":\"AsyncGetRevealedAccountsArg:ObjectType\",\"communityId\":\"0x0210af0e662d4229b4700cd68412da0bc6fe6636a0e7ea63230c32cc111019bd1e\",\"memberPubkey\":\"\",\"vptr\":4504893696,\"slot\":\"onAsyncGetRevealedAccountsForAllMembersCompleted\"}"
DBG 2024-07-08 17:49:47.760+01:00 NewBE_callPrivateRPC                       topics="rpc" tid=15324061 file=core.nim:27 rpc_method=wakuext_getCommunityTokens
DBG 2024-07-08 17:49:47.760+01:00 NewBE_callPrivateRPC                       topics="rpc" tid=15324056 file=core.nim:27 rpc_method=wakuext_getRevealedAccountsForAllMembers
DBG 2024-07-08 17:49:47.762+01:00 setCommunityInfo                           tid=15324005 file=status_community_link_preview.nim:155 communityId=0x0210af0e662d4229b4700cd68412da0bc6fe6636a0e7ea63230c32cc111019bd1e communityName=bubego-open
ERR 2024-07-08 17:49:47.762+01:00 Unknown request                            topics="community-service" tid=15324005 file=service.nim:793 msg="Community request not found: 0xe8bd42112f8a625cee72b25cea766f08c3e9ebb788305d249ddfff1e780d7d20"
DBG 2024-07-08 17:49:47.762+01:00 [threadpool task thread] initiating task   topics="task-threadpool" tid=15324060 file=threadpool.nim:56 messageType=GetCommunityTokensDetailsArg:ObjectType threadid=15324060 task="{\"$type\":\"GetCommunityTokensDetailsArg:ObjectType\",\"communityId\":\"0x0210af0e662d4229b4700cd68412da0bc6fe6636a0e7ea63230c32cc111019bd1e\",\"vptr\":4504897920,\"slot\":\"onCommunityTokensDetailsLoaded\"}"
DBG 2024-07-08 17:49:47.762+01:00 [threadpool task thread] initiating task   topics="task-threadpool" tid=15324058 file=threadpool.nim:56 messageType=AsyncGetRevealedAccountsArg:ObjectType threadid=15324058 task="{\"$type\":\"AsyncGetRevealedAccountsArg:ObjectType\",\"communityId\":\"0x0210af0e662d4229b4700cd68412da0bc6fe6636a0e7ea63230c32cc111019bd1e\",\"memberPubkey\":\"\",\"vptr\":4504893696,\"slot\":\"onAsyncGetRevealedAccountsForAllMembersCompleted\"}"
DBG 2024-07-08 17:49:47.762+01:00 NewBE_callPrivateRPC                       topics="rpc" tid=15324060 file=core.nim:27 rpc_method=wakuext_getCommunityTokens
DBG 2024-07-08 17:49:47.762+01:00 NewBE_callPrivateRPC                       topics="rpc" tid=15324058 file=core.nim:27 rpc_method=wakuext_getRevealedAccountsForAllMembers
WRN 2024-07-08 17:49:47.762+01:00 trying to get chat data for an unexisting chat id topics="chat-service" tid=15324005 file=service.nim:357 chatId=
DBG 2024-07-08 17:49:47.763+01:00 NewBE_callPrivateRPC                       topics="rpc" tid=15324005 file=core.nim:27 rpc_method=wakuext_hasUnseenActivityCenterNotifications
DBG 2024-07-08 17:49:47.763+01:00 NewBE_callPrivateRPC                       topics="rpc" tid=15324005 file=core.nim:27 rpc_method=wakuext_activityCenterNotificationsCount
DBG 2024-07-08 17:49:47.763+01:00 NewBE_callPrivateRPC                       topics="rpc" tid=15324005 file=core.nim:27 rpc_method=wakuext_activityCenterNotificationsCount
DBG 2024-07-08 17:49:47.764+01:00 NewBE_callPrivateRPC                       topics="rpc" tid=15324005 file=core.nim:27 rpc_method=wakuext_hasUnseenActivityCenterNotifications
DBG 2024-07-08 17:49:47.764+01:00 NewBE_callPrivateRPC                       topics="rpc" tid=15324005 file=core.nim:27 rpc_method=wakuext_activityCenterNotificationsCount

From geth.log I can see that the community description is updated and published each second 👀

richard-ramos commented 2 weeks ago

cc: @kaichaosun I think this is related to the message confirmation PR

jrainville commented 1 week ago

@mprakhov I assigned you to try to fix it. You can sync with @richard-ramos and @kaichaosun to see if they know more.

This issue might be related to the store nodes getting full nowadays. Apparently, something is generating 8GB of messages per hours!

This would take priority over your other issues.

mprakhov commented 1 week ago

Rebased on the latest master + develop branch, RequestToJoinResponse for TM stopped coming. The control node still receives RequestToJoin messages in the loop when TM is online.

TM loop could be potentially fixed by https://github.com/status-im/status-go/pull/5497

mprakhov commented 1 week ago

From TM DB: messageID: 0x3689e89b0c7b154acee665007457ed7ef0c7e9f3f4d00eee32bf5cb91c26244d: send_count: 2 sent: 0 resend_type: 2 resend_method 1

// ResendTypeRawMessage We have a function, watchExpiredMessages, that monitors the 'raw_messages' table
// and will attempts to resend messages if a previous message sending failed.
ResendTypeRawMessage ResendType = 2

// ResendMethodSendPrivate corresponding function MessageSender#SendPrivate
ResendMethodSendPrivate ResendMethod = 1