status-im / status-mobile

a free (libre) open source, mobile OS for Ethereum
https://status.app
Mozilla Public License 2.0

store-01.do-ams3.shards.test is slow #20320

Open churik opened 3 months ago

churik commented 3 months ago

Bug Report

Problem

Message reliability issues when devices are connected to store-01.do-ams3.shards.test

https://github.com/status-im/status-mobile/assets/4557972/f88290f8-1ac7-4245-a256-37ada0133448

Simple messages take 100-180 sec to send, and community requests may get stuck for 40-60 mins even when the control node is online.

Additional Information

Logs: IOS_logs.zip, Android_logs.zip. More recent logs: logs (88).zip

Discussion is here

cammellos commented 3 months ago

Checking for the request to join 0x9a8d987074eb7b707e9f27d7c2f9a88cde406d55bda480addd8b2e9f1e9856c3:

https://kibana.infra.status.im/app/discover#/?_g=(filters:!(),refreshInterval:(pause:!t,value:0),time:(from:now-15h,to:now))&_a=(columns:!(),filters:!(),index:ffcc22b0-0116-11ed-9719-cdd3b483481c,interval:auto,query:(language:kuery,query:%220x9a8d987074eb7b707e9f27d7c2f9a88cde406d55bda480addd8b2e9f1e9856c3%22),sort:!(!('@timestamp',desc)))

It has been stored and propagated to the fleet correctly.

Also sent at the same time:

https://kibana.infra.status.im/app/discover#/?_g=(filters:!(),refreshInterval:(pause:!t,value:0),time:(from:now-15h,to:now))&_a=(columns:!(),filters:!(),index:ffcc22b0-0116-11ed-9719-cdd3b483481c,interval:auto,query:(language:kuery,query:%220x343abb9d3a33b10971247ad5a44a086506107d49d93abf3033e6983c5516794b%22),sort:!(!('@timestamp',desc)))

cammellos commented 3 months ago

The message was received by desktop:

INFO [06-04|12:35:36.961|github.com/status-im/status-go/wakuv2/waku.go:1121]                                  received waku2 store message             envelopeHash=0x9a8d987074eb7b707e9f27d7c2f9a88cde406d55bda480addd8b2e9f1e9856c3 pubsubTopic=/waku/2/rs/16/64 timestamp=1,717,492,720,431,626,000

From a store node.

I believe it's this issue: https://github.com/waku-org/go-waku/issues/1098#issuecomment-2133189282
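For anyone comparing the log time with the message time: the `timestamp` field in the log line above looks like a comma-grouped Unix time in nanoseconds (that is an assumption based on its magnitude, not something stated in this thread). A minimal sketch for converting it to UTC follows; `parse_waku_timestamp` is a hypothetical helper name, not part of status-go.

```python
from datetime import datetime, timezone


def parse_waku_timestamp(raw: str) -> datetime:
    """Convert a comma-grouped Unix timestamp in nanoseconds to a UTC datetime.

    Assumption: the value is nanoseconds since the Unix epoch.
    """
    nanos = int(raw.replace(",", ""))
    seconds, rem_ns = divmod(nanos, 1_000_000_000)
    # datetime has microsecond resolution, so sub-microsecond digits are dropped.
    base = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return base.replace(microsecond=rem_ns // 1000)


sent = parse_waku_timestamp("1,717,492,720,431,626,000")
print(sent.isoformat())  # 2024-06-04T09:18:40.431626+00:00
```

The log prefix (`06-04|12:35:36.961`) does not carry a time zone, so the gap between this value and the receive time can only be computed once the device's UTC offset is known.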

pavloburykh commented 3 months ago

@cammellos hi! Here is a fresh example from release 2.29 builds: at some point messages start being delivered with a delay.

Android_messages stuck.zip IOS_messages_stuck.zip

https://github.com/status-im/status-mobile/assets/97245802/a7730f08-350b-4d92-b22d-60a64bbe7753

cammellos commented 3 months ago

@pavloburykh wonderful, thank you!

cammellos commented 3 months ago

@pavloburykh could you set the logs to debug if possible for the next time you manage to replicate? thanks!

pavloburykh commented 3 months ago

> @pavloburykh could you set the logs to debug if possible for the next time you manage to replicate? thanks!

Hey @cammellos! Looks like we have some issues with the Debug level in release builds. This is not the first time that Debug-level logs from a release build have been missing necessary entries. I am 100% sure I set the log level to Debug before reproducing the issue. I can file a separate issue so that we can address this release log level problem.

pavloburykh commented 3 months ago

Meanwhile, I will try to reproduce the message reliability issue in a PR build tomorrow and share the logs here. Thank you for looking at the issue!