status-im / status-desktop

Status Desktop client made in Nim & QML
https://status.app
Mozilla Public License 2.0
273 stars 76 forks source link

Old account has message reliability issues #13710

Closed jrainville closed 3 months ago

jrainville commented 4 months ago

Bug Report

Description

edit: Even more people have experienced it now. It seems like sending message works all the time, but receiving never works. Receiving from store nodes works however. My instinct tells me there is a problem registering topics. The encryption keys seem fine, since people can still send messages and also because they receive them on restart.

@mprakhov experienced this issue. He had an old account that has a lot of old communities, some are the Status ones, others are test communities.

His peer count is ok, but he sometimes stops receiving messages reliably.

We think it might be the history archive starting (for possibly no reason). Then either the DB is hogged by the archive protocol or something gets broken.

igor-sirotin commented 4 months ago

My peers:

{
    "jsonrpc": "2.0",
    "id": 0,
    "result": {
        "16Uiu2HAm2M7xs7cLPc3jamawkEqbr7cUJX11uvY7LxQ6WFUdUKUT": {},
        "16Uiu2HAm9CQhsuwPR54q27kNj9iaQVfyRzTGKrhFmr94oD8ujU6P": {},
        "16Uiu2HAm9aDJPkhGxc2SFcEACTFdZ91Q5TJjp76qZEhq9iF59x7R": {},
        "16Uiu2HAmAUdrQ3uwzuE4Gy4D56hX6uLKEeerJAnhKEHZ3DxF1EfT": {},
        "16Uiu2HAmJnVR7ZzFaYvciPVafUXuYGLHPzSUigqAmeNw9nJUVGeM": {},
        "16Uiu2HAmMELCo218hncCtTvC2Dwbej3rbyHQcR8erXNnKGei7WPZ": {}
    }
}

@mprakhov old account peers (5 common peers with me)

{
    "jsonrpc": "2.0",
    "id": 0,
    "result": {
        "16Uiu2HAm2M7xs7cLPc3jamawkEqbr7cUJX11uvY7LxQ6WFUdUKUT": {},
        "16Uiu2HAm9CQhsuwPR54q27kNj9iaQVfyRzTGKrhFmr94oD8ujU6P": {},
        "16Uiu2HAm9aDJPkhGxc2SFcEACTFdZ91Q5TJjp76qZEhq9iF59x7R": {},
        "16Uiu2HAmAUdrQ3uwzuE4Gy4D56hX6uLKEeerJAnhKEHZ3DxF1EfT": {},
        "16Uiu2HAmJ7BhFmGPHvj3GqjUNELDrJPBUbiUF7kyGW4aStVdhJjJ": {},
        "16Uiu2HAmJnVR7ZzFaYvciPVafUXuYGLHPzSUigqAmeNw9nJUVGeM": {},
        "16Uiu2HAmMwiWFZa4sKz2b2ghhwaCWrrwzHsWCA6YP8Spmg7rTotS": {}
    }
}

@mprakhov new account peers (2 common peers with me)

{
    "jsonrpc": "2.0",
    "id": 0,
    "result": {
        "16Uiu2HAm2u8PwVWQNiLf54FBCRxq9RkuioXwo2mq4dxpFFsvnztF": {},
        "16Uiu2HAm8mUZ18tBWPXDQsaF7PbCKYA35z7WB2xNZH2EVq1qS8LJ": {},
        "16Uiu2HAm9aDJPkhGxc2SFcEACTFdZ91Q5TJjp76qZEhq9iF59x7R": {},
        "16Uiu2HAmAR24Mbb6VuzoyUiGx42UenDkshENVDj4qnmmbabLvo31": {},
        "16Uiu2HAmGwcE8v7gmJNEWFtZtojYpPMTHy2jBLL6xRk33qgDxFWX": {},
        "16Uiu2HAmMELCo218hncCtTvC2Dwbej3rbyHQcR8erXNnKGei7WPZ": {},
        "16Uiu2HAmVPATgQiPyT6deTMQzJo5b4QiaVXUKWRJMdbTLxkTrGeU": {}
    }
}
caybro commented 4 months ago

Not getting any new messages:

{"jsonrpc":"2.0","id":0,"result":{"16Uiu2HAm8mUZ18tBWPXDQsaF7PbCKYA35z7WB2xNZH2EVq1qS8LJ":{"protocols":["/ipfs/ping/1.0.0","/vac/waku/lightpush/2.0.0-beta1","/vac/waku/filter/2.0.0-beta1","/vac/waku/filter-subscribe/2.0.0-beta1","/vac/waku/peer-exchange/2.0.0-alpha1","/vac/waku/metadata/1.0.0","/vac/waku/relay/2.0.0","/libp2p/circuit/relay/0.2.0/hop","/rendezvous/1.0.0","/ipfs/id/1.0.0","/libp2p/autonat/1.0.0"],"addresses":["/ip4/34.135.13.87/tcp/30303/p2p/16Uiu2HAm8mUZ18tBWPXDQsaF7PbCKYA35z7WB2xNZH2EVq1qS8LJ","/dns4/boot-01.gc-us-central1-a.shards.test.statusim.net/tcp/30303/p2p/16Uiu2HAm8mUZ18tBWPXDQsaF7PbCKYA35z7WB2xNZH2EVq1qS8LJ"]},"16Uiu2HAm9aDJPkhGxc2SFcEACTFdZ91Q5TJjp76qZEhq9iF59x7R":{"protocols":["/libp2p/circuit/relay/0.2.0/hop","/vac/waku/metadata/1.0.0","/vac/waku/relay/2.0.0","/rendezvous/1.0.0","/ipfs/ping/1.0.0","/vac/waku/store/2.0.0-beta4","/ipfs/id/1.0.0","/libp2p/autonat/1.0.0"],"addresses":["/dns4/store-02.do-ams3.shards.test.statusim.net/tcp/30303/p2p/16Uiu2HAm9aDJPkhGxc2SFcEACTFdZ91Q5TJjp76qZEhq9iF59x7R","/ip4/159.223.242.154/tcp/30303/p2p/16Uiu2HAm9aDJPkhGxc2SFcEACTFdZ91Q5TJjp76qZEhq9iF59x7R","/dns4/store-02.do-ams3.shards.test.statusim.net/tcp/443/wss/p2p/16Uiu2HAm9aDJPkhGxc2SFcEACTFdZ91Q5TJjp76qZEhq9iF59x7R"]},"16Uiu2HAmAR24Mbb6VuzoyUiGx42UenDkshENVDj4qnmmbabLvo31":{"protocols":["/vac/waku/peer-exchange/2.0.0-alpha1","/vac/waku/relay/2.0.0","/rendezvous/1.0.0","/ipfs/ping/1.0.0","/vac/waku/filter/2.0.0-beta1","/vac/waku/filter-subscribe/2.0.0-beta1","/ipfs/id/1.0.0","/libp2p/autonat/1.0.0","/libp2p/circuit/relay/0.2.0/hop","/vac/waku/metadata/1.0.0","/vac/waku/lightpush/2.0.0-beta1"],"addresses":["/ip4/167.99.19.47/tcp/30303/p2p/16Uiu2HAmAR24Mbb6VuzoyUiGx42UenDkshENVDj4qnmmbabLvo31","/dns4/boot-01.do-ams3.shards.test.statusim.net/tcp/30303/p2p/16Uiu2HAmAR24Mbb6VuzoyUiGx42UenDkshENVDj4qnmmbabLvo31"]},"16Uiu2HAmGwcE8v7gmJNEWFtZtojYpPMTHy2jBLL6xRk33qgDxFWX":{"protocols":["/vac/waku/filter-subscribe/2.0.0-beta1","/libp2p/autonat/1.0.0","/rendezvous/1.0.0","/vac/waku/filter/2.0.0-beta1","/vac/waku/relay/2.0.0","/ipfs/ping/1.0.0","/vac/waku/lightpush/2.0.0-beta1","/vac/waku/peer-exchange/2.0.0-alpha1","/ipfs/id/1.0.0","/libp2p/circuit/relay/0.2.0/hop","/vac/waku/metadata/1.0.0"],"addresses":["/ip4/8.218.23.76/tcp/30303/p2p/16Uiu2HAmGwcE8v7gmJNEWFtZtojYpPMTHy2jBLL6xRk33qgDxFWX","/dns4/boot-01.ac-cn-hongkong-c.shards.test.statusim.net/tcp/30303/p2p/16Uiu2HAmGwcE8v7gmJNEWFtZtojYpPMTHy2jBLL6xRk33qgDxFWX"]},"16Uiu2HAmJnVR7ZzFaYvciPVafUXuYGLHPzSUigqAmeNw9nJUVGeM":{"protocols":["/vac/waku/relay/2.0.0","/rendezvous/1.0.0","/ipfs/ping/1.0.0","/vac/waku/store/2.0.0-beta4","/ipfs/id/1.0.0","/libp2p/autonat/1.0.0","/libp2p/circuit/relay/0.2.0/hop","/vac/waku/metadata/1.0.0"],"addresses":["/dns4/store-02.gc-us-central1-a.shards.test.statusim.net/tcp/443/wss/p2p/16Uiu2HAmJnVR7ZzFaYvciPVafUXuYGLHPzSUigqAmeNw9nJUVGeM","/dns4/store-02.gc-us-central1-a.shards.test.statusim.net/tcp/30303/p2p/16Uiu2HAmJnVR7ZzFaYvciPVafUXuYGLHPzSUigqAmeNw9nJUVGeM","/ip4/34.170.154.2/tcp/30303/p2p/16Uiu2HAmJnVR7ZzFaYvciPVafUXuYGLHPzSUigqAmeNw9nJUVGeM"]},"16Uiu2HAmMELCo218hncCtTvC2Dwbej3rbyHQcR8erXNnKGei7WPZ":{"protocols":["/rendezvous/1.0.0","/ipfs/ping/1.0.0","/vac/waku/store/2.0.0-beta4","/ipfs/id/1.0.0","/libp2p/autonat/1.0.0","/libp2p/circuit/relay/0.2.0/hop","/vac/waku/metadata/1.0.0","/vac/waku/relay/2.0.0"],"addresses":["/ip4/34.170.192.39/tcp/30303/p2p/16Uiu2HAmMELCo218hncCtTvC2Dwbej3rbyHQcR8erXNnKGei7WPZ","/dns4/store-01.gc-us-central1-a.shards.test.statusim.net/tcp/30303/p2p/16Uiu2HAmMELCo218hncCtTvC2Dwbej3rbyHQcR8erXNnKGei7WPZ","/dns4/store-01.gc-us-central1-a.shards.test.statusim.net/tcp/443/wss/p2p/16Uiu2HAmMELCo218hncCtTvC2Dwbej3rbyHQcR8erXNnKGei7WPZ"]}}}
caybro commented 4 months ago

Take 2, working "fine":

{"jsonrpc":"2.0","id":0,"result":{"16Uiu2HAm2M7xs7cLPc3jamawkEqbr7cUJX11uvY7LxQ6WFUdUKUT":{"protocols":["/libp2p/circuit/relay/0.2.0/hop","/vac/waku/metadata/1.0.0","/vac/waku/relay/2.0.0","/rendezvous/1.0.0","/ipfs/ping/1.0.0","/vac/waku/store/2.0.0-beta4","/ipfs/id/1.0.0","/libp2p/autonat/1.0.0"],"addresses":["/dns4/store-01.ac-cn-hongkong-c.shards.test.statusim.net/tcp/30303/p2p/16Uiu2HAm2M7xs7cLPc3jamawkEqbr7cUJX11uvY7LxQ6WFUdUKUT","/ip4/8.218.74.73/tcp/30303/p2p/16Uiu2HAm2M7xs7cLPc3jamawkEqbr7cUJX11uvY7LxQ6WFUdUKUT","/dns4/store-01.ac-cn-hongkong-c.shards.test.statusim.net/tcp/443/wss/p2p/16Uiu2HAm2M7xs7cLPc3jamawkEqbr7cUJX11uvY7LxQ6WFUdUKUT"]},"16Uiu2HAm9CQhsuwPR54q27kNj9iaQVfyRzTGKrhFmr94oD8ujU6P":{"protocols":["/libp2p/circuit/relay/0.2.0/hop","/vac/waku/metadata/1.0.0","/vac/waku/relay/2.0.0","/rendezvous/1.0.0","/ipfs/ping/1.0.0","/vac/waku/store/2.0.0-beta4","/ipfs/id/1.0.0","/libp2p/autonat/1.0.0"],"addresses":["/ip4/8.218.121.232/tcp/30303/p2p/16Uiu2HAm9CQhsuwPR54q27kNj9iaQVfyRzTGKrhFmr94oD8ujU6P","/dns4/store-02.ac-cn-hongkong-c.shards.test.statusim.net/tcp/443/wss/p2p/16Uiu2HAm9CQhsuwPR54q27kNj9iaQVfyRzTGKrhFmr94oD8ujU6P","/dns4/store-02.ac-cn-hongkong-c.shards.test.statusim.net/tcp/30303/p2p/16Uiu2HAm9CQhsuwPR54q27kNj9iaQVfyRzTGKrhFmr94oD8ujU6P"]},"16Uiu2HAm9aDJPkhGxc2SFcEACTFdZ91Q5TJjp76qZEhq9iF59x7R":{"protocols":["/ipfs/ping/1.0.0","/vac/waku/store/2.0.0-beta4","/ipfs/id/1.0.0","/libp2p/autonat/1.0.0","/libp2p/circuit/relay/0.2.0/hop","/vac/waku/metadata/1.0.0","/vac/waku/relay/2.0.0","/rendezvous/1.0.0"],"addresses":["/dns4/store-02.do-ams3.shards.test.statusim.net/tcp/30303/p2p/16Uiu2HAm9aDJPkhGxc2SFcEACTFdZ91Q5TJjp76qZEhq9iF59x7R","/ip4/159.223.242.154/tcp/30303/p2p/16Uiu2HAm9aDJPkhGxc2SFcEACTFdZ91Q5TJjp76qZEhq9iF59x7R","/dns4/store-02.do-ams3.shards.test.statusim.net/tcp/443/wss/p2p/16Uiu2HAm9aDJPkhGxc2SFcEACTFdZ91Q5TJjp76qZEhq9iF59x7R"]},"16Uiu2HAmAUdrQ3uwzuE4Gy4D56hX6uLKEeerJAnhKEHZ3DxF1EfT":{"protocols":["/libp2p/circuit/relay/0.2.0/hop","/vac/waku/metadata/1.0.0","/vac/waku/relay/2.0.0","/rendezvous/1.0.0","/ipfs/ping/1.0.0","/vac/waku/store/2.0.0-beta4","/ipfs/id/1.0.0","/libp2p/autonat/1.0.0"],"addresses":["/dns4/store-01.do-ams3.shards.test.statusim.net/tcp/30303/p2p/16Uiu2HAmAUdrQ3uwzuE4Gy4D56hX6uLKEeerJAnhKEHZ3DxF1EfT","/ip4/159.223.242.94/tcp/30303/p2p/16Uiu2HAmAUdrQ3uwzuE4Gy4D56hX6uLKEeerJAnhKEHZ3DxF1EfT","/dns4/store-01.do-ams3.shards.test.statusim.net/tcp/443/wss/p2p/16Uiu2HAmAUdrQ3uwzuE4Gy4D56hX6uLKEeerJAnhKEHZ3DxF1EfT"]},"16Uiu2HAmJnVR7ZzFaYvciPVafUXuYGLHPzSUigqAmeNw9nJUVGeM":{"protocols":["/libp2p/circuit/relay/0.2.0/hop","/vac/waku/metadata/1.0.0","/vac/waku/relay/2.0.0","/rendezvous/1.0.0","/ipfs/ping/1.0.0","/vac/waku/store/2.0.0-beta4","/ipfs/id/1.0.0","/libp2p/autonat/1.0.0"],"addresses":["/dns4/store-02.gc-us-central1-a.shards.test.statusim.net/tcp/30303/p2p/16Uiu2HAmJnVR7ZzFaYvciPVafUXuYGLHPzSUigqAmeNw9nJUVGeM","/ip4/34.170.154.2/tcp/30303/p2p/16Uiu2HAmJnVR7ZzFaYvciPVafUXuYGLHPzSUigqAmeNw9nJUVGeM","/dns4/store-02.gc-us-central1-a.shards.test.statusim.net/tcp/443/wss/p2p/16Uiu2HAmJnVR7ZzFaYvciPVafUXuYGLHPzSUigqAmeNw9nJUVGeM"]},"16Uiu2HAmMELCo218hncCtTvC2Dwbej3rbyHQcR8erXNnKGei7WPZ":{"protocols":["/libp2p/autonat/1.0.0","/libp2p/circuit/relay/0.2.0/hop","/vac/waku/metadata/1.0.0","/vac/waku/relay/2.0.0","/rendezvous/1.0.0","/ipfs/ping/1.0.0","/vac/waku/store/2.0.0-beta4","/ipfs/id/1.0.0"],"addresses":["/ip4/34.170.192.39/tcp/30303/p2p/16Uiu2HAmMELCo218hncCtTvC2Dwbej3rbyHQcR8erXNnKGei7WPZ","/dns4/store-01.gc-us-central1-a.shards.test.statusim.net/tcp/443/wss/p2p/16Uiu2HAmMELCo218hncCtTvC2Dwbej3rbyHQcR8erXNnKGei7WPZ","/dns4/store-01.gc-us-central1-a.shards.test.statusim.net/tcp/30303/p2p/16Uiu2HAmMELCo218hncCtTvC2Dwbej3rbyHQcR8erXNnKGei7WPZ"]}}}
caybro commented 3 months ago

Suddenly I happened to be connected to many more "active" peers:

image

and the app is acting correctly; I'm receiving new messages (in fact hundreds of them!) and ppl actually seem to be online :)

osmaczko commented 3 months ago

At first glance, it appears to be a database issue. For some reason, the database connection is terminated, causing the application to malfunction.

DEBUG[03-21|17:14:34.470|github.com/status-im/status-go/protocol/transport/filters_manager.go:549]                   could not register public chat topic     namespace=filtersManager chatID=0
x0297e69bd1fa8f19c2a484d25ed17860fa41cadacb454dbee96118491f0b0b99fe error="sql: database is closed"
ERROR[03-21|17:14:34.470|github.com/status-im/status-go/protocol/messenger_store_node_request_manager.go:249]        failed to install filter for community   error="sql: database is closed"
ERROR[03-21|17:14:34.470|github.com/status-im/status-go/services/wallet/collectibles/manager.go:737]                 fetchCommunityInfo failed                communityID=0x02b5bdaf5a25fcfe2ee14c501fab1836b8de57f61621080c3d52073d16de0d98d6 err="sql: database is closed"
osmaczko commented 3 months ago

So, after few restarts, I no longer see "database is closed" error but still can't retrieve messages from peers.

What I see instead is:

ERROR[03-21|19:18:01.438|github.com/waku-org/go-waku/waku/v2/peermanager/peer_discovery.go:66]                       failed to find peers for shard and services cluster=16 shard=0 service=/vac/waku/relay/2.0.0 error="no discv5 listener"
ERROR[03-21|19:18:01.438|github.com/waku-org/go-waku/waku/v2/peermanager/peer_discovery.go:115]                      failed to discover and conenct to peers  error="no discv5 listener"
ERROR[03-21|19:18:01.438|github.com/waku-org/go-waku/waku/v2/peermanager/peer_discovery.go:66]                       failed to find peers for shard and services cluster=16 shard=1 service=/vac/waku/relay/2.0.0 error="no discv5 listener"
ERROR[03-21|19:18:01.438|github.com/waku-org/go-waku/waku/v2/peermanager/peer_discovery.go:115]                      failed to discover and conenct to peers  error="no discv5 listener"
ERROR[03-21|19:18:01.438|github.com/waku-org/go-waku/waku/v2/peermanager/peer_discovery.go:66]                       failed to find peers for shard and services cluster=16 shard=2 service=/vac/waku/relay/2.0.0 error="no discv5 listener"
ERROR[03-21|19:18:01.438|github.com/waku-org/go-waku/waku/v2/peermanager/peer_discovery.go:115]                      failed to discover and conenct to peers  error="no discv5 listener"
ERROR[03-21|19:18:01.438|github.com/waku-org/go-waku/waku/v2/peermanager/peer_discovery.go:66]                       failed to find peers for shard and services cluster=16 shard=32 service=/vac/waku/relay/2.0.0 error="no discv5 listener"
ERROR[03-21|19:18:01.438|github.com/waku-org/go-waku/waku/v2/peermanager/peer_discovery.go:115]                      failed to discover and conenct to peers  error="no discv5 listener"
ERROR[03-21|19:18:01.438|github.com/waku-org/go-waku/waku/v2/peermanager/peer_discovery.go:66]                       failed to find peers for shard and services cluster=16 shard=64 service=/vac/waku/relay/2.0.0 error="no discv5 listener"
ERROR[03-21|19:18:01.438|github.com/waku-org/go-waku/waku/v2/peermanager/peer_discovery.go:115]                      failed to discover and conenct to peers  error="no discv5 listener"

Node management tab is showing 2 peers though :thinking:

jrainville commented 3 months ago

I checked my own geth.log for my app that works fine and I don't see that message anywhere, so you might have found a clue.

osmaczko commented 3 months ago

I think I found something meaningful. Even though, the node management tab is showing me 8 peers right now, there are E_DIAL_ERRORs for them:

$ grep E_DIAL_ERROR geth.log   
DEBUG[03-21|19:33:08.565|go.uber.org/zap/sugar.go:163]                                                               Dialback through peer 16Uiu2HAmAR24Mbb6VuzoyUiGx42UenDkshENVDj4qnmmbabLvo31 completed: err: AutoNAT error: Dial timeout (E_DIAL_ERROR) 
DEBUG[03-21|19:33:09.622|go.uber.org/zap/sugar.go:163]                                                               Dialback through peer 16Uiu2HAmKBd6crqQNZ6nKCSCpHCAwUPn3DUDmkcPSWUTyVXpxKsW completed: err: AutoNAT error: Dial timeout (E_DIAL_ERROR) 
DEBUG[03-21|19:33:11.934|go.uber.org/zap/sugar.go:163]                                                               Dialback through peer 16Uiu2HAmCDSnT8oNpMR9HH6uipD71KstYuDCAQGpek9XDAVmqdEr completed: err: AutoNAT error: Dial timeout (E_DIAL_ERROR) 
DEBUG[03-21|19:33:12.159|go.uber.org/zap/sugar.go:163]                                                               Dialback through peer 16Uiu2HAmEqqio4UR1SWqAc7KY19t6qyDvtmyjreZpzUBJvb4u65R completed: err: AutoNAT error: Dial timeout (E_DIAL_ERROR) 
DEBUG[03-21|19:33:12.751|go.uber.org/zap/sugar.go:163]                                                               Dialback through peer 16Uiu2HAmJnVR7ZzFaYvciPVafUXuYGLHPzSUigqAmeNw9nJUVGeM completed: err: AutoNAT error: Dial timeout (E_DIAL_ERROR) 
DEBUG[03-21|19:33:13.013|go.uber.org/zap/sugar.go:163]                                                               Dialback through peer 16Uiu2HAmGAA54bBTE78MYidSy3P7Q9yAWFNTAEReJYD69VRvtL5r completed: err: AutoNAT error: Dial timeout (E_DIAL_ERROR) 
DEBUG[03-21|19:33:13.988|go.uber.org/zap/sugar.go:163]                                                               Dialback through peer 16Uiu2HAm2M7xs7cLPc3jamawkEqbr7cUJX11uvY7LxQ6WFUdUKUT completed: err: AutoNAT error: Dial timeout (E_DIAL_ERROR) 
DEBUG[03-21|19:33:20.028|go.uber.org/zap/sugar.go:163]                                                               Dialback through peer 16Uiu2HAmMELCo218hncCtTvC2Dwbej3rbyHQcR8erXNnKGei7WPZ completed: err: AutoNAT error: Dial timeout (E_DIAL_ERROR)
osmaczko commented 3 months ago

These 6 peers seems to store nodes.

{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "16Uiu2HAm2M7xs7cLPc3jamawkEqbr7cUJX11uvY7LxQ6WFUdUKUT": {
      "protocols": [
        "/rendezvous/1.0.0",
        "/ipfs/ping/1.0.0",
        "/vac/waku/store/2.0.0-beta4",
        "/ipfs/id/1.0.0",
        "/libp2p/autonat/1.0.0",
        "/libp2p/circuit/relay/0.2.0/hop",
        "/vac/waku/metadata/1.0.0",
        "/vac/waku/relay/2.0.0"
      ],
      "addresses": [
        "/dns4/store-01.ac-cn-hongkong-c.shards.test.status.im/tcp/443/wss/p2p/16Uiu2HAm2M7xs7cLPc3jamawkEqbr7cUJX11uvY7LxQ6WFUdUKUT",
        "/dns4/store-01.ac-cn-hongkong-c.shards.test.statusim.net/tcp/30303/p2p/16Uiu2HAm2M7xs7cLPc3jamawkEqbr7cUJX11uvY7LxQ6WFUdUKUT",
        "/ip4/8.218.74.73/tcp/30303/p2p/16Uiu2HAm2M7xs7cLPc3jamawkEqbr7cUJX11uvY7LxQ6WFUdUKUT",
        "/dns4/store-01.ac-cn-hongkong-c.shards.test.status.im/tcp/30303/p2p/16Uiu2HAm2M7xs7cLPc3jamawkEqbr7cUJX11uvY7LxQ6WFUdUKUT"
      ]
    },
    "16Uiu2HAm3xVDaz6SRJ6kErwC21zBJEZjavVXg7VSkoWzaV1aMA3F": {
      "protocols": [
        "/ipfs/id/1.0.0",
        "/libp2p/autonat/1.0.0",
        "/libp2p/circuit/relay/0.2.0/hop",
        "/vac/waku/metadata/1.0.0",
        "/vac/waku/relay/2.0.0",
        "/rendezvous/1.0.0",
        "/ipfs/ping/1.0.0",
        "/vac/waku/store/2.0.0-beta4"
      ],
      "addresses": [
        "/ip4/24.144.78.119/tcp/30303/p2p/16Uiu2HAm3xVDaz6SRJ6kErwC21zBJEZjavVXg7VSkoWzaV1aMA3F",
        "/dns4/store-01.do-ams3.shards.staging.status.im/tcp/30303/p2p/16Uiu2HAm3xVDaz6SRJ6kErwC21zBJEZjavVXg7VSkoWzaV1aMA3F",
        "/dns4/store-01.do-ams3.shards.staging.status.im/tcp/443/wss/p2p/16Uiu2HAm3xVDaz6SRJ6kErwC21zBJEZjavVXg7VSkoWzaV1aMA3F"
      ]
    },
    "16Uiu2HAm9aDJPkhGxc2SFcEACTFdZ91Q5TJjp76qZEhq9iF59x7R": {
      "protocols": [
        "/libp2p/autonat/1.0.0",
        "/libp2p/circuit/relay/0.2.0/hop",
        "/vac/waku/metadata/1.0.0",
        "/vac/waku/relay/2.0.0",
        "/rendezvous/1.0.0",
        "/ipfs/ping/1.0.0",
        "/vac/waku/store/2.0.0-beta4",
        "/ipfs/id/1.0.0"
      ],
      "addresses": [
        "/ip4/159.223.242.154/tcp/30303/p2p/16Uiu2HAm9aDJPkhGxc2SFcEACTFdZ91Q5TJjp76qZEhq9iF59x7R",
        "/dns4/store-02.do-ams3.shards.test.status.im/tcp/30303/p2p/16Uiu2HAm9aDJPkhGxc2SFcEACTFdZ91Q5TJjp76qZEhq9iF59x7R",
        "/dns4/store-02.do-ams3.shards.test.status.im/tcp/443/wss/p2p/16Uiu2HAm9aDJPkhGxc2SFcEACTFdZ91Q5TJjp76qZEhq9iF59x7R",
        "/dns4/store-02.do-ams3.shards.test.statusim.net/tcp/30303/p2p/16Uiu2HAm9aDJPkhGxc2SFcEACTFdZ91Q5TJjp76qZEhq9iF59x7R"
      ]
    },
    "16Uiu2HAmJnVR7ZzFaYvciPVafUXuYGLHPzSUigqAmeNw9nJUVGeM": {
      "protocols": [
        "/ipfs/ping/1.0.0",
        "/vac/waku/store/2.0.0-beta4",
        "/ipfs/id/1.0.0",
        "/libp2p/autonat/1.0.0",
        "/libp2p/circuit/relay/0.2.0/hop",
        "/vac/waku/metadata/1.0.0",
        "/vac/waku/relay/2.0.0",
        "/rendezvous/1.0.0"
      ],
      "addresses": [
        "/dns4/store-02.gc-us-central1-a.shards.test.statusim.net/tcp/30303/p2p/16Uiu2HAmJnVR7ZzFaYvciPVafUXuYGLHPzSUigqAmeNw9nJUVGeM",
        "/ip4/34.170.154.2/tcp/30303/p2p/16Uiu2HAmJnVR7ZzFaYvciPVafUXuYGLHPzSUigqAmeNw9nJUVGeM",
        "/dns4/store-02.gc-us-central1-a.shards.test.status.im/tcp/30303/p2p/16Uiu2HAmJnVR7ZzFaYvciPVafUXuYGLHPzSUigqAmeNw9nJUVGeM",
        "/dns4/store-02.gc-us-central1-a.shards.test.status.im/tcp/443/wss/p2p/16Uiu2HAmJnVR7ZzFaYvciPVafUXuYGLHPzSUigqAmeNw9nJUVGeM"
      ]
    },
    "16Uiu2HAmKBd6crqQNZ6nKCSCpHCAwUPn3DUDmkcPSWUTyVXpxKsW": {
      "protocols": [
        "/libp2p/circuit/relay/0.2.0/hop",
        "/vac/waku/metadata/1.0.0",
        "/vac/waku/relay/2.0.0",
        "/rendezvous/1.0.0",
        "/ipfs/ping/1.0.0",
        "/vac/waku/store/2.0.0-beta4",
        "/ipfs/id/1.0.0",
        "/libp2p/autonat/1.0.0"
      ],
      "addresses": [
        "/dns4/store-02.gc-us-central1-a.shards.staging.status.im/tcp/30303/p2p/16Uiu2HAmKBd6crqQNZ6nKCSCpHCAwUPn3DUDmkcPSWUTyVXpxKsW",
        "/dns4/store-02.gc-us-central1-a.shards.staging.status.im/tcp/443/wss/p2p/16Uiu2HAmKBd6crqQNZ6nKCSCpHCAwUPn3DUDmkcPSWUTyVXpxKsW",
        "/ip4/34.72.140.183/tcp/30303/p2p/16Uiu2HAmKBd6crqQNZ6nKCSCpHCAwUPn3DUDmkcPSWUTyVXpxKsW"
      ]
    },
    "16Uiu2HAmU7xtcwytXpGpeDrfyhJkiFvTkQbLB9upL5MXPLGceG9K": {
      "protocols": [
        "/ipfs/ping/1.0.0",
        "/vac/waku/store/2.0.0-beta4",
        "/ipfs/id/1.0.0",
        "/libp2p/autonat/1.0.0",
        "/libp2p/circuit/relay/0.2.0/hop",
        "/vac/waku/metadata/1.0.0",
        "/vac/waku/relay/2.0.0",
        "/rendezvous/1.0.0"
      ],
      "addresses": [
        "/ip4/47.76.178.164/tcp/30303/p2p/16Uiu2HAmU7xtcwytXpGpeDrfyhJkiFvTkQbLB9upL5MXPLGceG9K",
        "/dns4/store-02.ac-cn-hongkong-c.shards.staging.status.im/tcp/30303/p2p/16Uiu2HAmU7xtcwytXpGpeDrfyhJkiFvTkQbLB9upL5MXPLGceG9K",
        "/dns4/store-02.ac-cn-hongkong-c.shards.staging.status.im/tcp/443/wss/p2p/16Uiu2HAmU7xtcwytXpGpeDrfyhJkiFvTkQbLB9upL5MXPLGceG9K"
      ]
    }
  }
}
jrainville commented 3 months ago

I can now reproduce the issue as well and I see [AutoNAT error](err: AutoNAT error: Dial timeout)` as well

jrainville commented 3 months ago

If I check the old geth.log, I don't see that error, so it's new since I restarted and since I can reproduce, so it seems like a real clue.

Let me know if you want my peer list

osmaczko commented 3 months ago

Since my peers are store nodes, I decided to use fetch messages from chat's kebab menu, with hope it will fetch missing messages. What I get instead is endless loop of errors:

ERROR[03-21|19:56:33.774|github.com/status-im/status-go/wakuv2/common/message.go:233]                                failed to decode message                 err="couldn't decrypt using asymmetric key: ecies: invalid message"
ERROR[03-21|19:56:33.824|github.com/status-im/status-go/wakuv2/common/message.go:233]                                failed to decode message                 err="couldn't decrypt using asymmetric key: ecies: invalid message"
ERROR[03-21|19:56:33.887|github.com/status-im/status-go/wakuv2/common/message.go:233]                                failed to decode message                 err="couldn't decrypt using asymmetric key: ecies: invalid message"
ERROR[03-21|19:56:33.943|github.com/status-im/status-go/wakuv2/common/message.go:233]                                failed to decode message                 err="couldn't decrypt using asymmetric key: ecies: invalid message"
ERROR[03-21|19:56:34.007|github.com/status-im/status-go/wakuv2/common/message.go:233]                                failed to decode message                 err="couldn't decrypt using asymmetric key: ecies: invalid message"
ERROR[03-21|19:56:34.100|github.com/status-im/status-go/wakuv2/common/message.go:233]                                failed to decode message                 err="couldn't decrypt using asymmetric key: ecies: invalid message"
ERROR[03-21|19:56:34.141|github.com/status-im/status-go/wakuv2/common/message.go:233]                                failed to decode message                 err="couldn't decrypt using asymmetric key: ecies: invalid message"
ERROR[03-21|19:56:34.198|github.com/status-im/status-go/wakuv2/common/message.go:233]                                failed to decode message                 err="couldn't decrypt using asymmetric key: ecies: invalid message"
ERROR[03-21|19:56:34.252|github.com/status-im/status-go/wakuv2/common/message.go:233]                                failed to decode message                 err="couldn't decrypt using asymmetric key: ecies: invalid message"
osmaczko commented 3 months ago

@richard-ramos could you please take a look at the above investigation and give me any clues? Not sure which of these are serious problems and which are false-positives..

jrainville commented 3 months ago

I restarted my app and now I can receive messages again. Obviously I could fetch from store nodes as is listed in the issue, but then I tried messaging myself in 1-1 and in the community and both worked.

It makes me wonder if maybe there is a chance that we connect to "bad" peers? Like Patryk said, if we only connect to store nodes as peers, maybe that causes the problem?

richard-ramos commented 3 months ago

@osmaczko i find it curious that you have peers from shards.staging and shards.test at the same time

jrainville commented 3 months ago

Here are my peers when the app is working:

{
  "jsonrpc": "2.0",
  "id": 0,
  "result": {
    "16Uiu2HAm2M7xs7cLPc3jamawkEqbr7cUJX11uvY7LxQ6WFUdUKUT": {},
    "16Uiu2HAm8mUZ18tBWPXDQsaF7PbCKYA35z7WB2xNZH2EVq1qS8LJ": {},
    "16Uiu2HAmAR24Mbb6VuzoyUiGx42UenDkshENVDj4qnmmbabLvo31": {},
    "16Uiu2HAmAUdrQ3uwzuE4Gy4D56hX6uLKEeerJAnhKEHZ3DxF1EfT": {},
    "16Uiu2HAmB7Ur9HQqo3cWDPovRQjo57fxWWDaQx27WxSzDGhN4JKg": {},
    "16Uiu2HAmEqqio4UR1SWqAc7KY19t6qyDvtmyjreZpzUBJvb4u65R": {},
    "16Uiu2HAmGwcE8v7gmJNEWFtZtojYpPMTHy2jBLL6xRk33qgDxFWX": {},
    "16Uiu2HAmKBd6crqQNZ6nKCSCpHCAwUPn3DUDmkcPSWUTyVXpxKsW": {},
    "16Uiu2HAmMELCo218hncCtTvC2Dwbej3rbyHQcR8erXNnKGei7WPZ": {},
    "16Uiu2HAmU7xtcwytXpGpeDrfyhJkiFvTkQbLB9upL5MXPLGceG9K": {}
  }
}

Here are Patryk's:

{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "16Uiu2HAm2M7xs7cLPc3jamawkEqbr7cUJX11uvY7LxQ6WFUdUKUT": {},
    "16Uiu2HAmAUdrQ3uwzuE4Gy4D56hX6uLKEeerJAnhKEHZ3DxF1EfT": {},
    "16Uiu2HAmJnVR7ZzFaYvciPVafUXuYGLHPzSUigqAmeNw9nJUVGeM": {},
    "16Uiu2HAmKBd6crqQNZ6nKCSCpHCAwUPn3DUDmkcPSWUTyVXpxKsW": {},
    "16Uiu2HAmU7xtcwytXpGpeDrfyhJkiFvTkQbLB9upL5MXPLGceG9K": {}
  }
}
jrainville commented 3 months ago

I let my app run for a few hours (locked screen) and when I came back, I could no longer receive messages. I did't close the app since my previous message above, so we can compare the peers:

{
  "jsonrpc": "2.0",
  "id": 0,
  "result": {
    "16Uiu2HAm2M7xs7cLPc3jamawkEqbr7cUJX11uvY7LxQ6WFUdUKUT": {},
    "16Uiu2HAm3xVDaz6SRJ6kErwC21zBJEZjavVXg7VSkoWzaV1aMA3F": {},
    "16Uiu2HAm9CQhsuwPR54q27kNj9iaQVfyRzTGKrhFmr94oD8ujU6P": {},
    "16Uiu2HAm9aDJPkhGxc2SFcEACTFdZ91Q5TJjp76qZEhq9iF59x7R": {},
    "16Uiu2HAmAUdrQ3uwzuE4Gy4D56hX6uLKEeerJAnhKEHZ3DxF1EfT": {},
    "16Uiu2HAmCDSnT8oNpMR9HH6uipD71KstYuDCAQGpek9XDAVmqdEr": {
      "addresses": null
    },
    "16Uiu2HAmMELCo218hncCtTvC2Dwbej3rbyHQcR8erXNnKGei7WPZ": {},
    "16Uiu2HAmU7xtcwytXpGpeDrfyhJkiFvTkQbLB9upL5MXPLGceG9K": {
      "addresses": null
    }
  }
}

See how two of them have null addresses. I'm not sure what that means

osmaczko commented 3 months ago

I switched back to 2.27.0 RC6 and created new account, messaging is still broken. Logs say the same:

DEBUG[03-22|15:14:36.758|go.uber.org/zap/sugar.go:163]                                                               Dialback through peer 16Uiu2HAmAR24Mbb6VuzoyUiGx42UenDkshENVDj4qnmmbabLvo3
1 completed: err: AutoNAT error: Dial timeout (E_DIAL_ERROR) 
DEBUG[03-22|15:14:37.928|go.uber.org/zap/sugar.go:163]                                                               Dialback through peer 16Uiu2HAmB7Ur9HQqo3cWDPovRQjo57fxWWDaQx27WxSzDGhN4JK
g completed: err: AutoNAT error: Dial timeout (E_DIAL_ERROR) 
DEBUG[03-22|15:14:40.240|go.uber.org/zap/sugar.go:163]                                                               Dialback through peer 16Uiu2HAmKBd6crqQNZ6nKCSCpHCAwUPn3DUDmkcPSWUTyVXpxKs
W completed: err: AutoNAT error: Dial timeout (E_DIAL_ERROR) 

Maybe it is the issue of store nodes and boot nodes themselves?

osmaczko commented 3 months ago

Root cause was https://github.com/status-im/status-go/pull/4961

jrainville commented 3 months ago

Fixed here https://github.com/status-im/status-go/pull/4975