status-im / status-desktop

Status Desktop client made in Nim & QML
https://status.app
Mozilla Public License 2.0
297 stars 79 forks source link

Constant crash on login on 2.28.1 with `Segmentation fault: 11` #14643

Closed churik closed 4 months ago

churik commented 6 months ago

Bug Report

Description

Cannot use certain accounts on 2.28.1

Steps to reproduce

Expected behavior

can login

Actual behavior

INF 2024-05-08 16:23:54.432+02:00 mailserver not working                     topics="mailservers-service" tid=14581093 file=service.nim:161
INF 2024-05-08 16:23:54.432+02:00 removing active mailserver                 topics="mailservers-service" tid=14581093 file=service.nim:149
2024-05-08T16:23:54+0200 unset [github.com/anacrolix/torrent.(*Client).NewAnacrolixDhtServer.func2:406]: dht server on [::]:62728 (node id cef8087d8d3eedcb4e7d7798ab749fca043c0175) completed bootstrap ({NumAddrsTried:8 NumResponses:0})
2024-05-08T16:23:54.829+0200    INFO    communities/manager.go:382  failed to request missing messages  {"error": "stream reset"}
SIGSEGV: Illegal storage access. (Attempt to read from nil?)
Segmentation fault: 11

geth.log

Additional Information

friofry commented 6 months ago

I have a guess: The sigsegv could be related to access to the unitialized TorrentClient. DisableCommunityHistoryArchiveProtocol calls StopTorrentClient (m.torrentClient = null)

I have an open PR for a similar problem: https://github.com/status-im/status-go/pull/5125#discussion_r1593650620

friofry commented 6 months ago

I was unable to reproduce the crash. @churik gave me the app settings

image

image

And with this settings I cannot login anymore:

image

and logs

DBG 2024-05-09 18:06:15.180+04:00 secondary_action                           topics="app-controller" tid=3616660 file=module.nim:218 currFlow=AppLogin currState=Login
DBG 2024-05-09 18:06:15.187+04:00 secondary_action - set state               topics="app-controller" tid=3616660 file=module.nim:224 setCurrFlow=AppLogin setCurrState=LoginKeycardEnterPassword
DBG 2024-05-09 18:06:18.776+04:00 primary_action                             topics="app-controller" tid=3616660 file=module.nim:205 currFlow=AppLogin currState=LoginKeycardEnterPassword
DBG 2024-05-09 18:06:18.776+04:00 app state changed                          tid=3616660 file=view.nim:130 state=AppLoadingState
DBG 2024-05-09 18:06:18.793+04:00 database reencryption scheduled            topics="accounts-service" tid=3616660 file=service.nim:693
DBG 2024-05-09 18:06:18.794+04:00 [threadpool task thread] initiating task   topics="task-threadpool" tid=3616741 file=threadpool.nim:54 messageType=TimerTaskArg:ObjectType threadid=3616741 task="{\"$type\":\"TimerTaskArg:ObjectType\",\"timeoutInMilliseconds\":1000,\"reason\":\"\",\"vptr\":105553135869344,\"slot\":\"onWaitForReencryptionTimeout\",\"tptr\":4342669728}"
DBG 2024-05-09 18:06:19.801+04:00 starting database reencryption             topics="accounts-service" tid=3616660 file=service.nim:716
failed to set `journal_mode` pragma: file is not a database
DBG 2024-05-09 18:06:20.364+04:00 account logged in                          topics="accounts-service" tid=3616660 file=service.nim:677
ERR 2024-05-09 18:06:21.446+04:00 node.crashed                               topics="signals-manager" tid=3616660 file=signals_manager.nim:58 error="{\"type\":\"node.crashed\",\"event\":{\"error\":\"TorrentConfig.DataDir and TorrentConfig.TorrentDir cannot be \\\"\\\"\"}}"
ERR 2024-05-09 18:06:21.447+04:00 node.login                                 topics="signals-manager" tid=3616660 file=signals_manager.nim:54 error="TorrentConfig.DataDir and TorrentConfig.TorrentDir cannot be \"\""
DBG 2024-05-09 18:06:21.447+04:00 app state changed                          tid=3616660 file=view.nim:130 state=StartupState
ERR 2024-05-09 18:06:21.448+04:00 login error                                topics="app-controller" tid=3616660 file=module.nim:419 methodName=onNodeLogin errDesription="TorrentConfig.DataDir and TorrentConfig.TorrentDir cannot be \"\"" 

Working on a fix for this

friofry commented 5 months ago

I was able to reproduce the above problem simply by :

@MishkaRogachev We are probably working on the same components (#14235). I also have a problem with malformed torrent config

MishkaRogachev commented 5 months ago

@MishkaRogachev We are probably working on the same components (#14235). I also have a problem with malformed torrent config

An issue I'm working on is a little different, but I'm experimenting a lot with the history archive. I'll post an update if I manage to reproduce a similar problem.

friofry commented 5 months ago

Narrowed down the reproduction steps:

  1. create a new account
  2. enable "archive protocol"
  3. turn off "debug" flag

On step 3. The torrent config gets corrupted. Because local version of nodeConfiguration.TorrentConfig is empty on Nim side (while on status go side it has valid data). It happens because when we call SaveSettings it writes empty TorrentConfig to database.

I've added node_config.Validate() before saving to database and now it fails each time I change node_config on Nim side (missing APIModules, KeycardPairingDataFile and empty Torrent dirs)

Added the following test:

func TestConfigValidate(t *testing.T) {
    // GIVEN
    db, stop := setupTestDB(t)
    defer stop()

    tmpdir := t.TempDir()
    nodeConfig, err := params.NewNodeConfig(tmpdir, 1777)

    require.NoError(t, err)
    err = nodeConfig.Validate()
    require.NoError(t, err)
    require.NoError(t, nodecfg.SaveNodeConfig(db, nodeConfig))

    // WHEN
    dbNodeConfig, err := nodecfg.GetNodeConfigFromDB(db)
    require.NoError(t, err)

    // THEN
    err = dbNodeConfig.Validate()
    require.NoError(t, err)
}

it fails with:

Error:          Received unexpected error:
    Key: 'NodeConfig.KeycardPairingDataFile' Error:Field validation for 'KeycardPairingDataFile' failed on the 'required' tag

And indeed these fields are marked as required/validated in config.go:

KeycardPairingDataFile string validate:"required" 
APIModules string `validate:"required"`
LogLevel string `validate:"eq=ERROR|eq=WARN|eq=INFO|eq=DEBUG|eq=TRACE"`

but KeycardPairingDataFile, LogLevel columns not exist in node_config table (1640111208_nodeconfig.up.sql)

My plan:

friofry commented 5 months ago

Another issue with SaveSettings.

The TorrentConfig struct in dto/node_config.nim is:

  TorrentConfig* = object
    Enabled*: bool

and in config.go it's:

// TorrentConfig provides configuration for the BitTorrent client used for message history archives.
type TorrentConfig struct {
    // Enabled set to true enables Community History Archive protocol
    Enabled bool
    // Port number which the BitTorrent client will listen to for conntections
    Port int
    // DataDir is the file system folder Status should use for message archive torrent data.
    DataDir string
    // TorrentDir is the file system folder Status should use for storing torrent metadata files.
    TorrentDir string
}

It's converted to json and unmarshalled in value_handlers.go NodeConfigHandler and some fields are missing. I can either mark them as required and add them on the nim side.

Or change the logic and make SaveSettings override only the values that are explicitly set. We can merge the incoming json with the existing one (but this logic is not perfect). It would be useful because whenever a new field is added to NodeConfig on the status-go side and the nim side isn't updated. Saving settings will not override new values.

friofry commented 5 months ago

Quick update. I found a better approach in backend/node_config.nim, which calls status-go methods to update only the required fields. EnableCommunityHistoryArchiveSupport

And saveNodeConfiguration from settings/service.nim should be avoided. It will always break NodeConfig.

churik commented 4 months ago

FYI: on latest master (17/06) on the same account this issue is still reproducible to me.

ERR 2024-06-17 16:00:29.871+02:00 error:                                     topics="accounts-service" tid=41948855 file=service.nim:655 procName=verifyDatabasePassword errDesription="failed to set `journal_mode` pragma: file is not a database"
ERROR[06-17|16:00:30.051|github.com/status-im/status-go/api/geth_backend.go:434] failed to initialize wallet db           package=status-go/api.GethStatusBackend err="failed to set `journal_mode` pragma: file is not a database"
ERROR[06-17|16:00:30.051|github.com/status-im/status-go/mobile/status.go:319]    loginAccount failed                      error="failed to set `journal_mode` pragma: file is not a database"
ERR 2024-06-17 16:00:30.052+02:00 node.login                                 topics="signals-manager" tid=41948855 file=signals_manager.nim:54 error="failed to set `journal_mode` pragma: file is not a database"
ERR 2024-06-17 16:00:30.053+02:00 login error                                topics="app-controller" tid=41948855 file=module.nim:419 methodName=onNodeLogin errDesription="failed to set `journal_mode` pragma: file is not a database"
Warning: QQuickItem::stackAfter: Cannot stack AccountMenuItemPanel_QMLTYPE_3358(0x6000036c8dc0, parent=0x6000036f7de0, geometry=0,0 342x64) after AccountMenuItemPanel_QMLTYPE_3358(0x6000036e53f0), which must be a sibling (:0, )
ERROR[06-17|16:00:39.173|github.com/status-im/status-go/api/geth_backend.go:434] failed to initialize wallet db           package=status-go/api.GethStatusBackend err="failed to set `journal_mode` pragma: file is not a database"
failed to set `journal_mode` pragma: file is not a database
ERR 2024-06-17 16:00:39.173+02:00 error:                                     topics="accounts-service" tid=41948855 file=service.nim:655 procName=verifyDatabasePassword errDesription="failed to set `journal_mode` pragma: file is not a database"
ERROR[06-17|16:00:39.357|github.com/status-im/status-go/api/geth_backend.go:434] failed to initialize wallet db           package=status-go/api.GethStatusBackend err="failed to set `journal_mode` pragma: file is not a database"
ERROR[06-17|16:00:39.357|github.com/status-im/status-go/mobile/status.go:319]    loginAccount failed                      error="failed to set `journal_mode` pragma: file is not a database"
ERR 2024-06-17 16:00:39.358+02:00 node.login                                 topics="signals-manager" tid=41948855 file=signals_manager.nim:54 error="failed to set `journal_mode` pragma: file is not a database"
ERR 2024-06-17 16:00:39.359+02:00 login error                                topics="app-controller" tid=41948855 file=module.nim:419 methodName=onNodeLogin errDesription="failed to set `journal_mode` pragma: file is not a database"
ERROR[06-17|16:00:47.448|github.com/status-im/status-go/api/geth_backend.go:434] failed to initialize wallet db           package=status-go/api.GethStatusBackend err="failed to set `journal_mode` pragma: file is not a database"
failed to set `journal_mode` pragma: file is not a database
ERR 2024-06-17 16:00:47.448+02:00 error:                                     topics="accounts-service" tid=41948855 file=service.nim:655 procName=verifyDatabasePassword errDesription="failed to set `journal_mode` pragma: file is not a database"
WARN [06-17|16:00:47.821|github.com/status-im/status-go/api/geth_backend.go:724] fleet is not supported, overriding with default value package=status-go/api.GethStatusBackend fleet=status.prod defaultFleet=shards.test
WRN 2024-06-17 16:00:47.865+02:00 Error decoding signal                      topics="signals-manager" tid=41948855 file=signals_manager.nim:49 err="Unknown signal received: mediaserver.started"
SIGSEGV: Illegal storage access. (Attempt to read from nil?)
Segmentation fault: 11

I'll reopen the issue.

churik commented 4 months ago

ah, no, I can't :) cc @friofry @anastasiyaig

jrainville commented 4 months ago

@friofry was the fix you did only prevent that type of crash from happening again or it would also unstuck people like @churik ?

If it was the former, how big of an effort is it to make it fix broken states? If it's too hard, maybe you can just help her unblock it manually, since I don't think other people got this situation.

churik commented 4 months ago

Attaching geth.log for this user: geth.log

friofry commented 4 months ago

was the fix you did only prevent that type of crash from happening again or it would also unstuck people like It turned out I fixed a non-related crash.

But it's hard to tell what's wrong with the new crash without a database or steps to reproduce it.

The error in the original logs doesn't look the same as in the new config.

INFO [06-19|16:03:43.785|github.com/status-im/status-go/protocol/messenger.go:363]                       failed to find Waku service; trying WakuV2 error="waku service is not available"

and in the original log this line is also there, but 5 seconds before the crash.

@jrainville Yes, it would be hard to debug this without a hint. We've discussed with @churik that this test account is not critical and it's not easy to get the database. I suggest to close it with "cannot reproduce" status for now until there will be a steps to reproduce.

jrainville commented 4 months ago

Closing as we can only reproduce on one test account. We'll reopen if it happens to others