signalapp / Signal-Desktop

A private messenger for Windows, macOS, and Linux.
https://signal.org/download
GNU Affero General Public License v3.0
14.19k stars 2.59k forks source link

signal desktop crashes on startup after one minute delay #6578

Closed signal-5flat closed 10 months ago

signal-5flat commented 10 months ago

Bug Description

Upon startup, I get the usual graphic window with three little animated circles. I see this for about 60 seconds and it crashes (SIGABRT). While monitoring with "top", I notice the process steadily grows in size during this 60 seconds and gets to about 5 Gigabytes before the crash. Two additional notes:

1) For approximately the last week or two, Signal has been taking 30-40 seconds to start. It was previously very, very fast. So I am guessing that there is some kind of memory leak on startup that has gotten progressively worse and is now over some one-minute abort threshold. But that's just a wild guess.

2) Even restoring yesterday's backup of ~/Library/Application\ Support/Signal did not fix the problem which seems very strange. That should contain all the state information, no? So if it ran with that state yesterday, I don't understand why it does not run now.

Platform Info

Signal Version: 6.28.0

Operating System: MacOS Ventura 13.5

Link to Debug Log

stdout.txt apple_report.txt

I ran it with crash logs enabled, but could not get it to upload them. If you tell me where they are located on the disk, I can put them on my server and send you a link. Thanks.

signal-5flat commented 10 months ago

Forgot the log files:

logs.zip

jamiebuilds-signal commented 10 months ago

Sorry about that. Just noting some details as we diagnose this:

It looks like this was an out of memory error

Thread 0 Crashed:: CrBrowserMain Dispatch queue: com.apple.main-thread
0   libsystem_kernel.dylib                 0x193af4764 __pthread_kill + 8
1   libsystem_pthread.dylib                0x193b2bc28 pthread_kill + 288
2   libsystem_c.dylib                      0x193a39ae8 abort + 180
3   Electron Framework                     0x11380ebac node::Buffer::New(v8::Isolate*, char*, unsigned long) + 160456
4   Electron Framework                     0x11380ede0 node::OnFatalError(char const*, char const*) + 552
5   Electron Framework                     0x10e3903b0 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, v8::OOMDetails const&) + 676
6   Electron Framework                     0x10e390338 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, v8::OOMDetails const&) + 556
7   Electron Framework                     0x10e54a5a4 v8::internal::Heap::FinalizeIncrementalMarkingAtomically(v8::internal::GarbageCollectionReason) + 1484

Looks like it happened during the loader phase:

{"level":30,"time":"2023-08-16T16:53:17.086Z","msg":"starting version 6.28.0"}
{"level":30,"time":"2023-08-16T16:53:17.112Z","msg":"media access status granted granted"}
{"level":30,"time":"2023-08-16T16:53:17.113Z","msg":"got fast theme-setting value dark"}
{"level":30,"time":"2023-08-16T16:53:17.114Z","msg":"got fast theme-setting value dark"}
{"level":30,"time":"2023-08-16T16:53:17.114Z","msg":"got fast spellcheck setting true"}
{"level":30,"time":"2023-08-16T16:53:17.115Z","msg":"getSystemTraySetting had no flags and did no DB lookups. Returning DoNotUseSystemTray"}
{"level":30,"time":"2023-08-16T16:53:17.115Z","msg":"Initializing BrowserWindow config: {\"show\":false,\"width\":706,\"height\":604,\"minWidth\":300,\"minHeight\":200,\"autoHideMenuBar\":false,\"titleBarStyle\":\"hidden\",\"titleBarOverlay\":false,\"backgroundColor\":\"#121212\",\"webPreferences\":{\"devTools\":false,\"spellcheck\":true,\"enableBlinkFeatures\":\"CSSPseudoDir,CSSLogical\",\"nodeIntegration\":false,\"nodeIntegrationInWorker\":false,\"sandbox\":false,\"contextIsolation\":true,\"preload\":\"[REDACTED]/preload.bundle.js\",\"backgroundThrottling\":true,\"enablePreferredSizeMode\":true,\"disableBlinkFeatures\":\"Accelerated2dCanvas,AcceleratedSmallCanvases\"},\"icon\":\"[REDACTED]/build/icons/png/512x512.png\",\"x\":16,\"y\":505}"}
{"level":30,"time":"2023-08-16T16:53:17.156Z","msg":"spellcheck: user locales: [\"en-US\",\"en-CH\"]"}
{"level":30,"time":"2023-08-16T16:53:17.156Z","msg":"spellcheck: available spellchecker languages: [\"af\",\"bg\",\"ca\",\"cs\",\"cy\",\"da\",\"de\",\"de-DE\",\"el\",\"en\",\"en-AU\",\"en-CA\",\"en-GB\",\"en-GB-oxendict\",\"en-US\",\"es\",\"es-419\",\"es-AR\",\"es-ES\",\"es-MX\",\"es-US\",\"et\",\"fa\",\"fo\",\"fr\",\"fr-FR\",\"he\",\"hi\",\"hr\",\"hu\",\"hy\",\"id\",\"it\",\"it-IT\",\"ko\",\"lt\",\"lv\",\"nb\",\"nl\",\"pl\",\"pt\",\"pt-BR\",\"pt-PT\",\"ro\",\"ru\",\"sh\",\"sk\",\"sl\",\"sq\",\"sr\",\"sv\",\"ta\",\"tg\",\"tr\",\"uk\",\"vi\"]"}
{"level":30,"time":"2023-08-16T16:53:17.156Z","msg":"spellcheck: setting languages to: [\"en-US\",\"en\"]"}
{"level":30,"time":"2023-08-16T16:53:17.161Z","msg":"MainSQL: updateSchema:\n  Current user_version: 86;\n  Most recent db schema: 86;\n  SQLite version: 3.39.4;\n  SQLCipher version: 4.5.3 community;\n  (deprecated) schema_version: 371;\n"}
{"level":30,"time":"2023-08-16T16:53:17.456Z","msg":"got fast theme-setting value dark"}
{"level":30,"time":"2023-08-16T16:53:17.949Z","msg":"got fast spellcheck setting true"}
{"level":30,"time":"2023-08-16T16:53:17.952Z","msg":"System tray service: created"}
{"level":30,"time":"2023-08-16T16:53:17.952Z","msg":"System tray service: updating main window. Previously, there was not a window, and now there is"}
{"level":30,"time":"2023-08-16T16:53:17.952Z","msg":"System tray service: rendering no tray"}
{"level":30,"time":"2023-08-16T16:53:17.953Z","msg":"Begin ensuring permissions"}
{"level":30,"time":"2023-08-16T16:53:17.964Z","msg":"main window is ready-to-show"}
{"level":30,"time":"2023-08-16T16:53:17.966Z","msg":"showing main window"}
{"level":30,"time":"2023-08-16T16:53:17.971Z","msg":"Ensuring file permissions for 4 files"}
{"level":30,"time":"2023-08-16T16:53:17.972Z","msg":"Finish ensuring permissions in 19ms"}
{"level":30,"time":"2023-08-16T16:53:17.986Z","msg":"System tray service: rendering no tray"}
{"level":30,"time":"2023-08-16T16:53:18.678Z","msg":"MainSQL: slow query getAllStories duration=666ms"}

Question, on this line, the getAllStories() query takes a really long time, do you receive a lot of stories from your contacts?

MainSQL: slow query getAllStories duration=666ms
signal-5flat commented 10 months ago

I don't believe anyone has ever sent me a story. Not even sure what they are, to be honest. I have about 30 contacts. Mostly just friends and family sending standard texts, photos, and videos. Nothing advanced. And I belong to three group chats with a max of 10 people on them also just sharing texts and photos. I do use Signal a lot, so my database is rather large:

$ ls -l total 5250792 -rw-------@ 1 xxxxx staff 2684715008 Aug 16 15:13 db.sqlite -rw-------@ 1 xxxxx staff 32768 Aug 16 19:29 db.sqlite-shm -rw-------@ 1 xxxxx staff 2653312 Aug 16 19:30 db.sqlite-wal

Running out of memory sounds like a reasonable cause since the executable grew steadily and crashed at just over 5 Gig.

signal-5flat commented 10 months ago

Also note that the "slow query" message happens very soon after starting the app. Then there is no output for about a full minute until it crashes. So whatever it's doing after getAllStories is probably the culprit. Let me know if I can enable more logging or send a core dump or whatever.

saitejal commented 10 months ago

I'm noticing similar issues. Sometimes the app does start up but taking > 4GB memory. On macOS 13.5 with latest Signal Desktop app.

signal-5flat commented 10 months ago

You can confirm that the problem is the same by looking in the log file:

~/Library/Application Support/Signal/logs/main.log

If you see a line that looks like this:

{"level":30,"time":"2023-08-16T03:19:05.261Z","msg":"MainSQL: slow query getAllPreKeys duration=44418ms"}

Then it's the same problem. Notice that it took 44 seconds to run the function getAllPreKeys. I looked in the code and this is a simple database SELECT statement. No way it should take that long.

I also confirmed that the problem exists on an Intel Mac, so the problem is not just on Apple Silicon. I think something is very wrong with the better-sqlite3 library or with the interface between that and JavaScript. This is probably happening to a lot of users. Seems like it should be escalated.

saitejal commented 10 months ago

{"level":30,"time":"2023-08-16T00:59:00.729Z","msg":"MainSQL: slow query getAllPreKeys duration=93904ms"}

Since few days back my problem got escalated. Now the app contentiously crashes. I attempted to open a copy of sql/db.sqlite with SQLiteBrowser and I'm seeing Could not open database file. Reason: malformed database schema (messages_on_insert_insert_mentions) - near ">>": syntax error

Should I open a separate ticket?

signal-5flat commented 10 months ago

Sounds like exactly my issue, except you are a few days behind me. I wonder how many others will be crashing in the coming days.

I think Signal uses better-sqlite3 instead of standard sqlite. So hopefully that is why SQLiteBrowser could not open the file.

I tried using the better-sqlite3.node libraries that are distributed with Signal, but they are compiled with a newer API version than even the Node.js nightly builds support (built with version 116, whereas Node.js is 115). Not sure how they managed that. And not sure how to work around it.

sevens commented 10 months ago

I seem to be having a similar/the same issue on Linux (Slackware 64-bit). App opens for a minute or two with the loading screen (3 dots) and then crashes. Just checked the memory and it increases a lot, up to about 4.9GB just before it crashed.

I have various 'slow query' log entries as well, e.g. (did a clean install so only a few entries):

$   grep -r 'slow query' logs/
logs/main.log:242:{"level":30,"time":"2023-08-20T19:49:42.978Z","msg":"MainSQL: slow query getKnownMessageAttachments duration=42ms"}
logs/main.log:246:{"level":30,"time":"2023-08-20T19:49:43.118Z","msg":"MainSQL: slow query getAllStories duration=100ms"}
logs/main.log:280:{"level":30,"time":"2023-08-20T19:52:31.541Z","msg":"MainSQL: slow query getAllStories duration=103ms"}
logs/main.log:338:{"level":30,"time":"2023-08-20T19:57:26.903Z","msg":"MainSQL: slow query getAllStories duration=89ms"}
logs/main.log:371:{"level":30,"time":"2023-08-20T19:59:42.798Z","msg":"MainSQL: slow query getAllStories duration=108ms"}
logs/main.log:404:{"level":30,"time":"2023-08-20T20:03:39.297Z","msg":"MainSQL: slow query getAllStories duration=101ms"}

My database is quite large as well:

$  ls -al sql/
total 3073516
drwx------  2 user users       4096 08-10 19:45 ./
drwx------ 22 user users       4096 08-20 21:42 ../
-rw-------  1 user users 3142987776 08-19 10:38 db.sqlite
-rw-------  1 user users      32768 08-20 21:43 db.sqlite-shm
-rw-------  1 user users    4247752 08-20 21:43 db.sqlite-wal

I backed up my Signal conf dir, started Signal desktop to create a new one, shut it down, copied the attachments, database (sql/db.sqlite) and config.json as recommended in https://github.com/signalapp/Signal-Desktop/issues/5726#issuecomment-1009293197 (don't use any stickers so didn't copy those) and it didn't help. Removing the copied attachments again didn't help either further leading me to think the problem is in the DB.

Starting the app from command-line exits eventually with Trace/breakpoint trap. Running it with strace that line gets preceeded with +++ killed by SIGTRAP +++.

Also this started happening without any Signal updates (I have to update the app manually), i.e. it started going from working to broken whilst the same app version was installed.

Currently running version 6.28.0, issue started appearing with 6.27.1.

scottnonnenberg-signal commented 10 months ago

Everyone: Please upgrade to 6.29.0 or 6.30.0-beta.1 - it should solve these Out of Memory errors on startup.

saitejal commented 10 months ago

It took a bit of time of the first launch to complete (schema update?) {"level":30,"time":"2023-08-22T00:48:01.150Z","msg":"App loaded - time: 232011"} {"level":30,"time":"2023-08-22T00:48:01.150Z","msg":"SQL init - time: 228353"} {"level":30,"time":"2023-08-22T00:48:01.150Z","msg":"Preload - time: 1010"} {"level":30,"time":"2023-08-22T00:48:01.150Z","msg":"WebSocket connect - time: 227179"} {"level":30,"time":"2023-08-22T00:48:01.150Z","msg":"Processed count: 29"}

But subsequent launches back to normal: {"level":30,"time":"2023-08-22T00:50:38.998Z","msg":"App loaded - time: 3287"} {"level":30,"time":"2023-08-22T00:50:38.998Z","msg":"SQL init - time: 71"} {"level":30,"time":"2023-08-22T00:50:38.998Z","msg":"Preload - time: 799"} {"level":30,"time":"2023-08-22T00:50:38.998Z","msg":"WebSocket connect - time: 1734"} {"level":30,"time":"2023-08-22T00:50:38.998Z","msg":"Processed count: 0"} {"level":30,"time":"2023-08-22T00:50:38.998Z","msg":"Messages per second: 0"} Thank you for the note Scott!

saitejal commented 10 months ago

~Looks like images and attachments are gone? Including the group and contact images? Perhaps this should be separate ticket?~

Sorry, the copy I was working with didn't have attachments and that resulted in broken links in the messages.

signal-5flat commented 10 months ago

Everything looks good here. Many thanks for your efforts.

palant commented 10 months ago

I’ve experienced this issue as well. It turned out to be the preKeys table growing uncontrollably, which at some point caused my database to grow to 2 GB (4.5 million prekeys).

The root cause appears to be this change: https://github.com/signalapp/Signal-Desktop/commit/b6445a6af068c9121787191992019d1b03016325. The original code contained the following comment (function rotateSignedPreKey in AccountManager.ts):

        // We swallow any error here, because we don't want to get into
        //   a loop of repeated retries.

The new code no longer has this comment and does not swallow the error either. So failing to encrypt prekeys due to missing identity key causes a retry after ~2 minutes and 100 new prekeys.

For reference, excerpt from my logs:

{"level":30,"time":"2023-08-21T16:52:43.582Z","msg":"UpdateKeysListener: Next update scheduled for 2023-08-21T09:47:32.467Z"}
{"level":30,"time":"2023-08-21T16:52:43.583Z","msg":"UpdateKeysListener: Updating keys..."}
{"level":30,"time":"2023-08-21T16:52:43.583Z","msg":"GET (REST) https://chat.signal.org/v2/keys?identity=aci"}
{"level":30,"time":"2023-08-21T16:52:43.923Z","msg":"createHTTPSAgent.createConnection(chat.signal.org): connected to IPv6 addr after 335ms (attempts v4=0 v6=1)"}
{"level":30,"time":"2023-08-21T16:52:44.039Z","msg":"GET (REST) https://chat.signal.org/v2/keys?identity=aci 200 Success"}
{"level":40,"time":"2023-08-21T16:52:44.040Z","msg":"maybeUpdateLastResortKyberKey(ACI): 5 confirmed keys, most recent was created 1692611253144. No need to update."}
{"level":40,"time":"2023-08-21T16:52:44.040Z","msg":"AccountManager.maybeUpdateSignedPreKey(ACI): 17 confirmed keys, most recent was created 1692611253177. No need to update."}
{"level":30,"time":"2023-08-21T16:52:44.040Z","msg":"maybeUpdateKeys(ACI): No new keys are needed; returning early"}
{"level":30,"time":"2023-08-21T16:52:44.040Z","msg":"GET (REST) https://chat.signal.org/v2/keys?identity=pni"}
{"level":30,"time":"2023-08-21T16:52:44.155Z","msg":"GET (REST) https://chat.signal.org/v2/keys?identity=pni 200 Success"}
{"level":30,"time":"2023-08-21T16:52:44.155Z","msg":"maybeUpdateKeys(PNI): Server prekey count is 0, generating a new set"}
{"level":30,"time":"2023-08-21T16:52:44.155Z","msg":"AccountManager.generateNewPreKeys(PNI): Generating 100 new keys starting at 1982801"}
{"level":30,"time":"2023-08-21T16:52:44.247Z","msg":"maybeUpdateKeys(PNI): Server kyber prekey count is 0, generating a new set"}
{"level":30,"time":"2023-08-21T16:52:44.247Z","msg":"AccountManager.generateNewKyberPreKeys(PNI): Generating 100 new keys starting at 106"}
{"level":50,"time":"2023-08-21T16:52:44.247Z","msg":"UpdateKeysListener.run failure - trying again in five minutes Error: getIdentityKeyOrThrow: Missing identity key\n    at lL.getIdentityKeyOrThrow ([REDACTED]/resources/app.asar/preload.bundle.js:76:35696)\n    at lL.generateNewKyberPreKeys ([REDACTED]/resources/app.asar/preload.bundle.js:76:36535)\n    at [REDACTED]/resources/app.asar/preload.bundle.js:76:37221\n    at async run ([REDACTED]/resources/app.asar/preload.bundle.js:71:661790)\n    at async [REDACTED]/resources/app.asar/preload.bundle.js:71:661823\n    at async run ([REDACTED]/resources/app.asar/preload.bundle.js:43:30716)"}
scottnonnenberg-signal commented 10 months ago

@palant What version are you running?

palant commented 10 months ago

@scottnonnenberg-signal I was on 6.28.0 when the problem started occurring. Upgraded to 6.29.0 by now and cleaned out the database, can no longer see this in the logs.

scottnonnenberg-signal commented 10 months ago

Ah. 6.29.0 would have cleared out the preKeys table for you. Sorry to hear that you deleted all your data.

palant commented 10 months ago

I didn’t. I only cleared the preKeys table.

saitejal commented 10 months ago

While v6.29.0 clears the preKeys, it doesn't compact the DB. I had to run VACUUM; to shrink the DB. Thanks vbtc-exchange!

scottnonnenberg-signal commented 10 months ago

All: please do not delete your preKeys table manually. You will very likely see decryption errors in the future. For those who have updated to 6.29.0 but still see errors on startup, please provide a debug log either here, to me directly or to support@signal.org. Thanks!

scottnonnenberg-signal commented 10 months ago

I've deleted a number of comments because the suggested workarounds are quite harmful.

Please help us help you and provide your debug logs if you continue to have difficulty. If the application isn't starting, we can help you get debug logs directly from disk.

signal-5flat commented 10 months ago

Since you did not delete that comment, I assume VACUUM is ok. Maybe that should be added to the next update. My DB is quite large.

sevens commented 10 months ago

Upgraded to 6.29.0. Got a new/previously unseen 'Optimizing application' splashscreen. After a few minutes got an exit with Trace/breakpoint trap again. Last few lines from output in terminal of first start:

{"level":30,"time":"2023-08-23T09:49:18.076Z","msg":"main window is ready-to-show"}
{"level":30,"time":"2023-08-23T09:49:19.266Z","msg":"sql.initialize is taking more than three seconds; showing loading dialog"}
{"level":30,"time":"2023-08-23T09:49:19.329Z","msg":"got fast theme-setting value system"}
{"level":30,"time":"2023-08-23T09:49:32.616Z","msg":"MainSQL: updateToSchemaVersion87: Found 6846892 keys"}
{"level":30,"time":"2023-08-23T09:50:09.821Z","msg":"MainSQL: updateToSchemaVersion87: Found 6846800 preKeys for PNI"}
Trace/breakpoint trap

Second start getting same messages about keys, same 'Optimizing application' splashscreen. Same result, same output in terminal (with the exact same amount of keys).

@scottnonnenberg-signal : I've sent my debug logs to the e-mail address listed in your GitHub profile.

petterreinholdtsen commented 10 months ago

I see the same thing. Got around it by doing apt install signal-desktop=6.28.0 to revert to the last working version.

vbtc-exchange commented 10 months ago

I've deleted a number of comments because the suggested workarounds are quite harmful.

I did ask about the implications of clearing out this table. You yourself claimed 6.29.0 clears out the preKeys table (no, it does not, unless by clears out you mean something else).

I had critical communications that needed to be done, the steps I provided allowed those critical communications to be done.

Why is the database of a this messaging application growing to those ridiculous sizes anyway? Is it really required to store 2+gb of data to allow secure communications?

Signal-desktop has had these memory issues and unaccounted high CPU usage for years, seemingly unable to get them under control.

Please help us help you and provide your debug logs if you continue to have difficulty. If the application isn't starting, we can help you get debug logs directly from disk.

Here is a complete log form running 6.29.0 (before manually clearing the preKeys table):

Set Windows Application User Model ID (AUMID) { AUMID: 'org.whispersystems.signal-desktop' }
NODE_ENV production
NODE_CONFIG_DIR /usr/lib/signal-desktop/resources/app.asar/config
NODE_CONFIG {}
ALLOW_CONFIG_MUTATIONS undefined
HOSTNAME undefined
NODE_APP_INSTANCE undefined
SUPPRESS_NO_CONFIG_WARNING undefined
SIGNAL_ENABLE_HTTP undefined
userData: [redacted]/.config/Signal
config/get: Successfully read user config file
config/get: Successfully read ephemeral config file
making app single instance
{"level":30,"time":"[redacted]:16:19.004Z","msg":"app.ready: hour cycle preference: UnknownPreference"}
{"level":30,"time":"[redacted]:16:19.004Z","msg":"app.ready: preferred system locales: en-US, en"}
{"level":30,"time":"[redacted]:16:19.005Z","msg":"locale: Supported locales: af-ZA, ar, az-AZ, bg-BG, bn-BD, bs-BA, ca, cs, da, de, el, en, es, et-EE, eu, fa-IR, fi, fr, ga-IE, gl-ES, gu-IN, he, hi-IN, hr-HR, hu, id, it, ja, ka-GE, kk-KZ, km-KH, kn-IN, ko, ky-KG, lt-LT, lv-LV, mk-MK, ml-IN, mr-IN, ms, my, nb, nl, pa-IN, pl, pt-BR, pt-PT, ro-RO, ru, sk-SK, sl-SI, sq-AL, sr, sv, sw, ta-IN, te-IN, th, tl-PH, tr, ug, uk-UA, ur, vi, yue, zh-CN, zh-HK, zh-Hant"}
{"level":30,"time":"[redacted]:16:19.005Z","msg":"locale: Preferred locales:  en-US, en"}
{"level":30,"time":"[redacted]:16:19.006Z","msg":"locale: Matched locale: en"}
{"level":40,"time":"[redacted]:16:19.026Z","msg":"intl.onWarn [@formatjs/intl] \"defaultRichTextElements\" was specified but \"message\" was not pre-compiled. \nPlease consider using \"@formatjs/cli\" to pre-compile your messages for performance.\nFor more details see https://formatjs.io/docs/getting-started/message-distribution"}
{"level":30,"time":"[redacted]:16:19.026Z","msg":"locale: Text info direction for en: ltr"}
{"level":30,"time":"[redacted]:16:19.027Z","msg":"app ready"}
{"level":30,"time":"[redacted]:16:19.028Z","msg":"starting version 6.29.0"}
{"level":30,"time":"[redacted]:16:19.028Z","msg":"media access status [object Undefined] [object Undefined]"}
{"level":30,"time":"[redacted]:16:19.029Z","msg":"got fast theme-setting value light"}
{"level":30,"time":"[redacted]:16:19.030Z","msg":"got fast theme-setting value light"}
{"level":30,"time":"[redacted]:16:19.030Z","msg":"got fast spellcheck setting false"}
{"level":30,"time":"[redacted]:16:19.031Z","msg":"getSystemTraySetting had no flags and did no DB lookups. Returning DoNotUseSystemTray"}
{"level":30,"time":"[redacted]:16:19.031Z","msg":"Initializing BrowserWindow config: {\"show\":false,\"width\":1920,\"height\":1127,\"minWidth\":300,\"minHeight\":200,\"autoHideMenuBar\":false,\"titleBarStyle\":\"default\",\"titleBarOverlay\":false,\"backgroundColor\":\"#3a76f0\",\"webPreferences\":{\"devTools\":false,\"spellcheck\":false,\"enableBlinkFeatures\":\"CSSPseudoDir,CSSLogical\",\"nodeIntegration\":false,\"nodeIntegrationInWorker\":false,\"sandbox\":false,\"contextIsolation\":true,\"preload\":\"[REDACTED]/preload.bundle.js\",\"backgroundThrottling\":true,\"enablePreferredSizeMode\":true,\"disableBlinkFeatures\":\"Accelerated2dCanvas,AcceleratedSmallCanvases\"},\"icon\":\"[REDACTED]/images/signal-logo-desktop-linux.png\",\"x\":0,\"y\":29}"}
{"level":30,"time":"[redacted]:16:19.048Z","msg":"spellcheck: user locales: [\"en-US\",\"en\"]"}
{"level":30,"time":"[redacted]:16:19.048Z","msg":"spellcheck: available spellchecker languages: [\"af\",\"bg\",\"ca\",\"cs\",\"cy\",\"da\",\"de\",\"de-DE\",\"el\",\"en\",\"en-AU\",\"en-CA\",\"en-GB\",\"en-GB-oxendict\",\"en-US\",\"es\",\"es-419\",\"es-AR\",\"es-ES\",\"es-MX\",\"es-US\",\"et\",\"fa\",\"fo\",\"fr\",\"fr-FR\",\"he\",\"hi\",\"hr\",\"hu\",\"hy\",\"id\",\"it\",\"it-IT\",\"ko\",\"lt\",\"lv\",\"nb\",\"nl\",\"pl\",\"pt\",\"pt-BR\",\"pt-PT\",\"ro\",\"ru\",\"sh\",\"sk\",\"sl\",\"sq\",\"sr\",\"sv\",\"ta\",\"tg\",\"tr\",\"uk\",\"vi\"]"}
{"level":30,"time":"[redacted]:16:19.048Z","msg":"spellcheck: setting languages to: [\"en-US\",\"en\"]"}
{"level":30,"time":"[redacted]:16:19.056Z","msg":"MainSQL: updateSchema:\n  Current user_version: 86;\n  Most recent db schema: 87;\n  SQLite version: 3.39.4;\n  SQLCipher version: 4.5.3 community;\n  (deprecated) schema_version: 371;\n"}
{"level":30,"time":"[redacted]:16:19.322Z","msg":"got fast theme-setting value light"}
{"level":30,"time":"[redacted]:16:19.687Z","msg":"MainSQL: updateToSchemaVersion87: Found 5686749 keys"}
{"level":30,"time":"[redacted]:16:19.784Z","msg":"main window is ready-to-show"}
{"level":30,"time":"[redacted]:16:22.032Z","msg":"sql.initialize is taking more than three seconds; showing loading dialog"}
{"level":30,"time":"[redacted]:16:22.060Z","msg":"got fast theme-setting value light"}
{"level":30,"time":"[redacted]:16:29.068Z","msg":"MainSQL: updateToSchemaVersion87: Found 5686600 preKeys for PNI"}
zsh: trace trap (core dumped)  signal-desktop
scottnonnenberg-signal commented 10 months ago

@vbtc-exchange When Desktop 'clears out' the table, it does work to preserve the necessary items in it instead of deleting everything.

From your log output, it does look like your computer was unable to work with the database at the size it was.

Anyone still struggling with this: Please consider reaching out to me directly, or to support@signal.org and we can work more closely with you.

vbtc-exchange commented 10 months ago

I keep a copy of the original database/state, while the states may now be diverging and some loss of the complete record at his point has to be assumed, is there anything that can be done by me/everyone else in the same situation to the original state to restore its operation and clean out the database of not needed keys so that the integrity of future communications can be restored?

petterreinholdtsen commented 10 months ago

Here is stdout on a test run of version 6.29.0, ending with the crash:

Set Windows Application User Model ID (AUMID) { AUMID: 'org.whispersystems.signal-desktop' }
NODE_ENV production
NODE_CONFIG_DIR /opt/Signal/resources/app.asar/config
NODE_CONFIG {}
ALLOW_CONFIG_MUTATIONS undefined
HOSTNAME undefined
NODE_APP_INSTANCE undefined
SUPPRESS_NO_CONFIG_WARNING undefined
SIGNAL_ENABLE_HTTP undefined
userData: /home/user/.config/Signal
config/get: Successfully read user config file
config/get: Successfully read ephemeral config file
making app single instance
{"level":30,"time":"2023-08-23T16:54:54.774Z","msg":"app.ready: hour cycle preference: UnknownPreference"}
{"level":30,"time":"2023-08-23T16:54:54.777Z","msg":"app.ready: preferred system locales: en-US, en"}
{"level":30,"time":"2023-08-23T16:54:54.777Z","msg":"locale: Supported locales: af-ZA, ar, az-AZ, bg-BG, bn-BD, bs-BA, ca, cs, da, de, el, en, es, et-EE, eu, fa-IR, fi, fr, ga-IE, gl-ES, gu-IN, he, hi-IN, hr-HR, hu, id, it, ja, ka-GE, kk-KZ, km-KH, kn-IN, ko, ky-KG, lt-LT, lv-LV, mk-MK, ml-IN, mr-IN, ms, my, nb, nl, pa-IN, pl, pt-BR, pt-PT, ro-RO, ru, sk-SK, sl-SI, sq-AL, sr, sv, sw, ta-IN, te-IN, th, tl-PH, tr, ug, uk-UA, ur, vi, yue, zh-CN, zh-HK, zh-Hant"}
{"level":30,"time":"2023-08-23T16:54:54.777Z","msg":"locale: Preferred locales:  en-US, en"}
{"level":30,"time":"2023-08-23T16:54:54.782Z","msg":"locale: Matched locale: en"}
MESA-INTEL: warning: Ivy Bridge Vulkan support is incomplete
{"level":40,"time":"2023-08-23T16:54:54.847Z","msg":"intl.onWarn [@formatjs/intl] \"defaultRichTextElements\" was specified but \"message\" was not pre-compiled. \nPlease consider using \"@formatjs/cli\" to pre-compile your messages for performance.\nFor more details see https://formatjs.io/docs/getting-started/message-distribution"}
{"level":30,"time":"2023-08-23T16:54:54.848Z","msg":"locale: Text info direction for en: ltr"}
{"level":30,"time":"2023-08-23T16:54:54.851Z","msg":"app ready"}
{"level":30,"time":"2023-08-23T16:54:54.851Z","msg":"starting version 6.29.0"}
{"level":30,"time":"2023-08-23T16:54:54.852Z","msg":"media access status [object Undefined] [object Undefined]"}
{"level":30,"time":"2023-08-23T16:54:54.856Z","msg":"got fast theme-setting value light"}
{"level":30,"time":"2023-08-23T16:54:54.862Z","msg":"got fast theme-setting value light"}
{"level":30,"time":"2023-08-23T16:54:54.862Z","msg":"got fast spellcheck setting false"}
{"level":30,"time":"2023-08-23T16:54:54.864Z","msg":"getSystemTraySetting had no flags and did no DB lookups. Returning DoNotUseSystemTray"}
{"level":30,"time":"2023-08-23T16:54:54.865Z","msg":"Initializing BrowserWindow config: {\"show\":false,\"width\":781,\"height\":555,\"minWidth\":300,\"minHeight\":200,\"autoHideMenuBar\":false,\"titleBarStyle\":\"default\",\"titleBarOverlay\":false,\"backgroundColor\":\"#3a76f0\",\"webPreferences\":{\"devTools\":false,\"spellcheck\":false,\"enableBlinkFeatures\":\"CSSPseudoDir,CSSLogical\",\"nodeIntegration\":false,\"nodeIntegrationInWorker\":false,\"sandbox\":false,\"contextIsolation\":true,\"preload\":\"[REDACTED]/preload.bundle.js\",\"backgroundThrottling\":true,\"enablePreferredSizeMode\":true,\"disableBlinkFeatures\":\"Accelerated2dCanvas,AcceleratedSmallCanvases\"},\"icon\":\"[REDACTED]/images/signal-logo-desktop-linux.png\",\"x\":233,\"y\":29}"}
libva error: /usr/lib/x86_64-linux-gnu/dri/iHD_drv_video.so init failed
{"level":30,"time":"2023-08-23T16:54:54.954Z","msg":"spellcheck: user locales: [\"en-US\",\"en\"]"}
{"level":30,"time":"2023-08-23T16:54:54.954Z","msg":"spellcheck: available spellchecker languages: [\"af\",\"bg\",\"ca\",\"cs\",\"cy\",\"da\",\"de\",\"de-DE\",\"el\",\"en\",\"en-AU\",\"en-CA\",\"en-GB\",\"en-GB-oxendict\",\"en-US\",\"es\",\"es-419\",\"es-AR\",\"es-ES\",\"es-MX\",\"es-US\",\"et\",\"fa\",\"fo\",\"fr\",\"fr-FR\",\"he\",\"hi\",\"hr\",\"hu\",\"hy\",\"id\",\"it\",\"it-IT\",\"ko\",\"lt\",\"lv\",\"nb\",\"nl\",\"pl\",\"pt\",\"pt-BR\",\"pt-PT\",\"ro\",\"ru\",\"sh\",\"sk\",\"sl\",\"sq\",\"sr\",\"sv\",\"ta\",\"tg\",\"tr\",\"uk\",\"vi\"]"}
{"level":30,"time":"2023-08-23T16:54:54.954Z","msg":"spellcheck: setting languages to: [\"en-US\",\"en\"]"}
{"level":30,"time":"2023-08-23T16:54:54.972Z","msg":"MainSQL: updateSchema:\n  Current user_version: 86;\n  Most recent db schema: 87;\n  SQLite version: 3.39.4;\n  SQLCipher version: 4.5.3 community;\n  (deprecated) schema_version: 322;\n"}
{"level":30,"time":"2023-08-23T16:54:55.810Z","msg":"got fast theme-setting value light"}
{"level":30,"time":"2023-08-23T16:54:57.201Z","msg":"main window is ready-to-show"}
{"level":30,"time":"2023-08-23T16:54:57.856Z","msg":"sql.initialize is taking more than three seconds; showing loading dialog"}
{"level":30,"time":"2023-08-23T16:54:57.954Z","msg":"got fast theme-setting value light"}
{"level":30,"time":"2023-08-23T16:55:06.525Z","msg":"MainSQL: updateToSchemaVersion87: Found 2840868 keys"}
{"level":30,"time":"2023-08-23T16:55:31.925Z","msg":"MainSQL: updateToSchemaVersion87: Found 2840700 preKeys for PNI"}
Trace/breakpoint trap (core dumped)

Happy to provide help if I know what to look for.

scottnonnenberg-signal commented 10 months ago

Hey everyone - we've just released v6.29.1. We believe that it will fix this issue, even for systems currently crashing on v6.29.0. Please install it and let us know what you see!

saitejal commented 10 months ago

Would it be possible to add a routine to VACUUM; the DB periodically (and/or before closing the DB)?

petterreinholdtsen commented 10 months ago

[Scott Nonnenberg]

Hey everyone - we've just released v6.29.1. We believe that it will fix this issue, even for systems currently crashing on v6.29.0. Please install it and let us know what you see!

This version did no longer crash for me. Saw some lines about "MainSQL: updateToSchemaVersion87: Deleted 10000 items" the first run which did not show up on the second. The size of the sqlite file did not go down, though, still 1.6 GiB.

-- Happy hacking Petter Reinholdtsen

sevens commented 10 months ago

Upgraded to 6.29.1 and after a few minutes of updateToSchemaVersion87: Deleted 10000 items lines scrolling by, Signal now started and works again. Thanks for fixing!

My database didn't shrink either (didn't expect it to, noting it here as just an extra data point).

scottnonnenberg-signal commented 10 months ago

Thanks for verifying the fix, everyone! I'm going to close this.

We'll think about our approach to VACUUM; it's not something we want to do frequently, since it's a very heavyweight command.

petterreinholdtsen commented 10 months ago

[Scott Nonnenberg]

We'll think about our approach to VACUUM; it's not something we want to do frequently, since it's a very heavyweight command.

I thought it was mostly a heavyweight command if it was done infrequently?

-- Vennlig hilsen Petter Reinholdtsen

mkincaid commented 10 months ago

Hey dev team, and fellow users, Is it possible that there's a similar issue with the kyberPreKeys table needing to be cleaned up, as well as preKeys? I'm still seeing the same symptoms (unable to launch, with a crash, once memory consumption reaches about 5GB), even after updating to 6.29.1 (macOS 11.7.9, if that matters). My SQLite DB is quite large at 25.5G. VACUUM did not help (essentially no change in the file size).

Support team asked me to provide row counts for all the tables. The relevant ones are: preKeys 225 (logs also indicate the schema migration was successful, and it seems that from this count, the old ones were cleaned out with the 6.29.1 update as intended) kyberPreKeys 3034952 (this seems to be what is now causing the database to still be huge?)

Anyone else seeing the issue persist in 6.29.1?

trevor-signal commented 10 months ago

@mkincaid yes, we are aware of the issue and have a fix in beta now which should fix this issue. Sorry for the trouble here!

saitejal commented 10 months ago

VACUUM only helps after the migration is complete to compact the DB (reclaim the disk space). Based on the previous texts in this thread, its recommended not to manually drop anything.

mkincaid commented 10 months ago

So, progress here -- after updating to 6.30.1, I can get as far as the migration steps, but I get a failure in migrating to schema version 88 which ultimately still causes the app to crash after a minute or two of increasing memory usage: {"level":30,"time":"2023-09-08T15:24:52.270Z","msg":"MainSQL: updateToSchemaVersion88: updating 28899 signedPreKeys"}

When I run the development version and look at the command line output, it's clear that this is still a heap space issue: FATAL ERROR: Scavenger: semi-space copy Allocation failed - JavaScript heap out of memory

Seems like a catch-22. I assume once I get to migration 920, my signedPreKeys and kyberPreKeys will get cleaned up and vacuumed, which should solve the problem for good, but migration 88's change to signedPreKeys is failing first.

Would it possibly make sense to do this in the opposite order? eg, drop the unused/excess keys first before doing the memory-intensive step of updating them? Or maybe chunk this step somehow to use less memory inflight, similar to how the deletions are chunked?

scottnonnenberg-signal commented 9 months ago

@mkincaid Have you tried 6.30.2? We believe it will fix your startup crash.

mkincaid commented 9 months ago

6.30.2 did the trick! After the cleanup/migration my database shrunk from 24G to 250M. Thanks so much for getting this fix in quickly.

A couple of notes for others who might be experiencing the same-- The full migration took about 3 hours on a slow-ish Mac laptop. About an hour and a half of that was deleting the kyberPreKeys 10k at a time, which was easy to follow in main.log, but then at the end of migration 87 it also took about 45 minutes to complete the final signedPreKeys step (dropping the index and column created during the migration). During this time there was no periodic logging, although I could see from the time stamp that the db.sqlite file was being modified (so it wasn't hung). So, don't give up or abort if the process is slow.

Also, after the migration completed, for some reason the UI window stayed in the "three dots" loading stage. I had to quit and relaunch Signal one more time for everything to load successfully.