Open apiziali opened 2 years ago
I have same issue on Debian 11 with Xfce. Screen lock will also cause same issue. Can provide debug log if required.
@apiziali So, at a high level I can say that when Desktop is not staying up to date with incoming messages after resume, it is in fact unable to access its database at all (we timeout after two minutes waiting to talk to our database). It makes me think that on your machine, inter-process communication (IPC, a core part of our Electron foundations) is broken. This is 100% reproducible for you? Can you tell us more about your linux install, window manager, etc.?
@timp34 Can we get a log from you?
Scott,
I am not familiar with the implementation of Signal, so I cannot answer your question about its database. However, I can say that sometimes, but not always, messages that have been delivered to Signal Android on my phone are not delivered to Signal Desktop on my workstation after a resume.
These are the details of my Linux desktop:
OS : Ubuntu 18.04.6 LTS
uname -a : Linux elijah 5.4.0-124-generic #140~18.04.1-Ubuntu
SMP Fri Aug 5 11:43:34 UTC 2022 x86_64 x86_64
x86_64 GNU/Linux
Window Mgr : Stock Gnome 3
Workstation : Dell Precision 3630 Tower Workstation
CPU : Intel Core i5-8500 CPU (6 core, 3.00GHz)
Memory : 32 GB
I posted the following log files at the time I posted the bug report (September 1):
https://debuglogs.org/desktop/5.56.0/33574877e6a4c8cdbd74fec3f7f09269a6c0281d6bb5a25e0a0db48626841da5.gz
https://debuglogs.org/android/5.47.3/83a1078f72b089d5d0f5996e170a937b768ab035e7ab9456d2141516ce61d8a4
debuglog.txt : https://github.com/signalapp/Signal-Desktop/files/9474416/debuglog.txt
How else may I assist?
Thanks!
On 9/14/22 9:28 AM, Scott Nonnenberg wrote:
@apiziali https://github.com/apiziali So, at a high level I can say that when Desktop is not staying up to date with incoming messages after resume, it is in fact unable to access its database at all (we timeout after two minutes waiting to talk to our database). It makes me think that on your machine, inter-process communication (IPC, a core part of our Electron foundations) is broken. This is 100% reproducible for you? Can you tell us more about your linux install, window manager, etc.?
@timp34 https://github.com/timp34 Can we get a log from you?
— Reply to this email directly, view it on GitHub https://github.com/signalapp/Signal-Desktop/issues/6103#issuecomment-1247017970, or unsubscribe https://github.com/notifications/unsubscribe-auth/AGLF2LR4KKGUZFCZQCRXOI3V6H4LNANCNFSM6AAAAAAQC4E76I. You are receiving this because you were mentioned.Message ID: @.***>
-- | @.** ____------+------____ / \ ---*
PGP Key: h t t p s : / / b i t . l y / A n d y P G P K e y
"Computers in the future may weigh no more than 1.5 tons." -- Popular Mechanics, forecasting the relentless march of science, 1949.
Here is my log. This is whilst signal appears to be working ok. Will add another shortly when it is "not" working correctly... :) https://debuglogs.org/desktop/5.58.0/659cdb98ebc7cdf5e53a27ac6529ddb95440c79a642a1a4850517500cf6ffbfb.gz
Rather than habitually restarting Signal Desktop each time I resume the workstation, I just resumed without restarting it. After sending a "Note to Self" from my phone, it did not appear on the desktop. I immediately captured a debug log from the desktop:
here is debug log for desktop that is now not sending or receiving after computer has been "suspended". Linked phone has new messages which are not on desktop version, hope this helps... https://debuglogs.org/desktop/5.58.0/616fb48cf70528e9e7a6d3b7cf68e4a50e917367c37a9aad34b79ec1f5b5989c.gz
I just noticed another clue. When Signal Desktop does not reflect recent messages delivered to Signal Android, the Ubuntu notification system still displays the first line or two of new messages. If one of those summaries is clicked, a GSConnect window, "Messaging: Andy's Phone," opens. Its left pane displays "No Conversations" which its right pane displays "No conversation selected. Select or start a conversation." GSConnect notifications are enabled in the Gnome notification settings. However, in Tweaks, GSConnect is turned off.
This may just reflect that GSConnect is reporting new messages received by Signal Android, so I could disable the GSConnect notifications on the Ubuntu workstation.
I've disabled GSConnect notifications on my workstation until we resolve this Signal Desktop synchronization failure.
FYI, GSConnect (https://bit.ly/3BrRUIg) is a re-implementation of KDE Connect for Gnome, which provides phone/desktop communication and control.
Very interesting! Thank you for investigating it. I'll file a bug internally, but would you care to submit this to Electron as well? It sounds like they might be interested in it too!
It looks to me as though GSConnect reporting new Signal Android messages, even though GSConnect is disabled in Gnome, is a red herring with regard to Signal Desktop and Signal Android losing synchronization. We need to understand why synchronization is lost.
@apiziali @timp34 Hey there! We now have an Electron Fiddle ready for you to test.
View -> Toggle Developer Tools
and watch the IPC ping
/pong
events go by.ping
/pong
events continue, or just ping
with no pong
. If just ping
then IPC has broken, just like it has with your Signal Desktop.You can try lots of different versions of Electron really easily within Electron Fiddle, and you can also create packaged versions to see if that makes a difference too (Tasks -> Package Fiddle
).
Please let us know what you find! This is the kind of thing that will give us really useful information to pass to Electron so they can get things fixed!
Fantastic! Unfortunately, I am tied up for about ten days so I will pick this up around October 3.
Thank you!
On September 23, 2022 4:58:05 PM MST, Scott Nonnenberg @.***> wrote:
@apiziali @timp34 Hey there! We now have an Electron Fiddle ready for you to test.
- First, download Electron Fiddle here: https://www.electronjs.org/fiddle
- Then load up this Gist inside Electron Fiddle: https://gist.github.com/scottnonnenberg-signal/fc5174e6015c61fdae49ffe767d441bc
- Then start it up on a recent version of Electron. Currently Signal Desktop is on 20.1.4.
- Open the dev tools with
View -> Toggle Developer Tools
and watch the IPCping
/pong
events go by.- Suspend your linux machine, then resume it. See if the
ping
/pong
events continue, or justping
with nopong
. If justping
then IPC has broken, just like it has with your Signal Desktop.You can try lots of different versions of Electron really easily within Electron Fiddle, and you can also create packaged versions to see if that makes a difference too (
Tasks -> Package Fiddle
).Please let us know what you find! This is the kind of thing that will give us really useful information to pass to Electron so they can get things fixed!
-- Andrew Piziali @.***> h t t p s : / / b i t . l y / A n d y P G P K e y
"The Great Reset represents the development of the Chinese system in the West, but in reverse. Whereas the Chinese political class began with a socialist political system and then introduced privately held for-profit production, the West began with capitalism and is now implementing a Chinese-style political system." -- Michael Rectenwald, "What Is The Great Reset?," "Imprimis," December 2021
Scott, following your instructions, I ran the test using Electron 20.1.4 and annotated the results in your message below. As you can see, without even introducing suspend/resume, no "pong" event are reported in the console:
=============== Hey there! We now have an Electron Fiddle ready for you to test.
First, download Electron Fiddle here: https://www.electronjs.org/fiddle -- Done
Then load up this Gist inside Electron Fiddle: https://gist.github.com/scottnonnenberg-signal/fc5174e6015c61fdae49ffe767d441bc -- Downloaded -- Unzipped
Then start it up on a recent version of Electron. Currently Signal Desktop is on 20.1.4. -- Started Electron Fiddle ("electron-fiddle >/tmp/ef.log 2>&1 &") -- Downloaded Electron version 20.1.4 from File > Preferences > Electron -- Changed the version of Electron to 20.1.4 -- Load the Gist: File > Open > /home/andy/Signal Suspend-Resume Failure/Gist/\ fc5174e6015c61fdae49ffe767d441bc-8ff8c17d37c693586c08ea16704a49ca259a23af -- Open the console -- Run the Gist: [Run]
Open the dev tools with View -> Toggle Developer Tools and watch the IPC ping/pong events go by. -- Only "pings" are logged in the console, even though no suspend/resume cycles have been initiated:
------------
Console ready
Saving files to temp directory...
Saved files to /tmp/tmp-18764-ZrmqHuvlUq40
Electron v20.2.0 started.
main ping #1
main ping #2
main ping #3
main ping #4
main ping #5
main ping #6
main ping #7
main ping #8
main ping #9
main ping #10
main ping #11
main ping #12
main ping #13
main ping #14
main ping #15
main ping #16
main ping #17
main ping #18
main ping #19
main ping #20
main ping #21
main ping #22
main ping #23
main ping #24
main ping #25
main ping #26
main ping #27
main ping #28
main ping #29
main ping #30
main ping #31
main ping #32
main ping #33
main ping #34
Electron exited with code 0.
------------
...
@apiziali Your pasted log output is from the main process, which isn't where you'll see the 'pong' result. You need to start up the app, and open the dev tools with View -> Toggle Developer Tools
and you should see both 'ping' and 'pong.'
Scott,
There are no “pings,” nor “pongs” logged in the developer console (see attached).
I have not changed any of the default Electron Fiddle settings.
Also, "Toggle Developer Tools" is at: Help > Toggle Developer Tools
I think you need to run the Electron Fiddle and enable the dev tools in the running application, not the Electron Fiddle editor application.
I have never used Electron Fiddle before, so I do not know how to enable the developer tools in the running application.
Open the View
menu and click Toggle Developer Tools
in the running application when you choose to Run the code with Electron Fiddle. You can also package the application up and run it as a standalone application, maybe more easily tested.
Okay. I now realize that you are referring to View > Toggle Developer Tools in the application (Chrome?) menus, not in the Electron Fiddle menus. I've run four suspend/resume cycles, at 1-, 2-, 4- and 8-minute intervals between the suspend and resume. In each case, the ping/pong continues. What log files do you want me to capture, and how do I do so? Thanks!
@apiziali The best thing to do would be to do whatever seemed to cause the problem with Signal Desktop. We're looking to replicate that suspend/resume problem, which we believe would mean that the ping/pong would become just ping.
Curiously, while running these tests, without restarting Signal, it continued to remain synchronized with Signal Android, as indicated by "Message to self" and a message sent from another user.
Alright, I will just leave this application running as we use the workstation normally, suspending and resuming it throughout the day. Shall I just copy-and-paste the application console log when Signal Desktop loses synchronization with Signal Android?
@apiziali Certainly we're interested in whether this behavior still happens for you on Signal Desktop. The key bit of data we're looking for is whether, when it happens on Signal Desktop, it also happens on Electron Fiddle. We're looking for the minimum bit of code that can cause the weirdness on your system to trigger.
Signal Desktop has not lost synchronization with Signal Android since I started running your Gist. I then stopped the Gist in Electron Fiddle, but continued using the workstation and Signal as usual. Still, Signal Desktop remains synchronized. I quit the Electron Fiddle application and will monitor the behavior of Signal Desktop during the next few days of multiple suspend/resume cycles.
This is a classic problem, right? You begin to investigate and diagnose an issue and it ceases to exist. Causal? Who knows?
@apiziali It would be preferable to us if you kept the electron fiddle app running in the background all the time along with Signal Desktop, just to gather as much data as possible.
Okay. I'll start it up again and we'll see if Signal loses synchronization. If it does, we'll have the fiddle app log.
BTW, where will I find this log file?
@apiziali When Signa Desktop stops working, all we need is the Signal Desktop log and the yes/no of whether Electron Fiddle is doing the full ping/pong in its dev tools, or just ping.
Good news! My wife attempted to use Signal Desktop on Oct. 7, 2022 at 1800 MST (UTC-0700) without success. The fiddle application was running. I attached the debug log from her running Signal ("Debbie Signal Debug Log.txt"). You might note from the initial post of this bug report that this is a multiuser workstation, two users always logged in, each running their own Signal Desktop.
In the fiddle app, after a sequence of ping/pongs from #1 to #243, there are only pings starting at #244. However, I believe that anomaly occurred long before this Signal malfunction. By #13440, the ping/pong sequence continues as normal. Between these times, at the #6000 range of fiddle app messages, there are only "pong" messages!
Lastly, I just verified that my wife's Signal Desktop is still not synchronized with her Signal Android at 1930 MST.
Okay. I just saw Signal Desktop and Signal Android lose synchronization on my Ubuntu login. The fiddle app is still ping/pong-ing along, but a note-to-self from the phone was not delivered to the desktop. The Signal Desktop debug log ("Andy Signal Debug Log.txt") and Signal Android debug log ("Andy Signal Android Debug Log.zip") are attached.
[Andy Signal Debug Log.txt](https://github.com/signalapp/Signal-Desktop/files/9740446/Andy.Signal.Debug.Log.txt]
[Andy Signal Android Debug Log.zip](https://github.com/signalapp/Signal-Desktop/files/9740451/Andy.Signal.Android.Debug.Log.zip]
Another desynchronization this afternoon. The fiddle app was continuously ping/pong-ing but Signal Desktop did not receive a note-to-self sent from Signal Android. The desktop and Android log files are attached.
Andy Signal Android Debug Log 221009.zip Andy Signal Desktop Debug Log 221009.txt
This desynchronization was revealed by the failure of a note-to-self to be delivered from Signal Android to Signal Desktop. Again, the running fiddle app continued to display a continuous sequence of ping-pong events. The desktop and Android log files are attached.
Andy Signal Android Debug Log 221012.zip Andy Signal Desktop Debug Log 221012.txt
@scottnonnenberg, I haven't heard from you in a week, after posting a number of failure cases and their associated log files. What's up?
I'd like to know whether or not this issue is still on anybody's radar. Since October 14, 2022, I have resumed restarting Signal Desktop after each resume in order for it to remain synchronized with Signal Android.
It is still on our radar, it was interrupted by other ongoing efforts, apologies for the delay
Thank you Jamie. I will stand by, ready to assist, and continue restarting Signal Desktop after each workstation resume.
Hello, I'm still seeing this behavior on Manjaro/xfce. I'm on Signal Desktop 6.2.0
@apiziali @MikeWallaceDev @timp34 If you could provide detailed computer specs, that would help us enter a bug on Electron. Our current theory is that we're not getting powerMonitor suspend/resume events as expected, which would allow us to better handle tasks started right before your computer goes to sleep. Today, they throw all sorts of errors because they take too long across your sleep/resume cycle.
Update: We have other theories. We have at least one log on a recent build with suspend/resume log lines. We're gonna try to reproduce it ourselves. It would help us if you track it down - for example, does it only happen if you sleep for over 20 minutes, or does it happen if you manually sleep and come back in 5 minutes?
@scottnonnenberg-signal
Here is the log when I come back to my desk and the app isn't working correctly. https://debuglogs.org/desktop/6.2.0/e53cda133d148923bc382e0fcd0f66f6fe682af4f28ada644d335c1a6aecb3e5.gz
Here is the log after I restarted the app, and the messages are all visible. https://debuglogs.org/desktop/6.2.0/f33f10f78c31d95eda7fd1fc59880eb07d4d4f459f78003c7037311292e84c71.gz
I don't know about the 5 minute thing. Sorry.
@MikeWallaceDev It looks like your computer was asleep for a very short time here - do you remember what was going on there?
2023-01-28T04:03:12.904Z
is Fri Jan 27 2023 20:03:12 GMT-0800 (Pacific Standard Time)
.
// A few bigger breaks between log lines
INFO 2023-01-28T04:03:12.904Z checkExpiringMessages: found no messages to expire
INFO 2023-01-28T04:03:31.865Z WebSocketResources: Sending a keepalive message
INFO 2023-01-28T04:03:42.127Z WebSocketResources: Sending a keepalive message
INFO 2023-01-28T04:03:57.529Z MessageReceiver: got request PUT /api/v1/message
...
// Then about two minutes later we have tasks blocked for just under two minutes
WARN 2023-01-28T04:05:14.660Z TaskWithTimeout: SQL channel call (getNextAttachmentDownloadJobs) has been running for 118956ms
WARN 2023-01-28T04:05:14.661Z TaskWithTimeout: SQL channel call (commitDecryptResult) has been running for 77050ms
INFO 2023-01-28T04:05:27.643Z WebSocketResources: Sending a keepalive message
Actually, it's been asleep since yesterday. (Or at least, I didn't purposely wake it. Maybe I bumped in to the desk) But as far as I know : asleep since last night.
The first thing I did when I sat down was take care of Signal, and reply to your message.
Scott, you asked for detailed computer specs. Attached is the output of hwinfo (16,428 lines). apiziali_hwinfo.txt
Scott,
I attached a full hwinfo(8) output to this issue (#6103) on Github.
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= On 2/1/23 10:32 AM, Scott Nonnenberg wrote:
@apiziali @MikeWallaceDev @timp34 If you could provide detailed computer specs, that would help us enter a bug on Electron. Our current theory is that we're not getting powerMonitor suspend/resume events https://www.electronjs.org/docs/latest/api/power-monitor as expected, which would allow us to better handle tasks started right before your computer goes to sleep. Today, they throw all sorts of errors because they take too long across your sleep/resume cycle.
-- Andy Piziali @.***>
This happens even on abbreviated suspends on my workstation.
Just chiming in to report that I'm observing the same issue (i.e., this is not a singularity). Debian Bookworm, xfce, signal-desktop. Happy to provide whatever logs you tell me to gather.
While it has been mentioned in conversation above, once signal-desktop is in this "state" it also cannot send messages anymore -- very reproducibly. So maybe there is some interesting log-gathering that can be done? As always ... Happy to provide whatever logs you tell me to gather.
@megele What we're looking for is a consistent repro so we can get it to happen, or the Electron team can get it to happen. When does it happen for you and when doesn't it?
Sorry for the very late reply. I appreciate your help in figuring this out! signal-desktop remains functional if the machine only briefly goes to sleep (e.g., if I just close the laptop lid, wait for the led-indicator to blink slowly, saying that it's asleep, and open it up after a few seconds again). However, whenever I step away from the machine for a longer period of time, the machine goes to sleep and when I wake it back up, the described issue occurs. TBH, I'm not sure what the difference is between these two situations.
In terms of "consistent repro", it is consistent in the sense that provided sufficient "sleep time" due to inactivity signal-desktop will always end up in the broken state. Unfortunately, I didn't figure out how to trigger the misbehavior on the push of a button. Happy to try suggested courses of action though.
Another strange observation I made, when the "synchronization issue" happens, terminating signal-desktop via File->Quit does not terminate any of the many processes making up the client. I have to kill them explicitly via kill to terminate them and then be able to restart signal-desktop and sync appropriately.
The recently closed bug report "[Linux: Signal doesn't reconnect after sleep or lock screen] #6027" looks quite similar to the issue I am seeing.
Bug Description
When I suspend and subsequently resume my Ubuntu 18.04 PC, the message queue of Signal Desktop no longer remains synchronized with my Signal Android. Every time I resume the PC, I need to restart Signal to see the latest messages already delivered to the mobile phone.
One other environment consideration is that this is a multi-user PC, with each of the two users logged in and running Signal. However, I believe I have seen the synchronization failure with only one active user session.
Steps to Reproduce
Actual Result:
Chats that were delivered to the mobile app are not delivered to the desktop application, as described in "Steps to Reproduce."
Expected Result:
When the workstation is resumed, Signal Desktop ought to reflect the chats that were delivered while the workstation was suspended.
Screenshots
Platform Info
Signal Version:
5.56.0 ... Production.
Operating System:
Ubuntu 18.04.6 LTS (uname -a : "Linux elijah 5.4.0-124-generic #140~18.04.1-Ubuntu SMP Fri Aug 5 11:43:34 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux")
Linked Device Version:
Samsung Galaxy A21, Android 11, One UI Core 3.1.
Link to Debug Log
https://debuglogs.org/desktop/5.56.0/33574877e6a4c8cdbd74fec3f7f09269a6c0281d6bb5a25e0a0db48626841da5.gz https://debuglogs.org/android/5.47.3/83a1078f72b089d5d0f5996e170a937b768ab035e7ab9456d2141516ce61d8a4
debuglog.txt