Closed Kabouik closed 2 years ago
I'm observing the same behavior on Fedora 29. Attached a screenshot as well.
Desktop client version is 4.2.1 Happens across all server versions.
Ticket for reference: https://mattermost.atlassian.net/browse/MM-16626.
This is still happening with 4.2.1 on Windows and 4.2.3 on Linux.
Confirmed, Linux 4.2.3 constantly hogs ~4% of CPU and making laptop fans spin, which means something incorrect inside Mattermost is running in a constant polling loop of some kind.
Feels like a regression from 4.2.0, because I'm monitoring for this type of stuff pretty actively, and don't think I noticed that earlier.
Slack v4 client seems to have made massive improvement strides with this, as it now seems to be nearly idle when sitting in the background.
Would love to get that kind of performance out of Mattermost client as well.
I'm wondering if this is not an issue at all with the desktop application but rather with the webapp, which should be very encouraging.
I cloned the master branch of desktop
and observed upon launch that it will idle nicely at 0.0
CPU usage. I add server web URL: https://chat.redox-os.org
. Only upon logging in to server do I see a jump in CPU usage, idling between 6% to 8%.
By comparison, I open a single tab in Chrome, idling at 0.0
CPU usage. I navigate to https://chat.redox-os.org
. Upon logging in I see a jump in Google Chrome process, idling between 6% to 8%.
I'm testing on macOS Mojave 10.14.6. I intend to also test on Windows and Linux, when I get a chance...
Next step is to dive into webapp codebase. In particular I want to measure websocket packets to look at the size and what kind of IO work is being performed on reception of packet.
@hunterlester I'm currently running 5.13.1 webapp in Firefox 69 as daily driver, and it does not show similar CPU hogging symptoms. But I'll continue to keep an eye on this.
@lkraav For the firefox
process what is it's rough idling CPU usage when signed into server?
I also see around 6% CPU usage in webapp on Firefox once logged into server, however, interestingly the one difference is that when the Firefox window is hidden, CPU usage does return to 0.0
, unlike Chrome.
Thanks for digging in that direction. I also failed to reproduce it with the webapp, but as mentioned in the first post, I did not thoroughly test the webapp. I edited the first message to reflect this.
Your observation of a 6% CPU use seems quite good though compared to the more than 60% (or even more than 80% sometimes¹) in my screenshot (on an i5 7200U CPU), albeit already high for a chat application.
¹ : It is also interesting to note that this is quite variable, from low CPU usage to huge spikes, or even long periods of high CPU usage, all without noticeable differences in the content of the channels.
Quick update, I've just tested in the same manner on Ubuntu 18.04.3.
@Kabouik What is your CPU usage before logging into server and then after logging into server? So far I observe it consistently increases upon logging in and accessing webapp, whether through Electron or Firefox, or Chrome.
I'll be testing on WIndows as well, hopefully tonight...
Just tested on Windows and observing same behavior as my tests on Linux and macOS. Of course I'm missing something. Still think it's a server/webapp issue and not specific to this desktop app.
I happen to be logging into https://chat.redox-os.org
, which is not a very active server.
My next tests will be while connected to the Mattermost community server.
To help narrow down this issue I'll be logging web socket events as well as packet payloads in the webapp. I also want to find out if there's any additional data being sent over the server-to-client heartbeat.
Confirmed, my Linux client is definitely a CPU hogger
Nothing visibly exciting going on in the active chat window, just some text and static images pasted.
Update: Not enough web socket events to account for the consistent CPU usage.
Next is tracing sys calls. I'll need help understanding as this is new ground for me.
System rebooted with csrutil disabled
.
Running dtruss -s -ap <Electron PID>
for 60 seconds.
CALL COUNT
__disable_threadsignal 2
bsdthread_create 2
bsdthread_terminate 2
getrlimit 2
proc_info 2
thread_selfid 2
mprotect 7
madvise 38
psynch_mutexdrop 105
psynch_mutexwait 105
ulock_wait 158
ulock_wake 158
psynch_cvsignal 2439
psynch_cvwait 2441
read 3313
recvmsg 7701
write 15842
kevent 33923
Will read about kevent
, to find out what effect it has on CPU.
The write
events I would think are significant.
Simultaneously running iosnoop -p <Electron ID>
but did not see IO activity corresponding to the number of read and write sys calls.
I only observed 11 IO events performed by Electron.
I wonder if it’s because the IO events are coming from the web app, which is a sandboxed environment and must not directly write or read from files but indirectly through memory.
Example write
thread stack:
1768/0x212e: 8631296 28 26 write(0x19, "\030\0", 0x28C8) = 8192 0
libsystem_kernel.dylib`write+0xa
Electron Framework`0x0000000104830a00+0x40f
Electron Framework`0x0000000104830450+0xc7
Electron Framework`0x0000000104830220+0x2c
Electron Framework`0x0000000105aecd20+0xd6
Electron Framework`0x0000000105af7220+0x4ac
Electron Framework`0x0000000105aecf50+0x7f
Electron Framework`0x0000000105a6b550+0x1a8
Electron Framework`0x0000000105ab0640+0x1aa
Electron Framework`0x0000000105aed410+0x57
libsystem_pthread.dylib`_pthread_body+0x7e
libsystem_pthread.dylib`_pthread_start+0x42
libsystem_pthread.dylib`thread_start+0xd
By comparison, a single tab open in Firefox, accessing same Mattermost server, running dtruss -s -ap <Firfox PID>
, also for 60 seconds.
Note: CPU usage between Firefox process and Electron process was comparable.
CALL COUNT
fsync 1
getegid 1
geteuid 1
getgid 1
getsockname 1
getsockopt 1
kqueue 1
mkdir 1
unlink 1
ftruncate 2
rename 2
sendmsg_nocancel 2
socketpair 2
sysctl 2
__disable_threadsignal 3
bsdthread_create 3
bsdthread_terminate 3
getpeername 4
statfs64 4
close_nocancel 5
lseek 6
shm_open 6
connect 7
fstat64 7
socket 7
munmap 8
psynch_cvclrprepost 8
sendto_nocancel 8
access 10
proc_info 10
select_nocancel 10
getattrlist 11
getuid 11
kevent_qos 12
necp_client_action 12
thread_selfid 12
gettimeofday 13
poll 13
madvise 16
recvfrom_nocancel 22
open 27
pread 28
setsockopt 30
mprotect 32
close 42
sendto 42
select 47
psynch_mutexdrop 51
psynch_mutexwait 52
fcntl 65
stat64 67
mmap 71
recvfrom 150
kevent_id 1289
workq_kernreturn 1313
recvmsg 2469
psynch_cvsignal 3868
sendmsg 4926
read 4948
write 4956
psynch_cvbroad 6074
kevent 6155
psynch_cvwait 19797
Saw many errored kevent
logs due to invalid address.
Next steps are to research sys calls to understand effect on CPU.
Which files are being written to and what is being written.
iosnoop
doesn't give me insight, so need to dig into Dtrace toolbox
Next step is to look up those Electron Framework'0x...
addresses, with the goal of determining the functions which are being called, resulting in continuous write
and kevent
syscalls, which appear to be correlated with consistent CPU usage.
Logging thread stacks using dtrace -p <Electron Helper PID> -n 'syscall::write:entry {ustack()}'. Also looking at
syscall::kevent:entry`.
Note: It's not my intention to spam this thread. I'm documenting the process so that more experienced engineers can point out gaps in my knowledge and secondly for others to be able to learn if we're able to gain insight.
Thinking that ustack
isn't able to resolve Electron Framework
function symbol names as those may be defined and allocated at runtime. Given that Electron(Chromium) is based on V8, perhaps those functions don't exist at the time of process creation, so they won't be a part of the process's symbol table. So those addresses may be referencing runtime, heap allocated functions.
Think it's confirmed when taking core dump of the process and bt all
in lldb
:
frame #0: 0x00007fff68bed78e libsystem_kernel.dylib`kevent + 10
frame mattermost/desktop#1: 0x000000010e7fa969 Electron Framework`___lldb_unnamed_symbol124545$$Electron Framework + 89
frame mattermost/desktop#2: 0x000000010e7f845d Electron Framework`___lldb_unnamed_symbol124506$$Electron Framework + 573
frame mattermost/desktop#3: 0x000000010e7ee117 Electron Framework`___lldb_unnamed_symbol124239$$Electron Framework + 455
frame mattermost/desktop#4: 0x000000010e76c6f8 Electron Framework`___lldb_unnamed_symbol122108$$Electron Framework + 424
frame mattermost/desktop#5: 0x000000010e7b17ea Electron Framework`___lldb_unnamed_symbol123124$$Electron Framework + 426
frame mattermost/desktop#6: 0x000000010e7ee467 Electron Framework`___lldb_unnamed_symbol124250$$Electron Framework + 87
frame mattermost/desktop#7: 0x00007fff68ca62eb libsystem_pthread.dylib`_pthread_body + 126
frame mattermost/desktop#8: 0x00007fff68ca9249 libsystem_pthread.dylib`_pthread_start + 66
frame mattermost/desktop#9: 0x00007fff68ca540d libsystem_pthread.dylib`thread_start + 13
Or using lldb
, nothing returns when running image lookup --address <hex memory address>
.
How do I lookup symbol names for dynamically allocated functions? Thinking that I need to obtain a V8 heap dump then figure out how to resolve the core dump memory addresses.
UPDATE
Those continuous kevent
and write
syscalls are not contributing to sustained high CPU usage. They only account for 0.5% of CPU usage and are called at the same rate, whether or not logged into chat server.
Back to the beginning to look for more clues.
Hi, I have the same issue : the desktop app with 3 servers eats 15% of my i7 CPU doing nothing. It happens ONLY when the main window is opened and decreases to 1% when closed but with the notification icon in the taskbar (app running). As I have no change enabling GPU acceleration or not, I just suspect a pb / bug with the display, maybe related to Electron.
I've just been using the MM web client in Firefox for the past few months to work around this issue.
I'm running on a Pi3 and in normal use is absolutely fine. However - as soon as I attempt to upload a jpg image > 1mb all 4 cores peg at 100% and the interface shows "processing". Tried by web and windows client so far. Smaller images are fine.
ok. Just ran a 930k image in and it took about 3 minutes to finish processing but it did eventually. Unsure what it's actually doing when "processing" but it's clearly resource intensive. Assuming it's memory as that is c. 1G on this machine.
Using MM Desktop 4.3.2 on Linux (Ubuntu 19.10) is constantly using significant load. Before realising this I replaced my laptop battery, now I know!
I'm looking forward to testing the freshly released 4.4.0-rc0 for this. Still using webclient on Firefox for now.
Well, it's an improvement, I think, but I'm still going back to the browser client.
4.4.0-rc1 still continously hogs 6% CPU on Linux. Kills battery life, so AC-only.
@lkraav @artfulrobot Hi all, I'm not certain but I may have found one of the performance issues, however I can't rely on my own measurements. Really curious to know if you see any difference in these artifacts: https://circleci.com/gh/mattermost/desktop/5442#artifacts/containers/0
Base on this PR, in case you want to verify what you're installing.
Something else I want to confirm is what percentage of CPU usage the web app accounts for. What is the extra CPU usage in your Chrome/Firefox/etc process(es) when running MM?
That's an important comparison that's consistently missing in our measurements.
Really curious to know if you see any difference in these artifacts: https://circleci.com/gh/mattermost/desktop/5442#artifacts/containers/0
Unfortunately there doesn't seem to be a difference.
Something else I want to confirm is what percentage of CPU usage the web app accounts for. What is the extra CPU usage in your Chrome/Firefox/etc process(es) when running MM?
My observation has always been that webapp doesn't suffer from this cycle churn. Firefox has about:performance
monitor available, MM tab most of the time sits completely idle. Perhaps there are more in-depth in-browser cycle monitoring tools available, not sure what they are, for now.
@lkraav Thank you for taking the time. I really appreciate it. I'm also noticing next to no CPU usage in Electron dev console CPU profiler.
So, my next line of investigation is through V8 profiler.
Passing --js-flags="--prof"
when executing electron to generate isolate CPU tick log files.
Then processing with node --prof-process
.
Hopefully leads to good clues:
ticks total nonlib name
23 0.1% 1.2% JavaScript
1828 5.1% 91.6% C++
65 0.2% 3.3% GC
33626 94.4% Shared libraries
144 0.4% Unaccounted
where most of CPU time is taken by /usr/lib/system/libsystem_kernel.dylib
, according to bottom up profile.
I thought this was a dead end before when I used dtrace to observe constant write and kevent sys calls, which didn't appear to be utilizing any significant amount of CPU.
I could use more expertise here.
Leaving a trail so that a bunch of us can investigate.
Thanks for the instructions. There are many isolate log files. How can I tell which might be relevant?
Ah yes, I'm passing --no-logfile-per-isolate
as well as --prof
to generate single file.
I'm not certain but it seems like the number of isolates isn't insightful as far as Electron processes are concerned.
Doc on isolate class: https://v8docs.nodesource.com/node-0.8/d5/dda/classv8_1_1_isolate.html
Without that flag I was receiving 4 isolate log files and they each had similar profile summary.
Here's my single processed output: https://gist.github.com/hunterlester/0c387bccafa02ca1229be34bec24703e
We noticed that CPU usage is often caused by animated gifs placed in chats.
We noticed that CPU usage is often caused by animated gifs placed in chats.
Correct. This issue is working on a more general problem.
We noticed that CPU usage is often caused by animated gifs placed in chats.
Not only. mattermost/mattermost#24689 is dedicated to CPU use caused by animated gifs, but I posted there results showing that there are CPU spikes even with channels containing no gifs, sometimes actually higher than channels with gifs being displayed in the same teams. This was one of the main reasons why I opened this issue: there are additional causes for (huge) CPU spikes that haven't been identified yet.
This was one of the main reasons why I opened this issue: there are additional causes for (huge) CPU spikes that haven't been identified yet.
CPU spikes vs low but constant CPU churn are also two different issues :thinking:
Title "CPU use" is rather generic for now..
CPU spikes vs low but constant CPU churn are also two different issues
You are correct, bad wording in my previous message. I actually experience both CPU spikes and unusually high base CPU use (and others too, right?).
I also experience constant 13-15% CPU usage by "Mattermost Helper (Renderer)" on macOS (MBP 2019) with the desktop client (4.4.0.5682). An additional 2% is constantly used by "Mattermost Helper (GPU)" while idle, even when the window is closed but the client is still running the CPU usage stays the same, which is quite counter-intuitive.
Deactivating GPU rendering reduces CPU usage by the Renderer process by about 4%.
Interestingly, the web client on Firefox (74.0, 64-bit) makes Firefox's CPU usage jump up by about 20% while the tab is opened. Switching to another tab returns the CPU usage to the average 5-7%, which is normal in my machine.
Mattermost server version: 5.20.1 Team Edition
Hey folks, thanks for all your reports. Another very important info for all reports here is the Mattermost server version. If you don't know how to get it, it's shown in the About Mattermost tab in your account settings (Hamburger Menu next to your username)
A lot of the performance impacts obviously come from the webapp rendering, so having the server version helps us know where to look for. Sometimes there's already been some improvements in newer version so it's also worth looking into upgrading to a newer version (latest ESR/LTS is 5.19 and latest stable quality release is 5.21)
I got very bad CPU use with the client lately so I tried using Mattermost in a browser instead (not my main browser because I want a separate window). I tried Epiphany (webkit). I still get 50% CPU use on an i5 7th generation CPU, so it's not tied to the Electron application.
Server is 5.7.0, I admit I'm a bit late on that. I have no time to backup and upgrade right now but I will report back once it is done in case there is a significant change in CPU use.
Server is 5.7.0, I admit I'm a bit late on that. I have no time to backup and upgrade right now but I will report back once it is done in case there is a significant change in CPU use.
I have upgrade 5.16 -> 5.21 and there's no difference in desktop client CPU usage.
BUT I did discover something else today: CPU churn apparently radically depends on the size of the channel history
I opened htop
(userland threads enabled), filtered for mattermost
process names, about 6-7 processes remained on screen
CPU churn all but vanished if I switch to a new / empty or a channel with very few messages (20-30+ total)
CPU churn resumed instantly when I switched to regular work traffic level channels, history of hundreds / thousands messages - it could also be something with images, threads, or whatever high volume channels have more of
That is something I investigated some months ago but I cannot confirm, on my end Mattermost can use 60% CPU even when showing a private channel with less than 10 messages in history.
@lkraav Great finding! I see the exact same thing -- a fresh, empty, channel reduces CPU usage to a perfectly acceptable minimum (<1% CPU in my case).
This has been observed on the browser front-end as well. Moderate CPU usage for chats with long history. Very little CPU usage on chats without history. Reproduced using Chrome.
My main Windows user is reporting that the issue arises mainly when images are posted in a channel, and just a few images are enough to get to 30-50% of CPU use. He does not know whether image URLs previewed in the channels and images uploaded to Mattermost directly cause the issue equally, we haven't checked.
I don't think the link is that obvious in Linux: as stated previously, I often observed even higher CPU use in channels with no visible images (do other channels not being displayed and past history use CPU too?). However, it's worth investigating in that direction because he is positive about the relationship, since he has to kill the Mattermost process every time I post 3 or 4 images.
Of note, there are no CPU issues whatsoever on mobile applications. Android works exactly the same whether images are being previewed or not. I use a third-party SailfishOS application, and no CPU issue either.
Getting the same issue on both Windows (7 pro 64, 7 ultimate 64, 10 home) and Linux (Ubuntu 18, 19, and 20). Mattermost desktop app ( or -snap app) causing 40-70% CPU load when idle, when hidden, when no messages are coming in in any of the channels. Slightly more load during actual activity. Edit: server version 5.9.0 for the one I just checked. Edit2: no animated gifs in any of the channels. still >50% CPU load.
CPU spikes vs low but constant CPU churn are also two different issues thinking or high constant CPU churn.
My main Windows user is reporting that the issue arises mainly when images are posted in a channel, and just a few images are enough to get to 30-50% of CPU use.
Note that he added that disabling URL preview and extended image preview in Account settings
> Display
does not help.
@lkraav Great finding! I see the exact same thing -- a fresh, empty, channel reduces CPU usage to a perfectly acceptable minimum (<1% CPU in my case).
Same here, just switching to an empty channel makes Mattermost disappear from top
. Mattermost desktop v4.5.0.6629 on macOS 10.15.5.
Hi, I just noticed that the corresponding jira ticket https://mattermost.atlassian.net/browse/MM-16626 was closed. I know this is difficult to trace down and debug, but shouldn't the issue be kept open?
Thanks, re-opened,
It's funny, but I think I found the problem of idle cpu using. It's the CSS animation of the ".loading-screen .loading__content .round" elements. It's a spinner animation with 3 circles at the top of a channel (used for the chat infinity scroll). It is an infinity animation at an invisible area, but browser render it all the time. It's easy to check. After a loading any chat with a big list of messages we see 4-5% of CPU usage for the Render process. But if you remove the element(display:none) or just disable the animation of the .round
elements the CPU of the process drops to 0.1-0.3% (be careful - a Developer console can cause CPU usage that can spoil the test). So the best it's to hide the element if we don't need it. I've made small few experiments, but it seems that the animation or the element must be removed (or hidded by display:none) to solve the problem.
I can confirm that setting display: none
on the element matching the selector div.loading-screen
greatly helps with the CPU usage of the Mattermost 4.5.0.6629 client on macOS.
@Regul777 you deserve an award of some kind :rocket: :partying_face:
I can confirm @frerich experience: CPU use instantly drops to 0.
I'm not sure when 4.6.0 is planned to be released, but if this hotfix was patched up into 4.5.~2~3 asap, I think nature would sincerely appreciate it.
I've created a jira bug after some investigations. I can confirm this behaviour both in desktop and firefox.
Please, be aware that when this change is merged it will require a server upgrade, not a desktop upgrade as it is not related to desktop source code.
Also, once it is merged I'll keep the issue open, as I'm sure there will be other factors taking up cpu usage
I confirm (by marking "x" in the [ ] below: [x]):
Summary The CPU use of the Mattermost desktop client seems way higher than it should on Linux and Windows (also reported on Mac in mattermost/mattermost#24689, and Windows and Linux in the said issue by myself with screenshots). Another example:
The screenshot was taken just after starting Mattermost, but the same statistics can persist hours later.
I have been having this issue with the Mattermost Linux client version 4.1.2 as well as previous versions in the past, and so did one of my users who uses the Windows client. I have posted in mattermost/mattermost#24689 to give details on our observations on both platforms.
However, issue mattermost/mattermost#24689 seems to be specific to Mac and has been related to animated gifs on that case, so it may be a different problem. There are several other issues on CPU use here, but most have been closed due to inactivity rather than real solutions. Therefore, I believe it is better to open another issue to show that the problem happens on other platforms than Mac, regardless of animated gifs and HTML previews.
Environment
Steps to reproduce Hard to tell: Mattermost does not always show high CPU use like on the screenshot above, but it is fairly common (several times a week) and will last even after restarting the application. I failed to identify anything in the conversations that correlates with the CPU use. Not sure what stops it either. I initially thought it would be animated gifs or HTML previews after reading mattermost/mattermost#24689, but I observed that there was no relation.
Expected behavior Lower CPU use.
Observed behavior Sometimes, all the CPU resources available are used.