mzur / gnome-shell-wsmatrix

GNOME shell extension to arrange workspaces in a two-dimensional grid with workspace thumbnails
GNU General Public License v3.0

Performance regression over time on Ubuntu 21.10/GNOME 40 ? #190

Closed lissyx closed 2 years ago

lissyx commented 2 years ago

I have been a happy user of your extension for a long time now. I upgraded my laptop approximately two weeks before the release date of Ubuntu 21.10, and while things were smooth on 21.04, I have been suffering from a very painful issue since. After a while (1-2 days?), my laptop (ThinkPad P14s Gen2 AMD, 32GB RAM) starts to become sluggish.

Over time the issue becomes more and more visible, to the point that clicking or typing on the keyboard is problematic, with missed keystrokes as well as keys repeating.

I have tried playing with many options within the extension's parameters, but nothing would help. At this point, keeping all my extensions active except this one is the only way for me to get back a working machine.

I am not doing anything fancy like suspend. I have 2 screens connected over DisplayPort, each 2560x1440; the main laptop screen is disabled, and the GNOME power settings are set so that both monitors go blank after 5 minutes.

I tried a perf record on gnome-shell when the issue was reproduced, but could not find anything actionable from the perf data file.

The GNOME session is using Wayland via the "Ubuntu" session, but the vanilla "GNOME" session over Wayland reproduces the issue as well.

mzur commented 2 years ago

So you say this issue disappears if you disable this extension? What about the RAM usage when the problems start?

lissyx commented 2 years ago

So you say this issue disappears if you disable this extension? What about the RAM usage when the problems start?

Yes, it appears that the problem is no longer present as soon as I disable it. I'm unsure about the RAM usage, but I was definitely not hitting swap, and I think the amount of RAM consumed by gnome-shell when the issue reproduced was close to the amount consumed in a normal run (right now, ~1.4%, so 458MB).

mzur commented 2 years ago

Ok, thanks! And you have also tested the extension with all additional features (e.g. thumbnails, grid in overview) disabled for multiple days?

lissyx commented 2 years ago

I did disable everything, except maybe "grid in overview" which seemed to do nothing whether it was enabled or not?

lissyx commented 2 years ago

Also, it is difficult for me to switch between enabling and disabling the extension, because it requires restarting the whole session; I guess it is because of #179.

mzur commented 2 years ago

Alright, thanks. Maybe we can find something based on the messages in the error log. Any additional information about this issue will be helpful, too.

lissyx commented 2 years ago

Alright, thanks. Maybe we can find something based on the messages in the error log. Any additional information about this issue will be helpful, too.

Unfortunately, the logs were really unhelpful, and even the little information I shared earlier might be unrelated to this extension.

lissyx commented 2 years ago

After a full week with only this extension disabled, I can confirm I don't hit the problem anymore. I am going to re-enable it with grid in overview disabled as well and see. @mzur Besides CPU and RAM, what should I look for to gather actionable feedback?

lissyx commented 2 years ago

Switching desktop with CTRL+ALT+(Left|Right|Up|Down) triggers this in the logs:

    Nov 2 10:41:48 portable-alex gnome-shell[2646]: JS ERROR: Error: Expected an object of type ClutterActor for argument 'sibling' but got type undefined#012_syncStacking@resource:///org/gnome/shell/ui/workspaceAnimation.js:80:18
    Nov 2 10:41:48 portable-alex gnome-shell[2646]: JS ERROR: Error: Expected an object of type ClutterActor for argument 'sibling' but got type undefined#012_syncStacking@resource:///org/gnome/shell/ui/workspaceAnimation.js:80:18
    Nov 2 10:41:49 portable-alex gnome-shell[2646]: Source ID 6130208 was not found when attempting to remove it
    Nov 2 10:41:50 portable-alex gnome-shell[2646]: JS ERROR: Error: Expected an object of type ClutterActor for argument 'sibling' but got type undefined#012_syncStacking@resource:///org/gnome/shell/ui/workspaceAnimation.js:80:18
    Nov 2 10:41:50 portable-alex gnome-shell[2646]: JS ERROR: Error: Expected an object of type ClutterActor for argument 'sibling' but got type undefined#012_syncStacking@resource:///org/gnome/shell/ui/workspaceAnimation.js:80:18
    Nov 2 10:41:51 portable-alex gnome-shell[2646]: Source ID 6130455 was not found when attempting to remove it

lissyx commented 2 years ago

Screenshot from 2021-11-02 17-03-34

After ~six hours, CPU usage is at a steady 15% with peaks at 25%. Memory usage grew from ~1.4% when I posted this morning to 1.8%, i.e., from ~458MB to ~589MB.

lissyx commented 2 years ago

Captured perf on the gnome-shell process while displaying htop and moving one workspace to the left, then coming back, then moving down one workspace, then coming back.

$ perf report | head -n 20
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 7K of event 'cycles'
# Event count (approx.): 5456341682
#
# Overhead  Command          Shared Object                    Symbol
# ........  ...............  ...............................  ..............................................................................................
#
     8.05%  gnome-shell      libgjs.so.0.0.0                  [.] 0x00000000000507ec
     3.14%  gnome-shell      libwayland-server.so.0.1.0       [.] wl_display_flush_clients
     2.49%  gnome-shell      libglib-2.0.so.0.6800.4          [.] g_main_context_check
     2.47%  gnome-shell      libmutter-clutter-8.so.0.0.0     [.] clutter_actor_has_mapped_clones
     2.31%  gnome-shell      libmozjs-78.so.78.13.0           [.] 0x000000000054b8e1
     2.20%  gnome-shell      libmutter-clutter-8.so.0.0.0     [.] 0x00000000000507ce
     2.13%  gnome-shell      libmozjs-78.so.78.13.0           [.] 0x0000000000555eb4
     1.12%  gnome-shell      libglib-2.0.so.0.6800.4          [.] g_main_context_prepare
     1.11%  gnome-shell      libc.so.6                        [.] __memmove_avx_unaligned_erms

@mzur I can share the perf.data file if it can be useful to you.

lissyx commented 2 years ago

~24h after the change, my system is feeling sluggish. GNOME Shell is still consuming ~15-20% of CPU, but RAM has increased to 3.4%, i.e., 1114MB. There's definitely something leaking.
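The three memory readings reported in this thread (~458MB at the start, ~589MB about six hours later, ~1114MB about 24 hours after re-enabling the extension) can be turned into a rough growth rate. This is a minimal sketch, assuming the first reading counts as hour 0 and that the elapsed times are the approximate ones given in the comments; the `growthRates` helper is hypothetical and only serves the illustration.

```javascript
// Memory readings quoted in this thread. The hour offsets are approximations
// ("~six hours", "~24h"), and treating the first reading as hour 0 is an assumption.
const samples = [
    { hours: 0, megabytes: 458 },
    { hours: 6, megabytes: 589 },
    { hours: 24, megabytes: 1114 },
];

// Hypothetical helper: growth in MB per hour between consecutive readings.
function growthRates(readings) {
    const rates = [];
    for (let i = 1; i < readings.length; i++) {
        const deltaMb = readings[i].megabytes - readings[i - 1].megabytes;
        const deltaHours = readings[i].hours - readings[i - 1].hours;
        rates.push(deltaMb / deltaHours);
    }
    return rates;
}

const rates = growthRates(samples);
const overall = (samples[2].megabytes - samples[0].megabytes) /
    (samples[2].hours - samples[0].hours);

console.log(rates.map(r => r.toFixed(1))); // MB/h for each interval
console.log(overall.toFixed(1));           // MB/h over the whole period
```

With these inputs the growth is roughly 22 MB/h over the first six hours and roughly 29 MB/h over the following eighteen, about 27 MB/h overall. Under the stated assumptions the growth does not level off, which fits a leak better than a cache that fills up and stabilizes.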

lissyx commented 2 years ago

$ perf report | head -n 20
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 6K of event 'cycles'
# Event count (approx.): 5532111803
#
# Overhead  Command          Shared Object                    Symbol
# ........  ...............  ...............................  ..............................................................................................
#
    12.41%  gnome-shell      libgjs.so.0.0.0                  [.] 0x00000000000507ec
     7.06%  gnome-shell      libmozjs-78.so.78.13.0           [.] 0x000000000054b8e1
     5.85%  gnome-shell      libmozjs-78.so.78.13.0           [.] 0x0000000000555eb4
     3.29%  gnome-shell      libmutter-clutter-8.so.0.0.0     [.] clutter_actor_has_mapped_clones
     2.24%  gnome-shell      libwayland-server.so.0.1.0       [.] wl_display_flush_clients
     1.85%  gnome-shell      libglib-2.0.so.0.6800.4          [.] g_main_context_check
     1.42%  gnome-shell      libgjs.so.0.0.0                  [.] 0x000000000004ddbf
     1.27%  gnome-shell      libgjs.so.0.0.0                  [.] 0x000000000005a5b8
     1.19%  gnome-shell      libmozjs-78.so.78.13.0           [.] 0x000000000054b8cc

ebeem commented 2 years ago

Surely there's something wrong, but I can't reproduce this myself. My gnome-shell has been running for days and its RAM usage is ~250MB (my RAM is 64GB). The CPU usage is ~0.20% when idle and ~1.20% during animations such as switching workspaces or displaying the overview. All of the features are enabled.

mzur commented 2 years ago

Thanks for all the info @lissyx! I think I observed the Source ID 774295 was not found when attempting to remove it errors somewhere, too. Any more info is greatly appreciated until someone can have a closer look. Also feel free to tinker a bit with the code yourself. Maybe disable some parts and see if the issue persists so we can narrow down the cause.

lissyx commented 2 years ago

Thanks for all the info @lissyx! I think I observed the Source ID 774295 was not found when attempting to remove it errors somewhere, too. Any more info is greatly appreciated until someone can have a closer look. Also feel free to tinker a bit with the code yourself. Maybe disable some parts and see if the issue persists so we can narrow down the cause.

Unfortunately, I don't really have time to dig into GNOME extensions and hack them :/

mzur commented 2 years ago

Welcome to the club 😄 it may take a while until someone else has the time to look into this.

lissyx commented 2 years ago

Welcome to the club 😄 it may take a while until someone else has the time to look into this.

If someone can share instructions on how to manually install a debug-enabled extension and add tracing / debugging to it (assuming, bug aside, it would still be usable for daily work), I'd be happy to install it and share more data.

mzur commented 2 years ago

You can find developing instructions in the readme. Basically you have to clone the repository to the right location and you can start hacking "live". The only thing I found useful for debugging is the log() function. This outputs stuff right into the log that you can open with journalctl -f /usr/bin/gnome-shell.
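As an illustration of the `log()` approach described above, here is a minimal sketch of a call tracer. The `traceMethod` helper, the `[wsmatrix-debug]` prefix and the `demo` object are hypothetical and not part of the extension; the only thing taken from the thread is that `log()` writes to the journal. The fallback to `console.log` is there so the sketch also runs outside GNOME Shell.

```javascript
// In GNOME Shell's GJS environment a global log() exists; fall back to
// console.log so the sketch also runs outside the shell.
const emit = typeof log === 'function' ? log : console.log;

// Hypothetical helper: wrap obj[methodName] so that every call is logged with
// a running call count. Returns a function that restores the original method
// and reports how many calls were seen.
function traceMethod(obj, methodName, label) {
    const original = obj[methodName];
    let calls = 0;
    obj[methodName] = function (...args) {
        calls++;
        emit(`[wsmatrix-debug] ${label}.${methodName} call #${calls}`);
        return original.apply(this, args);
    };
    return () => {
        obj[methodName] = original;
        return calls;
    };
}

// Stand-in object for demonstration; in the extension this would be whichever
// object is suspected of being called too often.
const demo = { switchWorkspace(direction) { return `moved ${direction}`; } };
const restore = traceMethod(demo, 'switchWorkspace', 'demo');
demo.switchWorkspace('left');
demo.switchWorkspace('right');
const totalCalls = restore();
```

Wrapping a method this way leaves its behavior unchanged, so the extension should stay usable while the call counts accumulate in the journal. A count that keeps climbing while the desktop is idle would point at the code path worth a closer look.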

ebeem commented 2 years ago

My PC keeps running for days and sometimes weeks with no restarts, I sometimes suspend it, but I don't think this is a reason why I am not seeing this performance issue. Do you always see the errors below when you switch workspaces?

    Nov 2 10:41:48 portable-alex gnome-shell[2646]: JS ERROR: Error: Expected an object of type ClutterActor for argument 'sibling' but got type undefined#012_syncStacking@resource:///org/gnome/shell/ui/workspaceAnimation.js:80:18
    Nov 2 10:41:48 portable-alex gnome-shell[2646]: JS ERROR: Error: Expected an object of type ClutterActor for argument 'sibling' but got type undefined#012_syncStacking@resource:///org/gnome/shell/ui/workspaceAnimation.js:80:18
    Nov 2 10:41:49 portable-alex gnome-shell[2646]: Source ID 6130208 was not found when attempting to remove it
    Nov 2 10:41:50 portable-alex gnome-shell[2646]: JS ERROR: Error: Expected an object of type ClutterActor for argument 'sibling' but got type undefined#012_syncStacking@resource:///org/gnome/shell/ui/workspaceAnimation.js:80:18
    Nov 2 10:41:50 portable-alex gnome-shell[2646]: JS ERROR: Error: Expected an object of type ClutterActor for argument 'sibling' but got type undefined#012_syncStacking@resource:///org/gnome/shell/ui/workspaceAnimation.js:80:18
    Nov 2 10:41:51 portable-alex gnome-shell[2646]: Source ID 6130455 was not found when attempting to remove it

It seems to me that this is a GNOME version issue. So if you always see this issue, we will just have to debug the extension under the same software versions you have. I will try to download Ubuntu 21.10 and check it out, but please let me know whether you always see this error in the logs or not.
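To answer whether an error shows up on every workspace switch, it can help to tally the log messages by type. The sketch below does that for the six lines quoted in this thread; `tallyMessages` is a hypothetical helper, and masking the numeric source IDs so that they group together is a choice made for this illustration.

```javascript
// Log lines copied from this thread (syslog format; "#012" is an escaped newline).
const jsError = "JS ERROR: Error: Expected an object of type ClutterActor for argument " +
    "'sibling' but got type undefined#012_syncStacking@resource:///org/gnome/shell/ui/workspaceAnimation.js:80:18";
const lines = [
    `Nov 2 10:41:48 portable-alex gnome-shell[2646]: ${jsError}`,
    `Nov 2 10:41:48 portable-alex gnome-shell[2646]: ${jsError}`,
    'Nov 2 10:41:49 portable-alex gnome-shell[2646]: Source ID 6130208 was not found when attempting to remove it',
    `Nov 2 10:41:50 portable-alex gnome-shell[2646]: ${jsError}`,
    `Nov 2 10:41:50 portable-alex gnome-shell[2646]: ${jsError}`,
    'Nov 2 10:41:51 portable-alex gnome-shell[2646]: Source ID 6130455 was not found when attempting to remove it',
];

// Hypothetical helper: strip the "<date> <host> gnome-shell[pid]: " prefix,
// keep only the first line of each message, and mask numeric source IDs so
// that identical kinds of message are counted together.
function tallyMessages(logLines) {
    const counts = new Map();
    for (const line of logLines) {
        const match = line.match(/gnome-shell\[\d+\]: (.*)$/);
        if (!match)
            continue;
        const message = match[1]
            .split('#012')[0]
            .replace(/Source ID \d+/, 'Source ID <n>');
        counts.set(message, (counts.get(message) ?? 0) + 1);
    }
    return counts;
}

for (const [message, count] of tallyMessages(lines))
    console.log(`${count}x ${message}`);
```

For the quoted excerpt this gives four `JS ERROR` entries and two `Source ID` entries. Run over a longer journal export, the same tally would show whether the two messages always appear together.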

lissyx commented 2 years ago

Unfortunately, to date, those were the only ones I could see in any log. One should note that I do see some occurrences of JS ERROR: Error: Expected an object of type ClutterActor for argument 'sibling' but got type undefined#012_syncStacking@resource:///org/gnome/shell/ui/workspaceAnimation.js:80:18 even without the extension, but:

I'd be happy to try and test patches for more debugging if it can help rule out an issue on the extension side and rather point to something more upstream.

ebeem commented 2 years ago

I looked more into this and it seems like an upstream issue as you stated, so it will probably happen even without the extension. I tried overriding the function _syncStacking:

    _syncStacking() {
        // Window actors that should be shown in this workspace group.
        const windowActors = global.get_window_actors().filter(w =>
            this._shouldShowWindow(w.meta_window));

        let lastRecord;

        for (const windowActor of windowActors) {
            const record = this._windowRecords.find(r => r.windowActor === windowActor);

            // Skip actors that have no record. Checking lastRecord here as
            // well would prevent the first clone from ever being stacked.
            if (record) {
                // Fall back to null when there is no background actor, so
                // that 'sibling' is never undefined.
                this.set_child_above_sibling(record.clone,
                    lastRecord ? lastRecord.clone : this._background ?? null);
                lastRecord = record;
            }
        }
    },

I am creating a new branch, issue_190, to test what could go wrong when overriding this function, as it seems to have side effects after waking up from sleep. If you would like to help with the testing, please check out this branch and let me know whether it improves the performance in your case, whether it fixes the JS error, and whether it has any side effects.

ebeem commented 2 years ago

source id errors were fixed in master, please check and let us know if you still face this performance regression.

lissyx commented 2 years ago

source id errors were fixed in master, please check and let us know if you still face this performance regression.

Is it the one that is currently available from extensions.gnome.org? I've updated to it this morning, but after a few hours, I feel like the system is again exhibiting the issue.

Version in metadata.json on my local system is 33.

(Sorry, because of the current status of the pandemic, I have not been able to investigate this problem further.)

mzur commented 2 years ago

Is it the one that is currently available from extensions.gnome.org?

Yes, the most current version there includes the source id fix.

mzur commented 2 years ago

@lissyx If you happen to upgrade to GNOME 42, please check if this issue still occurs.

lissyx commented 2 years ago

Thanks, I upgraded to Ubuntu 22.04 around early March and I was looking forward to giving this extension another spin, but I saw you were waiting for the final release first. I'll keep you updated.

lissyx commented 2 years ago

Reinstalled a few minutes ago, enabled with those settings:

lissyx commented 2 years ago

More than 24h later, no noticeable performance issue. I'll keep the issue open a few more days and let you know if it ever happens. If by the end of the week it's still good, I think we could assume it was fixed.

lissyx commented 2 years ago

Unfortunately, I had a few reboots in the meantime (because of unrelated WWAN firmware crashes), but I think the best way to consider that this bug is likely fixed is that I forgot about it over the past days.

RESOLVED:WORKSFORME

lissyx commented 2 years ago

Six days later, absolutely no symptom, so I can safely confirm it's fixed.

lissyx commented 1 year ago

It seems to have reappeared with newer versions of GNOME on Ubuntu 22.04. It might have been there since the beginning of the release, but my laptop was facing other stability issues that kept sessions from living long enough. This morning I rebooted after more than 10 days on my gnome-shell session with no issue; that was with this extension disabled. At that moment, gnome-shell was consuming ~1.5% of RAM according to htop.

As I am writing this comment, it's already at 2.4% of RAM consumed after I enabled the extension this morning. Screenshot from 2022-08-19 15-38-22
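The htop percentages can be converted to absolute figures the same way as earlier in the thread. This is a minimal sketch, assuming the percentage is relative to the laptop's 32GB of RAM taken as 32 * 1024 MB (the memory the kernel actually reports is usually a little lower); `percentToMb` is a hypothetical helper.

```javascript
// Assumption: htop's MEM% is relative to the laptop's 32GB, taken as 32 * 1024 MB.
const TOTAL_MB = 32 * 1024;

// Hypothetical helper: convert a memory percentage into megabytes.
function percentToMb(percent, totalMb = TOTAL_MB) {
    return (percent / 100) * totalMb;
}

const before = percentToMb(1.5); // extension disabled, after 10+ days
const after = percentToMb(2.4);  // extension enabled since the morning

console.log(Math.round(before), Math.round(after), Math.round(after - before));
```

Under that assumption the two readings correspond to roughly 490MB and 790MB, i.e. close to 300MB of growth within a single day with the extension enabled.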