mzur / gnome-shell-wsmatrix

GNOME shell extension to arrange workspaces in a two-dimensional grid with workspace thumbnails
GNU General Public License v3.0

Don't leak WorkspaceThumbnail objects #291

Closed pobrn closed 2 weeks ago

pobrn commented 2 weeks ago

From the main commit:

WorkspaceThumbnail objects must be explicitly destroy()ed, otherwise
they will not be garbage collected. Furthermore, each such object
subscribes to the "window-{left,entered}-monitor" signals of the
particular MetaDisplay. Thus leaking these thumbnails causes those
signals to have a very large number of subscribers after the popup
has been shown a sufficient number of times. This shows up in profiling,
and also causes stuttering when a window is moved between monitors or created.

For example, with 4*4=16 workspaces on 3 monitors, every time the popup
is shown 48 new subscribers are added to both signals. After a couple
days of uptime, there may be thousands.
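
The arithmetic above can be sketched as a small helper. The function name `leakedSubscribers` and the figure of 50 popup displays are assumptions chosen for illustration; only the 4*4 grid, the 3 monitors and the 48 subscribers per display come from the commit message.

```javascript
// Hypothetical helper: number of leaked handlers on EACH of the two
// signals, assuming one WorkspaceThumbnail per workspace per monitor
// is created (and never destroyed) every time the popup is shown.
function leakedSubscribers(rows, columns, monitors, popupShows) {
    const thumbnailsPerShow = rows * columns * monitors;
    return thumbnailsPerShow * popupShows;
}

console.log(leakedSubscribers(4, 4, 3, 1));  // 48
console.log(leakedSubscribers(4, 4, 3, 50)); // 2400
```

With these assumed numbers, a few dozen workspace switches are already enough to reach thousands of subscribers per signal.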

Fix that by destroying the WorkspaceThumbnail objects in `_onDestroy()`.
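
As a rough illustration of why the explicit destroy() matters, here is a minimal model in plain JavaScript. It is not the extension's code: `FakeDisplay`, `FakeThumbnail`, `FakePopup` and `handlersAfterShows` are hypothetical stand-ins, and the model simply assumes that a thumbnail connects to both signals when constructed and disconnects only in destroy().

```javascript
// Hypothetical stand-in for MetaDisplay: it only tracks connected handlers.
class FakeDisplay {
    constructor() {
        this._handlers = new Map();
        this._nextId = 1;
    }
    connect(signal, callback) {
        const id = this._nextId++;
        this._handlers.set(id, {signal, callback});
        return id;
    }
    disconnect(id) {
        this._handlers.delete(id);
    }
    handlerCount(signal) {
        let count = 0;
        for (const handler of this._handlers.values()) {
            if (handler.signal === signal)
                count++;
        }
        return count;
    }
}

// Hypothetical stand-in for WorkspaceThumbnail (assumption: it subscribes
// on construction and unsubscribes only when destroy() is called).
class FakeThumbnail {
    constructor(display) {
        this._display = display;
        this._ids = [
            display.connect('window-entered-monitor', () => {}),
            display.connect('window-left-monitor', () => {}),
        ];
    }
    destroy() {
        for (const id of this._ids)
            this._display.disconnect(id);
        this._ids = [];
    }
}

// Hypothetical stand-in for the popup; destroyItems toggles the fix.
class FakePopup {
    constructor(display, thumbnailCount, destroyItems) {
        this._destroyItems = destroyItems;
        this._items = [];
        for (let i = 0; i < thumbnailCount; i++)
            this._items.push(new FakeThumbnail(display));
    }
    _onDestroy() {
        if (this._destroyItems) {
            for (const item of this._items)
                item.destroy();
        }
        this._items = [];
    }
}

// Show and tear down the popup `shows` times, then count the handlers
// still connected to one of the two signals.
function handlersAfterShows(shows, destroyItems) {
    const display = new FakeDisplay();
    for (let i = 0; i < shows; i++) {
        const popup = new FakePopup(display, 48, destroyItems);
        popup._onDestroy();
    }
    return display.handlerCount('window-entered-monitor');
}

console.log(handlersAfterShows(10, false)); // 480
console.log(handlersAfterShows(10, true));  // 0
```

In this model, dropping the `_items` array alone leaves every handler connected, while destroying each item first brings the count back to zero.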
pobrn commented 2 weeks ago

I used sysprof, and that showed one promising candidate for the offending call stack.

[Image: flame graph from sysprof]

In one particular recording, about 13.5% of the sampled call stacks contained the g_signal_emit_by_name() call inside meta_window_update_monitor(). Since moving windows between monitors is where I experienced the stuttering, I suspected the cause was somewhere there.

Another sign was that gnome-shell was actually running, that is, it was scheduled on a CPU:

[Image: profiler marks and CPU]

You can see above that Meta::Later::invoke() takes too long, about 175 ms. That is the part under on_before_update.lto_priv.0 in one particular invocation of the call stack from earlier.

From the call stack, I was quite certain that the issue had to be with the window-{left,entered}-monitor signals. I am not exactly sure how I noticed it, but at some point I realized that showing the workspace switcher popup many times causes the stuttering when moving windows between monitors. So I first tested with this extension disabled, and I could not reproduce the issue.

I also modified gnome-shell to print log messages in {Workspace,WorkspaceThumbnail}::_windowEnteredMonitor(). From that it seemed that WorkspaceThumbnail objects were being leaked, or something similar. Then I narrowed it down to the part that shows the workspace popup: with show-popup == false, the issue disappeared.

Then I started using FinalizationRegistry to check what was getting leaked, and it became clear that the objects in WorkspaceSwitcherPopup::_items (the WorkspaceThumbnails) never get finalized. Looking at gnome-shell revealed that it individually calls .destroy() on every WorkspaceThumbnail once it is done with them. So I did the same here, and it appears to have solved the issue.
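
For anyone who wants to try the same approach, a FinalizationRegistry-based check might look roughly like the sketch below. The helper `createLeakTracker` and the label `thumbnail-0` are hypothetical names, not the code actually used here. Note that finalization callbacks run only if and when the garbage collector reclaims an object, so this is a debugging aid and not a guarantee.

```javascript
// Hypothetical debugging helper: remembers a label for every tracked
// object and forgets it once the engine reports the object as finalized.
function createLeakTracker(onFinalized = () => {}) {
    const pendingLabels = new Set();
    const registry = new FinalizationRegistry(label => {
        pendingLabels.delete(label);
        onFinalized(label);
    });
    return {
        // Start watching obj; label must be a value other than obj itself.
        track(obj, label) {
            pendingLabels.add(label);
            registry.register(obj, label);
        },
        // Labels of tracked objects that have not been finalized so far.
        pending() {
            return [...pendingLabels].sort();
        },
    };
}

// Usage: track an object, then inspect pending() after it should be gone.
const tracker = createLeakTracker(label => console.log(`finalized: ${label}`));
const stillReferenced = {name: 'thumbnail-0'};
tracker.track(stillReferenced, 'thumbnail-0');
console.log(tracker.pending());
```

Labels that stay in `pending()` long after their objects should have been released point at something still holding a reference, such as a connected signal handler.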

mzur commented 2 weeks ago

Thanks a lot for the writeup! This will help debugging similar issues in the future.