brndnmtthws / conky

Light-weight system monitor for X, Wayland (sort of), and other things, too
https://conky.cc
GNU General Public License v3.0

[Bug]: Memory issues when changing cursor theme #1771

Closed Penaz91 closed 6 months ago

Penaz91 commented 7 months ago

What happened?

This started when I noticed my X session freezing randomly for no apparent reason, with OpenBox consuming 10GB of RAM.

While investigating the issue, I found a way to reproduce it by using LXAppearance to change the mouse cursor theme.

After further examination, I narrowed down the issue to Conky (if I don't have it running, there is no freeze).

Using my own conkyrc or the default one makes no difference: Xorg stops responding, windows cannot be moved, and OpenBox quickly starts eating gigabytes of RAM.

Sadly I can get no stack trace because there is no crash.

Version

1.19.7

Which OS/distro are you seeing the problem on?

Arch Linux

Conky config

Not applicable. The issue arises with the example Conky config.

Stack trace

Not applicable, since there is no crash.

Relevant log output

No response

brndnmtthws commented 7 months ago

Can you try 1.19.8? This sounds related to #1748.

Penaz91 commented 7 months ago

I tried with a version compiled from the main branch (I think it was 1.20.0_pre?) and the issue persists.

Also, it seems I made a mistake (probably out of distraction): my distro is Artix Linux (an Arch Linux derivative without systemd). Apologies.

Penaz91 commented 7 months ago

I decided to take the plunge and do a preliminary session of git bisect. It seems that this is the commit that introduced the bug: aa6e61be2ecbc67ee3d97f6403e08723bf5f1b99

Here is the bisect log:

# bad: [998f7ff14d52a70aa31dfdfb66753d098cabc0be] Bump web deps, fix lint
# good: [bbdc7081aec27daafca07fc40523335a2ea0a992] Preserve ordering of generated Lua code (#1646)
git bisect start 'HEAD' 'bbdc7081'
# bad: [ac9d107e77cd62199a285f284fb5bd866cb4cf62] Remove build date and associated vars
git bisect bad ac9d107e77cd62199a285f284fb5bd866cb4cf62
# good: [a3fc61b0782800a42ff0829ec3cda65b50431bd7] Cleanup focus handling code on propagation
git bisect good a3fc61b0782800a42ff0829ec3cda65b50431bd7
# good: [3ff7095d8296764ae43149eaddc8c54732eb26f3] build(deps): bump DeterminateSystems/nix-installer-action from 6 to 7
git bisect good 3ff7095d8296764ae43149eaddc8c54732eb26f3
# good: [5e6db91b6d4608865d84cc83b16cad7bd7c8884c] build(deps): bump github/codeql-action from 2 to 3
git bisect good 5e6db91b6d4608865d84cc83b16cad7bd7c8884c
# good: [9ab5c1d099745b1e9b2b2417c13b8525af00bd19] build(deps): bump DeterminateSystems/magic-nix-cache-action from 2 to 3
git bisect good 9ab5c1d099745b1e9b2b2417c13b8525af00bd19
# bad: [a1ab393318794d0e78aa89aeeee484c8c81b1ade] Bump version
git bisect bad a1ab393318794d0e78aa89aeeee484c8c81b1ade
# bad: [aa6e61be2ecbc67ee3d97f6403e08723bf5f1b99] X11: Fix infinite loop from Expose events being returned
git bisect bad aa6e61be2ecbc67ee3d97f6403e08723bf5f1b99
# first bad commit: [aa6e61be2ecbc67ee3d97f6403e08723bf5f1b99] X11: Fix infinite loop from Expose events being returned

I'm not sure about my bisecting, since there were a couple of commits that didn't show the example configuration (only a small square). But this may be a starting point. I'll do a more thorough bisect (skipping such commits) as soon as possible.

Penaz91 commented 6 months ago

I have done a bit more experimentation and found out that this seems to be an interaction specifically between OpenBox and Conky.

I tested IceWM and it seems to work well (there are no freezes with my "change the mouse cursor" test). I have also done another bisect, which brought me to a commit where the CMake file was changed, which doesn't really make sense.

At the moment a WM change would be too much for me to undertake, but I think this may be the wrong place to file a bug, considering OpenBox has been unmaintained for years.

I'll leave the decision on whether this issue should be closed to you. Thank you.

Penaz91 commented 6 months ago

Can you try 1.19.8? This sounds related to #1748.

Forgot to say: I tested this and the issue persists.

I did another couple of bisects and they both pointed at commit aa6e61b. With that commit reverted, nothing is rendered except a small black square, but my Xorg session doesn't freeze.

brndnmtthws commented 6 months ago

@Suyooo @Caellian do either of you have any ideas on this issue?

Suyooo commented 6 months ago

No ideas about what could cause this. aa6e61b should have just restored propagation behaviour to what it was before.

Did you mark the commits with just the square rendering as good for your second bisect? In that case, the only idea I have is bisecting again, but only marking commits good if your config fully works and is updating, too - because my best guess is that both just the square rendering and the memory problem are different issues from the big input events PR, and aa6e61b just "uncovered" the other one...?

Otherwise, I still have both fix suggestions (A, B) from the original issue, I wasn't sure whether there would be any difference, but it might be worth a shot to try and see whether it makes any for you.

Caellian commented 6 months ago

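Can you provide some additional info:

  • Do you start conky from ~/.config/openbox/environment or ~/.config/openbox/autostart?
    • If so, does conky use up a lot of memory instead if you start it from a terminal?
  • Which compositor are you using?
  • Run latest conky with "-DD" option and paste the log if there are errors. Describe when they happen (focus change, mouse movement, non-stop, ...).
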
I think the commits with the small black square don't count because conky is stuck in a loop there. So aa6e61b only allows the problem to surface. The actual last working commit should be: f52c5cbd7b11ac36678dcd2b87016fbe8477290e

I suspect there's a memory leak in the code that does error reporting (introduced in f52c5cbd7b11ac36678dcd2b87016fbe8477290e) and that openbox somehow ends up owning memory allocated by conky. This would be indicated by a lot of errors being reported with the "-DD" option. Though I've just checked, and it seems that all paths clean up the appropriate memory allocations.

Alternatively, there seems to be a memory leak in openbox if conky causes it to spam errors.

I'll try running conky under OpenBox and replicating this in a week (or two), I've got two exams coming up.

Penaz91 commented 6 months ago

Did you mark the commits with just the square rendering as good for your second bisect? In that case, the only idea I have is bisecting again, but only marking commits good if your config fully works and is updating, too - because my best guess is that both just the square rendering and the memory problem are different issues from the big input events PR, and aa6e61b just "uncovered" the other one...?

Yes, I have marked those commits as "good" because they weren't triggering the issue. Turns out (using conky -DD) that conky actually freezes and doesn't start completely. Apologies for the misguidance.

I have bisected again, this time skipping the commits that are not strictly connected to the test I'm performing (seeing if something makes OpenBox eat all the available memory). Sadly, it could only narrow it down to a bunch of commits. I'm attaching the log below, with a couple of notes:

bisect.log

All the other skipped commits just show a tiny black square.

Otherwise, I still have both fix suggestions (A, B) from the original issue, I wasn't sure whether there would be any difference, but it might be worth a shot to try and see whether it makes any for you.

For the sake of curiosity, I decided to test both fix suggestions on the current HEAD and recompile. In both cases, the issue persists.


Can you provide some additional info:

  • Do you start conky from ~/.config/openbox/environment or ~/.config/openbox/autostart?
    • If so, does conky use up a lot of memory instead if you start it from a terminal?
  • Which compositor are you using?
  • Run latest conky with "-DD" option and paste the log if there are errors. Describe when they happen (focus change, mouse movement, non-stop, ...).

Since some crashes were happening in the latest releases, I moved conky from being started in ~/.config/openbox/autostart to a process supervisor (which gives up after 3 crashes). After this issue started happening, I stopped letting conky auto-start.

All the tests have been performed by starting conky from a terminal. OpenBox still ends up being the memory hog.

My compositor of choice is Picom, but it has been turned off during these tests (I also tried using only startx, xterm and openbox without any change in my findings)

I suspect there's a memory leak in the code that does error reporting (introduced in f52c5cb) and that openbox somehow ends up owning memory allocated by conky. This would be indicated by a lot of errors being reported with the "-DD" option. Though I've just checked, and it seems that all paths clean up the appropriate memory allocations.

I started conky -DD and there are no errors, it just stops updating. (Used conky -DD > conky.log 2>&1 and then used tail -f in a TTY to follow the log)

Alternatively, there seems to be a memory leak in openbox if conky causes it to spam errors.

I tried starting openbox --debug --debug-focus --debug-session --debug-xinerama through startx (with output redirected to a log) to see whether anything would show up, but nothing seems to be logged. Again, it just stops working. I'm not sure if I redirected stderr to stdout though, so I will retry and edit this reply.

EDIT: Confirmed that both OpenBox and Conky "just stop working" without spamming errors in their logs.


I'll try running conky under OpenBox and replicating this in a week (or two), I've got two exams coming up.

Best of luck for your exams!

Caellian commented 6 months ago

Recreated on main (4eecf3b) with Arch+lemurs+openbox+conky+lxappearance as described, with BUILD_MOUSE_EVENTS disabled.

The CPU usage of openbox jumps to 70%. Then,

This occurs at the moment the "Apply" button is clicked. Changing the icon theme has the same effect.

If conky is frozen, further changes to the icon/cursor theme don't do anything (though CPU is still at 70%). I have an alright laptop, so 70% only seems plausible if it's a busy wait. Conky's CPU usage is quite high (50%) as well, so it seems like conky is bouncing back and forth between 2 render calls with some openbox code. I'm assuming it's event propagation.

EDIT: Conky propagated a ClientMessage event, which is "primarily used for transferring selection data" but also for inter-client communication. Not sure why LXAppearance threw that curveball at us, but skipping propagation of that event fixes the issue. I'm working on a PR to fix this and to avoid forwarding some other event types we shouldn't be forwarding.
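
The idea of skipping propagation for event types that were never meant to be forwarded can be sketched as an allow-list. This is only an illustration under stated assumptions, not conky's actual code: `should_propagate` is a hypothetical helper, the choice of pointer events as the only forwarded types is an assumption, and the constants are redeclared with the core X11 protocol values so the snippet compiles without X11 headers.

```cpp
// Event type codes with the same numeric values as the core X11 protocol
// (<X11/X.h>). Redeclared here so the sketch builds without X11 headers.
enum XEventType : int {
  kKeyPress = 2,
  kKeyRelease = 3,
  kButtonPress = 4,
  kButtonRelease = 5,
  kMotionNotify = 6,
  kExpose = 12,
  kClientMessage = 33,
};

// Hypothetical helper (not part of conky): returns true only for pointer
// events that a desktop window underneath could plausibly want. Every other
// type, including ClientMessage, is dropped instead of being forwarded.
bool should_propagate(int event_type) {
  switch (event_type) {
    case kButtonPress:
    case kButtonRelease:
    case kMotionNotify:
      return true;
    default:
      return false;
  }
}
```

With an allow-list like this, an unexpected event type is ignored by default, so a new kind of message from another client cannot start a forwarding loop. A deny-list that only names ClientMessage would also address this particular report, but it would keep forwarding any other type the code does not anticipate.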

Penaz91 commented 6 months ago

  • the rest of openbox freezes to input, but apps seem to continue rendering fine (btop as well as conky) and the memory usage grows.

This is what usually happens to me.

Not sure why LXAppearance threw that curveball at us

LXAppearance is just a way I managed to reliably reproduce the issue I was randomly having throughout my day: I was finding my OpenBox session frozen and couldn't move windows or interact with them (even after just leaving the laptop on a lockscreen for about an hour while having lunch).

It may be that LXAppearance throws some events in order to "instantly change the mouse cursor" instead of requiring a logout/login.

Conky propagated a ClientMessage event, which is "primarily used for transferring selection data" but also for inter-client communication

I'm wondering (out of mere curiosity) what kind of events Conky may be propagating while a computer is not being used by someone, as well as why the issues started now (provided my last bisecting is correct).


Little Offtopic: hope your exams went well, Caellian!

Caellian commented 6 months ago

I'm wondering (out of mere curiosity) what kind of events Conky may be propagating while a computer is not being used by someone, as well as why the issues started now (provided my last bisecting is correct).

None on my dev branch currently. If another client sends an event our way that we're not processing, we forward it to the desktop. The intent, though, is mostly to forward mouse movements and such to the desktop so that desktop widgets (e.g. Plasma) and selection of desktop icons work even if conky is being drawn over them. Conky doesn't handle drag-n-drop, so the code didn't expect ClientMessage; it forwarded the event to the desktop, and it was returned to conky.
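
The bounce described above can be illustrated with a toy model. Everything in it is an assumption made for illustration: `count_hops` is a hypothetical function, the desktop is modelled as consuming every event except ClientMessage and sending that one back, and no real X11 calls are involved.

```cpp
// Event type codes with the same numeric values as the core X11 protocol.
enum XEventType : int { kMotionNotify = 6, kClientMessage = 33 };

// Toy model (hypothetical, no real X11 involved): counts how many times one
// event is handed from one client to the other before somebody stops passing
// it on. The count is capped at max_hops so the unfiltered case terminates.
int count_hops(int event_type, bool conky_drops_client_message, int max_hops) {
  bool at_conky = true;  // the event first arrives at conky
  int hops = 0;
  while (hops < max_hops) {
    if (at_conky) {
      // conky does not handle the event itself; it either drops it or
      // forwards it to the desktop.
      if (conky_drops_client_message && event_type == kClientMessage) break;
    } else {
      // Assumed desktop behaviour: consume everything except ClientMessage,
      // which is sent back to conky.
      if (event_type != kClientMessage) break;
    }
    at_conky = !at_conky;  // hand the event to the other side
    ++hops;
  }
  return hops;
}
```

In this model a MotionNotify makes a single hop and stops at the desktop, an unfiltered ClientMessage keeps bouncing until it hits the cap, and a filtered ClientMessage never leaves conky. That matches the busy-wait symptom reported earlier in the thread, though the model says nothing about why OpenBox's memory usage grows.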

If you have privacy concerns, it's far from telemetry. A malicious conkyrc mouse hook could track the mouse position only while the cursor is directly over conky. I wanted to add keyboard event support as well, but that's almost impossible to do on Wayland without consuming all input while the layer-shell app is running, so I'm holding off on that. If that were added, it would probably capture globally but only forward to the script if the cursor is over conky. Basically, I'd like to make conky support Rainmeter-like functionality which is my motivation for the original PR.

It may be that LXAppearance throws some events in order to "instantly change the mouse cursor" instead of requiring a logout/login.

Even without handling it, the cursor changes instantly. I'm not familiar enough with the internals of the GTK API (used by LXAppearance to set the cursor) to know what info the event is supposed to contain.


hope your exams went well, Caellian!

Went so-so. I passed 2 of the 4, and I hope to pass the last two soon.

Penaz91 commented 6 months ago

If you have privacy concerns, it's far from telemetry.

Oh no, I don't have any privacy concern over Conky. It makes sense that Conky would capture events if the mouse was over it, I was just genuinely curious about the inner workings of everything involved.

My fear is that this may be hiding something that could be the source of bugs that are harder to reproduce, that's all.

I'd like to make conky support Rainmeter-like functionality which is my motivation for the original PR.

Does this mean that Conky may move towards being more of a widget creation system (kinda like EWW, for instance) to also create bars, etc... instead of a pure system monitor? That sounds like an interesting direction to take.

Even without handling it, the cursor changes instantly. I'm not familiar enough with the internals of the GTK API (used by LXAppearance to set the cursor) to know what info the event is supposed to contain.

I admit I was extremely lucky seeing the Mouse Events commit and trying to use LXAppearance to trigger the bug I'm experiencing. Anything in the realm of "how GTK works" and similar is way outside the realm of my knowledge.

Caellian commented 6 months ago

Does this mean that Conky may move towards being more of a widget creation system (kinda like EWW, for instance) to also create bars, etc... instead of a pure system monitor? That sounds like an interesting direction to take.

Not sure, it's not up to me. Many people used weird/complicated setups in order to make conky react to cursor events so it seemed like a good idea to add support for those directly to conky. Keyboard is a bit more tricky to support due to different layouts and what not, so I don't feel keen on creating a PR with that yet. If I did however, it would be up to Brandon to decide whether it fits.

I'm also passively working on a similar project that focuses more on the interactive side and Wayland, so it might be better to pass data from conky to something like it to get the best of both worlds while not making conky too bloated. As I've said though, there are a lot of limitations to what layer-shell apps can do with input in Wayland currently, so my clone doesn't do much more than provide Skia bindings for Lua.