FelixKratz / SketchyBar

A highly customizable macOS status bar replacement
https://felixkratz.github.io/SketchyBar/
GNU General Public License v3.0
5.62k stars 86 forks source link

Bar slows down after repeated display changes #336

Closed oliversturm closed 1 year ago

oliversturm commented 1 year ago

Please note that the title is a guess. I notice this redrawing behavior when I plug or unplug external screens (which is how I did it for the video below), but it also happens when the Mac comes back from sleep. In that case, since all screens are already attached, the bar redraws several times on all three screens (two external, plus the internal display).

My configuration is super simple, I use standard values for almost everything. What remains is this:

FONT="PragmataPro Liga"
BAR_COLOR=0xff239cd3

bar=(
  color=$BAR_COLOR
)

sketchybar --bar "${bar[@]}"

defaults=(
  updates=when_shown
  icon.font="$FONT:Bold:16.0"
  label.font="$FONT:Semibold:13.0"
)

sketchybar --default "${defaults[@]}"

Any guesses what else I may try? Assuming this is not standard behavior? Oh, I'm running Ventura 13.2.1 on my M2 Mac (edit: and SketchyBar 2.14.0).

https://user-images.githubusercontent.com/4730969/223405525-9ff9e709-2279-4072-9622-bd7af8d7425a.mp4

FelixKratz commented 1 year ago

I have tried to reproduce with this config:

BAR_COLOR=0xff239cd3

bar=(
  height=40
  color=$BAR_COLOR
)

sketchybar --bar "${bar[@]}"

defaults=(
  padding_left=5
  padding_right=5
)

sketchybar --default "${defaults[@]}"

SPACE_ICONS=("1" "2" "3" "4" "5" "6" "7" "8" "9" "10" "11" "12" "13" "14" "15")

sid=0
spaces=()
for i in "${!SPACE_ICONS[@]}"
do
  sid=$(($i+1))

  space=(
    associated_space=$sid
    icon="${SPACE_ICONS[i]}"
  )

  sketchybar --add space space.$sid left    \
             --set space.$sid "${space[@]}"
done

This is what it looks like when disconnecting and connecting a display:

https://user-images.githubusercontent.com/22680421/224250018-4f5388a0-757d-46df-87be-7cf3b7b5faba.mp4

Although the video doesn't capture it, there is a slight flickering on display connect, disconnect because I reset all bars on these events. Could you try with the above config and sketchybar started from the commandline to see if there is some output or even a crash?

oliversturm commented 1 year ago

So... this was a bit more difficult than I thought. Couple of unrelated findings:

Now I just built from source and suddenly I can use -c. Back to the original problem.

I ran with your test config file and observed the behavior. The video below shows it pretty well -- I detach and reattach one of my two external screens repeatedly. I had the impression that things got worse over time, and I was reminded that I previously saw Sketchybar consume over 1GB of memory... so I recorded the process info in Activity Monitor together with the reaction of the bar itself. Please let me know anything else I can tell you -- would be great to find out what this is all about!

https://user-images.githubusercontent.com/4730969/224287156-d5676c9f-cd0b-4301-88dd-20967c637ffa.mp4

FelixKratz commented 1 year ago

So... this was a bit more difficult than I thought. Couple of unrelated findings:

  • Why no --help? I had to read the source to find out how to pass an option for a non-default config file.

It is not documented because it is not yet released. I have added this feature a few days ago: https://github.com/FelixKratz/SketchyBar/commit/c406cfab4d7128ef85c08baa7222b26ef394265c

Feel free to create a PR for the help argument. It certainly is a good idea.

  • The brew build is broken. It does not accept the -c or --config options, and it also never outputs messages like "Too few arguments for argument 'config'", i.e. those that use printf. No clue why... brew version says 2.14.1 but it clearly has issues.

As I said, it is an unreleased, undocumented feature I have added recently. This is the difference between current brew release and current master: https://github.com/FelixKratz/SketchyBar/compare/v2.14.1...master

Now I just built from source and suddenly I can use -c. Back to the original problem.

I ran with your test config file and observed the behavior. The video below shows it pretty well -- I detach and reattach one of my two external screens repeatedly. I had the impression that things got worse over time, and I was reminded that I previously saw Sketchybar consume over 1GB of memory... so I recorded the process info in Activity Monitor together with the reaction of the bar itself. Please let me know anything else I can tell you -- would be great to find out what this is all about!

Possibly a memory leak. There is a slow memory leak in the current brew release which I fixed here (which is probably unrelated): https://github.com/FelixKratz/SketchyBar/commit/ce4724ff2236e8166a99ef8e88bca02e397dc599

I will see if I can reproduce when detaching and re-attaching frequently. But some small amount of flickering is perfectly normal.

FelixKratz commented 1 year ago

Found a memory leak that appears when not setting the bar to be sticky. Can you try with the new patch: 8bfad71 ?

FelixKratz commented 1 year ago

Created a new brew release just now, should be fixed.

oliversturm commented 1 year ago

Sorry, was busy with other stuff. I have installed the latest from brew and it certainly doesn't make anything worse. I can still see that the bar flickers off and on several times when screens come and go -- will see over time whether the behavior degrades as it seemed to do previously.

There's still something odd about the brew build. I built your latest from master and I get

> bin/sketchybar --help
[!] Unknown domain '--help'

When I do the same with the updated brew version, I get

> sketchybar --help

... IOW, nothing. Weird.

The good news is that the brew version now outputs an error when I use -c without a further argument -- sorry, I hadn't checked the commit history to notice that this was a recent addition.

FelixKratz commented 1 year ago

There's still something odd about the brew build. I built your latest from master and I get

> bin/sketchybar --help
[!] Unknown domain '--help'

When I do the same with the updated brew version, I get

> sketchybar --help

It has nothing to do with the brew version and the compiled version. You get the invalid domain error when sketchybar is running. If it is not running you get nothing. Thats simply because all arguments except -v and -c are passed as a message to the main sketchybar process. The main process responds if the domain exists or any other proper response. If there is no main process the message is not send and there is no response. Hence nothing will be printed. I could print an error that sketchybar is not running.

oliversturm commented 1 year ago

You get the invalid domain error when sketchybar is running. If it is not running you get nothing.

Oh, I see. Fair enough. Haven't read all the source code yet ;)

oliversturm commented 1 year ago

After a while -- I can still see this problem getting worse over time. As I mentioned before, the flicker does not only occur when I attach or detach screens, but also when the screen(s) come back on after they went to sleep (when the screen is locked in my absence). This allows me to observe, over the course of a day, how the behavior changes.

On a couple of occasions I have also seen that the bar is not visible anymore after I detached the external screens. In that situation the bar is still running, but I need to kill it to actually see it again.

Finally, I think there may be more memory leaks. I'm getting in the habit to restart the bar once a day or so to keep it running properly.

FelixKratz commented 1 year ago

What you could do to test the hypothesis of further memory leaks is to attach the xCode leak instrument to sketchybar. It should show you if the process has actually leaked memory and how much:

Screenshot 2023-03-20 at 12 09 04 Screenshot 2023-03-20 at 12 09 36

I will have a look what could cause sketchybar to not be shown anymore on some display events.

oliversturm commented 1 year ago

Played around with Xcode a bit (I don't normally work with it) and managed to run the profiler. No leaks found so far, but for fun I tried to detach and attach a screen and the allocation graph looks like this:

image

I can't judge if that's a problem -- there's lots of RAM available -- but it seems a bit unusual... and since we're talking basic C, no garbage collection or other modern magic, should this extra allocation occur? From previous cursory glances at the activity monitor, I believe this will happen every time I detach/attach.

Anyway, I'll leave the bar running with profiler for a while and see what I find.

FelixKratz commented 1 year ago

Yeah, so the extra magic that happens in the background is vm allocations by the system. Specifically the WindowServer allocates vm space for the backing store of the windows. AFAIK this is handled by the system and since there are no leaks (which I would be responsible for) there seems to be something odd going on with the backing stores not being released properly (or at least they are not released immediately).

I have found this one function:

extern CGError SLSSetWindowToReleaseBackingOnOrderOut(int cid, uint32_t wid, bool release);

but setting this to true does not change anything regarding the mach_vm allocations.

oliversturm commented 1 year ago

Several findings. I'm watching the "persistent" and "# persistent" columns now, since I understand that these show memory and allocations that have not been freed yet. I assume that we would expect to see these numbers remain largely static, since there is no reason to keep allocating (and not freeing) additional memory after an initial startup period.

sketchybar --add alias "Control Center,Clock" right
sketchybar --add alias "Control Center,WiFi" right
sketchybar --add alias "Control Center,Sound" right
sketchybar --add alias "AlDente,Item-0" right
sketchybar --add alias "Little Snitch Agent,Item-0" right

I noticed that the "# persistent" column was growing by one each second even if I was not doing anything. I believe the update frequency for the aliases is once per second, so I restarted without any aliases and this problem went away.

Regarding your last comment -- I had been watching the row "all heap & anonymous VM" so far. But I repeated some of my tests and watched the rows "all heap allocations" and "all anonymous VM" instead. The value in the row "all anonymous VM" remained the same, or varied by very small values (it went up by one at some point when I wasn't doing anything). It's mainly the value "all heap allocations" that grows.

FelixKratz commented 1 year ago

You can switch to the call tree view, there you see where exactly those allocations come from: Screenshot 2023-03-20 at 16 26 33 There it is clear that the system (SkyLight, CoreGraphics) does these allocations during the CGContextClearRect framework call. These bytes are not leaked and the system (through vm) has control over these allocations.

oliversturm commented 1 year ago

Right, thanks. I can see that... pretty disconcerting behavior in any case, if you see memory consumption of this tiny bar creep beyond the 1GB mark after a while. And I don't think I have any other app running that behaves this way.

Anyway... if you say there's no leak I'll gladly believe you. Then the weird flickering redraw behavior must have a different reason, but it's definitely still there.

FelixKratz commented 1 year ago

It could still be that this is in fact a problem... but I am not sure how to solve it. It seems that the memory is not properly given back on window destruction... I will investigate further.

FelixKratz commented 1 year ago

I think I have found the problem. Calling SLSReleaseWindow does not actually release the window on Ventura, while it does so on Monterey and below. The implementation seems to have changed and the deallocation of vm space is no longer performed in this function. Finding an alternative function (or figuring out why SLSReleaseWindow does not properly release the window) will probably take some time.

oliversturm commented 1 year ago

Thanks for the update! That's a weird change. Maybe a bug? But that would presumably affect many applications...

Edit: what strange API function is SLSReleaseWindow? For some reason I find almost no results when I search for it, with the exception of a header file from the yabai project. Is it undocumented?

Second edit, with apologies. It's a Skylight function, alright. But... still I have no clue where to find any docs. Very strange.

Third edit: private framework, I see... such fun. Sorry for all the questions.

FelixKratz commented 1 year ago

Ok good news, I have looked through the implementation of these private functions and found that there is a retain problem caused by calling SLSTransactionOrderWindow on a window. This is a problem in the SkyLight framework, but since it is a private framework it is not really a bug, more like an unexpected behavior. I was able to isolate, reproduce and mitigate the problem in a minimal code example. In SketchyBar it is not so simple as to replace SLSTransactionOrderWindow with its immediate counterpart SLSOrderWindow since then flickering can be seen occasionally. Maybe there is another way to fix the retain problem while keeping SLSTransactionOrderWindow around... I will have to do some more investigations.

oliversturm commented 1 year ago

I just left sketchybar running since yesterday around the same time -- 24 hours, but the machine was sleeping overnight. Just now it became unbearable, for a reason that may or may not be related to the issues we've been discussing.

What happens is that sketchybar slows down switching between spaces. When I Cmd-Tab between apps that run in different spaces, there is a very noticeable delay. The delay is two-part. First, nothing happens for a short while - a good part of a second, I think. Then, the space switches, and the second part of the delay is the slow redrawing of the bar itself. You've seen my screenshots, I have these nine space symbols on the bar, and I can basically see them appear one by one. The process altogether doesn't take super long, but it's very annoying anyway if you need to switch back and forth.

Anyway... I was going to wait until I can take a screenshot of the bar using 1GB of memory, but now I killed it at 545MB and after the restart space switching is fast and snappy again.

image

I'm not sure whether all this comes down to the same issue(s), and I wonder how I'm apparently the only one with these persistent problems.

FelixKratz commented 1 year ago

Yeah this sounds like what I was able to reproduce by intentionally forcing window refreshes in a loop. I think this will be fixed once I get behind all of this.

Maybe your monitor connects and disconnects several times when the system sleeps which provokes this problem.

oliversturm commented 1 year ago

Maybe your monitor connects and disconnects several times when the system sleeps which provokes this problem.

That is possible, but I also observe that things get worse and worse during the day as I lock and unlock the machine. When I lock it, the monitors go to sleep, and then the reconnect when I come back and unlock. Then I usually detach the machine a couple of times during a day... it adds up.

FelixKratz commented 1 year ago

I think this commit should fix the problem: 33f815c (funnily enough, if you look at the changes)

Sadly I did not find a way to achieve this while keeping the transactional window ordering around. Can you test the latest master?

brew uninstall sketchybar
brew install sketchybar --head
brew services restart sketchybar
oliversturm commented 1 year ago

I repeated some of my previous tests with the master version and there are definitely improvements. Running with only the space indicators and no aliases, here is the memory curve after a display sleep/wake cycle:

image

You can see something that never happened before: memory consumption decreased at the point where the screen was detached.

Now I tried to run with two aliases (wifi and clock), which I also did previously. First, I noticed that memory consumption (and the "# persistent" value) increased when I moved the mouse between displays/spaces -- much like it did before. But one thing was new: I also saw that the "# persistent" value sometimes decreased when I stopped moving the mouse for a while. And then, the display sleek/wake cycle again resulted in a decrease of memory consumption:

image

I'll leave the bar running now and check back on the overall consumption and behavior in a little while.

I did not find a way to achieve this while keeping the transactional window ordering around

I'm not sure how this change would manifest itself during use. I'd still like to know how you found out what functions like SLS(Transaction)OrderWindow do -- can't find docs or descriptions in search engines!

FelixKratz commented 1 year ago

Okay that sounds promising so far.

I'm not sure how this change would manifest itself during use. I'd still like to know how you found out what functions like SLS(Transaction)OrderWindow do -- can't find docs or descriptions in search engines!

There is no description or documentation for these functions. These are functions apple has designed for use in their own applications, like their own menubar and such. These functions can be very powerful and some things simply can not be done with the public frameworks apple exposes.

The signatures of these functions and their behavior can be constructed by reading the binary instructions. Basically: everything is open source if you know how to handle raw binary instructions (or assembly). Actually this the reason for this project altogether since I was fascinated by yabais use of private frameworks and wanted to have my own playground.

oliversturm commented 1 year ago

The signatures of these functions and their behavior can be constructed by reading the binary instructions

No worries, I understand what you're saying. But if all you have is function signatures, I wonder how you know what the function with Transaction in the name does compared to the one without... I've probably not spent enough time thinking about it, but I haven't had any ideas yet. And if somebody made the effort to read the disassembly of these functions, I'd still hope they'd write down their results somewhere for others to find!

FelixKratz commented 1 year ago

But if all you have is function signatures, I wonder how you know what the function with Transaction in the name does compared to the one without...

The thing is, that this is not all I have. I have the framework binaries that contain the code which is executed when these functions are called. So I really have everything the functions do, e.g. here is what SLSOrderWindow does: Screenshot 2023-03-22 at 13 58 35 It basically calls SLSOrderWindowList, such that this would be the next function one would have to understand and deduce the functionality off.

I think there is still some other retain problem in the event responder chain, but the leaked bytes are nowhere near as many as in the SkyLight retain problem.

oliversturm commented 1 year ago

... if somebody made the effort to read the disassembly of these functions

As I said! Kudos for making that effort. You should start a wiki and document this stuff, so others don't have to do it all over again.

So far, no significant change. I started this morning with this (after the tests described above):

image

Right now I have this:

image
oliversturm commented 1 year ago

One more update this morning: things are still good. Here's the same activity monitor screenshot:

image

The thing that has not changed is the original subject of this report:

Bar redraws repeatedly when screen config changes

This problem is less noticeable now because the flickering doesn't take as long as it did previously. Certainly much easier to ignore, unless you're keen to track this down.

I'll leave it to you to close this issue or keep it open for further investigation. Please let me know if I can supply any more information.

FelixKratz commented 1 year ago

Thanks for the update.

The flickering you see on monitor connect and disconnect is an artifact created by how I handle display added/removed/resized events and could be circumvented by implementing a better display change system, i.e. I currently destroy all bars and recreate them from scratch on any of those events. The thing that causes the flickering is that there usually are multiple events that correspond to a singular display change (e.g. display added, display moved and eventually even display resized right after each other on a single display connect leading to all bars on all displays being destroyed and recreated several times). This is also the reason why display events would cause the massive memory problems related to the skylight retain problem that we discussed.

oliversturm commented 1 year ago

Maybe a basic "debouncing" of the various events would help, without having to track specifically through the sequence -- assuming the sequence is perhaps not always exactly the same, especially with multiple screens generating events at more or less the same time. When I plug my monitors in or wake up from sleep, the Mac takes several seconds to sort itself out. If Sketchybar took a half second somewhere in there to debounce events and then react once to redisplay itself, that would not add any relevant delay but probably look nicer and clean up the process. Just an idea for a pragmatic workaround.

FelixKratz commented 1 year ago

I made the display event handling less aggressive, only the display added/removed events will now lead to a full reset. Further improvements are still possible but then complexity of book keeping rises a bit.

You can try it by running:

brew upgrade sketchybar --fetch-HEAD

I will need to test this new display implementation a bit before making a new release.

oliversturm commented 1 year ago

Certainly much easier on the eye!

I tried to unplug a display, this redisplayed the bar only once, no flicker. Then I plugged the display back in, and this caused a multi-stage redisplay:

I'll leave this running and let you know if I see any new issues.

FelixKratz commented 1 year ago

The new release contains these fixes. To switch back to the stable releases simply uninstall and install sketchybar again without the --head argument.

koekeishiya commented 11 months ago

@FelixKratz

Ok good news, I have looked through the implementation of these private functions and found that there is a retain problem caused by calling SLSTransactionOrderWindow on a window. This is a problem in the SkyLight framework, but since it is a private framework it is not really a bug, more like an unexpected behavior. I was able to isolate, reproduce and mitigate the problem in a minimal code example. In SketchyBar it is not so simple as to replace SLSTransactionOrderWindow with its immediate counterpart SLSOrderWindow since then flickering can be seen occasionally. Maybe there is another way to fix the retain problem while keeping SLSTransactionOrderWindow around... I will have to do some more investigations.

Is SLSTransactionOrderWindow still broken on Sonoma, and have you seen if SLSTransactionOrderWindowGroup exhibhits the same issue?

FelixKratz commented 11 months ago

@koekeishiya

Is SLSTransactionOrderWindow still broken on Sonoma, and have you seen if SLSTransactionOrderWindowGroup exhibhits the same issue?

Sonoma (or some newer version of Ventura) has fixed the memory problems with SLSTransactionOrderWindow. And SLSTransactionOrderWindowGroup does work fine as well.

Maybe I will start using it again for window ordering on Sonoma and up.