pop-os / system76-scheduler

Auto-configure CFS and process priorities for improved desktop responsiveness
Mozilla Public License 2.0
545 stars 33 forks source link

system76-scheduler 2.0.0 #78

Closed mmstick closed 1 year ago

mmstick commented 1 year ago

Closes #76 Closes #74 Closes #73 Closes #70 Closes #69 Closes #55

13r0ck commented 1 year ago

This is really cool

jacobgkau commented 1 year ago

If I run nice -n 12 stress -c 1, the process starts with niceness 12. If I switch focus away from the terminal (and wait a bit), the niceness goes down to 6; switching back to the terminal then causes it to go down to 0.

Screenshot from 2023-03-09 16-52-40

I don't think this is actually a regression, but are manual nice values (set manually as opposed to through the scheduler) supposed to work? Wasn't sure if nice in the default config's exceptions was supposed to mean this should work or not.

mmstick commented 1 year ago

I'm working on a solution for that. A way to describe that an exception should apply to all descendants of an exception, and tracking excluded PIDs.

jacobgkau commented 1 year ago

Confirmed nice not sticking isn't a regression (it didn't work to begin with before, so it's closer to working now.)

@mmstick If I comment out the background/foreground assignments in config.kdl, then none of the other assignments seem to apply anymore. Can you confirm if there's some other way that a user should turn off the background/foreground priority switching if they still want to use manual assignments? (This has come up in previous versions as a desired use case, and currently works in the released version.)

jacobgkau commented 1 year ago

It occurred to me that since exceptions aren't working anyway, maybe setting foreground and background to both have the same neutral settings would be the workaround to "not use" that part of the scheduler. If that's the case, then we might want to consider moving those two assignments to the top of the config file/section and adding an explanation to the comments that they need to be set for the rest to work.

mmstick commented 1 year ago

@jacobgkau I've been working on a solution for handling priority for exceptions, and the latest push should resolve that. The scheduler can now detect if a process was exec'd from a nice command and ignore priority assignments.

I've also fixed the issue with foreground profiles interfering with other profiles. Disabling them will no longer cause other profile to also be disabled.

jacobgkau commented 1 year ago

On 330d0c3, I'm seeing processes only get the foreground priority and never the background priority, whether they're focused or not. (This happens even after deleting the /etc/system76-scheduler/ directory; if I open e.g. Nautilus and then switch focus to Terminal, Nautilus still keeps the default foreground niceness of 0.)

nice values set by launching with nice -n ... do stick now, so that seems to be fixed.

While testing, I noticed Kdenlive (installed via Flatpak) gets a niceness of -3, which seems to be the configured niceness for desktop-environment in the default config?

jacobgkau commented 1 year ago

Seems it might be any Flatpak app getting classified as desktop-environment? LibreWolf and VS Code installed via Flatpak have similar issues.

For VS Code, that app spawns multiple code processes. I see one that gets -3 (desktop-environment), several that get 0 (foreground), and one that gets 9 (session-services).

For LibreWolf, both the main librewolf process and the child Isolated Web Content, Privileged Content, and WebExtensions processes get -3.

mmstick commented 1 year ago

Flatpak process are now fixed. The issue was that the parent process (bwrap) was not known to the scheduler at the time it was notified of the process being created. The scheduler will now attempt to add parent processes if they're missing on adding a new process.

Foreground process management has also been refactored . The old code was not a good fit for the current logic, since processes are now managed internally in a different way.

Although foreground process detection can break if the service is being restarted at the same time pop-shell is trying to send a new foreground PID to it.

jacobgkau commented 1 year ago

I thought I saw foreground/background detection working again on 300e980, but now I'm not seeing it work anymore (tried rebooting before and after removing my /etc config, still nothing.) I might be running into the case you're aware of where it can break.

If I disable foreground/background detection (by commenting those lines out), at least some of the other assignments are still taking place now.

When background/foreground are commented out, even our deb-packaged Firefox gets -3 (desktop-environment) instead of a default 0 if I launch from its dock icon, but it gets 9 if I launch from the terminal (presumably inherited from gnome-terminal-server which is classified as session-services.) Is this how it's supposed to work when foreground/background don't have assignments?

A custom assignment group called custom (in /etc/system76-scheduler/assignments/custom.kdl) containing stress and assigned a different nice value in config.kdl does not seem to apply to a plain stress -c 1 process anymore (I thought this was working earlier in the PR.)

jacobgkau commented 1 year ago

The Kdenlive flatpak is working with foreground/background detection on 3a1617f. Nautilius seems stuck with priority 9 (which should be session-services with the default config.) Gedit stars off at priority 0 (foreground), then it also switches to 9. Seems like something still isn't working right.

mmstick commented 1 year ago

@jacobgkau The new change to the config makes it possible to add exclude rules to a profile, and the config now has an exclude that'll fix gedit and nautilus.

jacobgkau commented 1 year ago

It's taking more time than expected for foreground/background priorities to apply when I switch between e.g. Nautilus and Firefox, sometimes 3-5 seconds or longer. On the current stable system76-scheduler, these usually apply within a second.

jacobgkau commented 1 year ago

On 2546566, it's still taking 3 updates of top to reflect foreground niceness changes, versus 1 on the released version of system76-scheduler.

mmstick commented 1 year ago

Delays for foreground process assignments, and assignments of new processes, is now fixed. Changes will fix the main loop being put to sleep between process changes, and remembers assigned priorities so that it won't have to look up a compatible profile a second time.

mmstick commented 1 year ago

stress's cgroup matches the session services profile because it's assigned by the *.scope rule. I think then that we either remove that cgroup rule or expect stress be treated like a service.

All processes inherit their priorities from their parent before they are spawned, so it's expected that any process spawned from gnome-shell will have the same priority as gnome-shell until the scheduler overrides it.

I can check if foreground profile is getting assigned to an exempted process somewhere. nice can be unreliable because it's short-lived and the PID is unchanged when it morphs into the command defined for it.

jacobgkau commented 1 year ago

stress's cgroup matches the session services profile because it's assigned by the *.scope rule. I think then that we either remove that cgroup rule or expect stress be treated like a service.

session-services should have a niceness of 9, right? I don't understand if this is supposed to explain why it was being set to 0. (Starting with 9 briefly, I assumed, was because I was starting stress from the terminal and gnome-terminal-server seems to be a session-services process.)

mmstick commented 1 year ago

@jacobgkau With the recent change, are you still seeing issues with stress and nice?

jacobgkau commented 1 year ago

Just talking about stress (not nice), with the default configuration, I'm seeing it always get -1 now, whether the terminal window is focused or not. top, on the other hand, switches between -1 and 6 based on if any terminal window is focused or not. Is this test case valid? (I'm expecting stress to also switch to 6 when I focus a different application, but it's not.)

mmstick commented 1 year ago

@jacobgkau The issue with stress's dispatched child processes was that execsnoop-bpfcc cannot see processes created from a fork that didn't also call exec after the fork. I've made some changes to look for missing descendants of processes when assigning a new process, or when setting the foreground process. So all of the children of the main stress process are getting their foreground and background assignments now.

I've also set a 10 second delay on service start to give system processes enough time to get their cgroups assigned to them by systemd, and a fix to the priority assignment logic to avoid assigning a priority from a condition when the cgroup is not yet defined. So any processes lacking a cgroup will get their cgroups set on the next process map refresh (up to 60s by default).

jacobgkau commented 1 year ago

At first, the most recent commit seemed to be good. Focus detection with stress is working more reliably reliably now. A custom profile also applies (I see stress -c 1 get the foreground or background niceness briefly, but the custom niceness applies after no more than 1 second.) The nice command with stress seems to apply on top of the foreground value (-n 18 gave a niceness of 17), and is at least behaving consistently now.

I still started seeing stress -c 1 get set to niceness 0 (with no profiles configured for that niceness) eventually. The issue stopped when I reset my configuration (removed the /etc configuration, leaving only the default /usr/share configuration.) I replicated the issue by only creating the following file in /etc/system76-scheduler/assignments/custom.kdl:

assignments {
        custom {
                stress
        }
}

I do not need to assign the custom profile to anything in config.kdl; simply having the above file present causes stress -c 1 to get set to a niceness of 0. Perhaps this was user error if people aren't intended to create profiles that aren't assigned in the configuration, but my impression was that if I didn't actually configure a profile for an assignment group, then it wouldn't do anything. (I had been commenting out the custom profile in config.kdl to try and disable it for testing, but had not been commenting out the entire assignments/custom.kdl file.) Should I consider this expected behavior, or is this an issue with the config handling?

mmstick commented 1 year ago

So we should ignore any profiles that have no properties defined? By default, a profile that doesn't assign a nice will get a nice setting of 0.

Profiles can be created in assignment configs to supplement the assignments defined in the main config. They'll inherit the properties of a profile of the same name if it was defined in the main config. Otherwise the profile is unique to that config.

jacobgkau commented 1 year ago

Profiles can be created in assignment configs to supplement the assignments defined in the main config. They'll inherit the properties of a profile of the same name if it was defined in the main config. Otherwise the profile is unique to that config.

Maybe I'm not understanding the purpose of assignment configs. My impression based on the file/directory names was:

You're saying that profiles in the assignments directory (in files containing a top-level assignments block) are equivalent to ones in the process-scheduler/assignments block of config.kdl, they just get merged/inherited when their names match? Is it possible to define other properties like niceness in the assignments directory? What's the purpose of having most assignments defined separately from the main config file by default?

jacobgkau commented 1 year ago

So we should ignore any profiles that have no properties defined? By default, a profile that doesn't assign a nice will get a nice setting of 0.

If I was attempting to make a process get a niceness of zero, my first instinct would be to create an assignment/profile and set it to 0 explicitly, not to create a profile and not set the niceness. I think this behavior should at least be documented, if we don't change it (I don't see it in the readme right now, and it's not defined in any default config or default config comments.)

mmstick commented 1 year ago

That is correct. The same logic is used to parse the assignments and exceptions elements in both the assignment configs and the main config. You can apply niceness properties not only to profiles in assignments configs, but also individual rules in the profiles.

The difference for profile assignments in the main config is that the pipewire, foreground, and background profiles have special meaning to the scheduler; and profiles defined in the main config are memorized so that assignment configs can inherit their values when they use those names.

The assignments config files are a secondary source to extend the main config for cases where you may not want to modify the main config. This way we could have distribution and desktop-specific assignment config files. Providing a default config specific to Pop!_OS, or a config specific to cosmic-de/cosmic-gnome.

I'm open to ideas for improving the config syntax and structure though if you have any.

jacobgkau commented 1 year ago

Regarding the config structure:

Maybe both of these could be improved by keeping the current config.kdl structure but calling the additional directory something that corresponds with process-scheduler, like process-scheduler.d or processes.d. That might make it more intuitive that the top-level assignments {} and exceptions {} blocks in the standalone files are in fact part of the process-scheduler {} configuration block in the main config file (or at least correspond with that level of configuration.)

Thoughts/counterpoints?

mmstick commented 1 year ago

The exceptions directory no longer applies, but it's a good point that assignments isn't the right name for placing these config files. I'm fine with changing it to process-scheduler.

jacobgkau commented 1 year ago

It seems like 54978313367222e810acd25dc13dda3c5e2a9743 is only partially working-- it does allow a custom group that doesn't have a niceness defined to get the foreground niceness, but it doesn't allow that group to get the background niceness when it's not focused. I don't consider this blocking now that I know it's a configuration error, but wanted to point it out.

I also noticed if I define a niceness inline under process-scheduler/ and also in config.kdl, the one in process-scheduler/ takes precedence. I suppose this is expected behavior if the purpose of process-scheduler/ is to allow extending config.kdl.

Regarding PipeWire integration, I noticed that when I start playing audio in Firefox, firefox-bin gets the pipewire group niceness, but Isolated Web Content still gets background/foreground niceness. I was able to recreate audio stuttering on a 4-core/8-thread machine by having Firefox play a YouTube video in one workspace and then focusing a stress -c 8 terminal window in another workspace. However, I was still able to recreate some stutter after manually using renice -6 on Isolated Web Content (it may have been better, but only slightly.)

I tried to test this in Bitwig Studio and found that neither BitwigStudio, bitwig-studio, or BitwigAudioEngine were detected as part of the pipewire group even when I started playing audio (BigwigAudioEngine only exists while playing audio, so it's the one I expected to be detected.) I tested with Bitwig set to its PipeWire and PulseAudio options with no difference in behavior.

The feature does work as expected with Resolve, Totem, and MPV (all with a single process). Does this sound like PipeWire integration is working as expected, or do you want to look into inheritance and/or detection any more? (I'm not sure what the limitations are on the PipeWire side for reporting those processes.)

mmstick commented 1 year ago

Pipewire gives only the PID of the process that's directly connected to it, which is typically the same process that's decoding and playing the audio.

It sounds like Firefox is connected to Pipewire from its main process, but the actual decoding is happening in a background process. It should be resolvable with inheritance.

I'm not sure about BitWig at the moment. I'll have to try that out.

mmstick commented 1 year ago

I'm fine with releasing as is. It's a substantial improvement and these remaining flaws are easily fixed in a backwards-compatible way.

13r0ck commented 1 year ago

Can we please get #![deny(missing_docs)] added to the crates here?

mmstick commented 1 year ago

You mean to execsnoop?

13r0ck commented 1 year ago

The daemon would be nice too, but as that isnt a library I don't think it is as necessary

mmstick commented 1 year ago

execsnoop would be the only component that's not directly tied to the daemon

13r0ck commented 1 year ago

Yep for sure, I think documentation would be nice for if someone else (internal or community) wants to make a PR.