wwmm / fastgame

Optimize system performance for games
GNU General Public License v3.0

IOT crash (Arch Linux) #5

Closed gardotd426 closed 1 year ago

gardotd426 commented 1 year ago

This occurs with both a system installation via pacman and a manual build I compiled myself.

Terminal output:

fastgame

(fastgame:2040741): Gtk-WARNING **: 21:14:33.419: Theme parser error: gtk-dark.css:5688:3-9: No property named "height"

(fastgame:2040741): Gtk-WARNING **: 21:14:33.432: Theme directory actions/16-Dark of theme ZafiroCircle has no size field

(fastgame:2040741): Adwaita-WARNING **: 21:14:34.013: Using GtkSettings:gtk-application-prefer-dark-theme with libadwaita is unsupported. Please use AdwStyleManager:color-scheme instead.
/usr/include/c++/13.1.1/bits/stl_vector.h:1125: constexpr std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](size_type) [with _Tp = std::__cxx11::basic_string<char>; _Alloc = std::allocator<std::__cxx11::basic_string<char> >; reference = std::__cxx11::basic_string<char>&; size_type = long unsigned int]: Assertion '__n < this->size()' failed.
[1]    2040741 IOT instruction (core dumped)  fastgame

And yes, I have indeed tried running sudo glib-compile-schemas /usr/share/glib-2.0/schemas like mentioned in #4.

wwmm commented 1 year ago

IOT instruction (core dumped)

It is the first time I have seen a program crash with such a message. Try running it in debug mode so we can see a little more information: G_MESSAGES_DEBUG=fastgame fastgame.

wwmm commented 1 year ago

As you are able to do a manual build try to see if there is anything interesting in gdb's output. Or maybe just executing sudo coredumpctl info after the crash.

wwmm commented 1 year ago

Are you using zsh? https://stackoverflow.com/questions/75842144/can-anyone-explain-what-iot-instruction-core-dumped-refers-to

gardotd426 commented 1 year ago

I am using zsh, but it's not the cause of the crash.

Using bash, the same thing happens, and I get this:

❯ src/fastgame

(fastgame:2075092): Gtk-WARNING **: 23:03:29.126: Theme parser error: gtk-dark.css:5688:3-9: No property named "height"

(fastgame:2075092): Gtk-WARNING **: 23:03:29.151: Theme directory actions/16-Dark of theme ZafiroCircle has no size field

(fastgame:2075092): Adwaita-WARNING **: 23:03:29.804: Using GtkSettings:gtk-application-prefer-dark-theme with libadwaita is unsupported. Please use AdwStyleManager:color-scheme instead.
Segmentation fault (core dumped)

I'll see about getting you a gdb trace. I thought it might be an issue with GCC 13, since I'm on Arch, but I also tried compiling with clang and GCC 12. They all build, and they all dump core when I try to run the program.

gardotd426 commented 1 year ago

It is the first time I have seen a program crash with such a message. Try running it in debug mode so we can see a little more information: G_MESSAGES_DEBUG=fastgame fastgame.

I might have found the issue. I'm using the liquorix kernel, and G_MESSAGES_DEBUG=fastgame src/fastgame gives me:

(process:2075713): fastgame-DEBUG: 23:05:44.596:    fastgame.cpp:17 fastgame version: 0.2.0
(process:2075713): fastgame-DEBUG: 23:05:44.597:    fastgame.cpp:27 main: locale directory: /usr/local/share/locale

(fastgame:2075713): Gtk-WARNING **: 23:05:44.616: Theme parser error: gtk-dark.css:5688:3-9: No property named "height"

(fastgame:2075713): Gtk-WARNING **: 23:05:44.629: Theme directory actions/16-Dark of theme ZafiroCircle has no size field

(fastgame:2075713): Adwaita-WARNING **: 23:05:45.207: Using GtkSettings:gtk-application-prefer-dark-theme with libadwaita is unsupported. Please use AdwStyleManager:color-scheme instead.
(fastgame:2075713): fastgame-DEBUG: 23:05:45.236:   application_ui.cpp:343  application_ui: Icon Theme ZafiroCircle detected
(fastgame:2075713): fastgame-DEBUG: 23:05:45.237:   presets_menu.cpp:47 presets_menu: user presets directory already exists: /home/matt/.config/fastgame
(fastgame:2075713): fastgame-DEBUG: 23:05:45.241:   util.cpp:39 the file /proc/sys/kernel/sched_child_runs_first does not exist!
Segmentation fault (core dumped)

I'm guessing fastgame is looking for something only CFS does or something? I thought lqx used CFS but I'm not 100% sure. Either way I have a linux 6.3 CFS kernel build I can reboot into, so I'll see if the issue persists.

wwmm commented 1 year ago

I'm guessing fastgame is looking for something only CFS does or something?

I did not consider that this parameter exists only when CFS is used. I have to add some safety checks when reading things like this.
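A minimal sketch of the kind of safety check mentioned above, not the code that went into fastgame. It assumes the crash came from indexing the result of a read that returned nothing; `read_first_line` is a hypothetical helper name.

```cpp
#include <filesystem>
#include <fstream>
#include <optional>
#include <string>
#include <system_error>

// Hypothetical helper: returns the first line of a sysfs/procfs file,
// or std::nullopt when the file is missing, unreadable, or empty.
// The caller must test the result instead of indexing into it blindly.
std::optional<std::string> read_first_line(const std::filesystem::path& path) {
  std::error_code ec;
  if (!std::filesystem::exists(path, ec)) {
    return std::nullopt;  // e.g. sched_child_runs_first on a non-CFS kernel
  }
  std::ifstream file(path);
  std::string line;
  if (!file.is_open() || !std::getline(file, line)) {
    return std::nullopt;
  }
  return line;
}
```

A caller would then write something like `if (auto v = read_first_line(p)) { ... }` and skip the setting entirely when the kernel does not expose it.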

wwmm commented 1 year ago

I have updated the master branch. Let's see if it still crashes while trying to read sched_child_runs_first

gardotd426 commented 1 year ago

So that fixed the sched_child_runs_first issue, but it seems like you've designed this to only work with AMD GPUs:

filesystem error: directory iterator cannot open directory: No such file or directory [/sys/class/drm/card0/device/hwmon/]

Since I used to run AMD GPUs, I know that they do have that hwmon file/interface, but Nvidia GPUs don't have hwmon in /sys/class/drm/card0/device/.

>_ ls /sys/class/drm/card0/device
aer_dev_correctable   consistent_dma_mask_bits   driver_override  iommu           modalias     reset_method      resource3_resize  uevent
aer_dev_fatal         consumer:pci:0000:0f:00.1  drm              iommu_group     msi_bus      resource          resource3_wc      vendor
aer_dev_nonfatal      current_link_speed         enable           irq             msi_irqs     resource0         resource5
ari_enabled           current_link_width         i2c-0            link            power        resource0_resize  revision
boot_vga              d3cold_allowed             i2c-1            local_cpulist   power_state  resource1         rom
broken_parity_status  device                     i2c-2            local_cpus      remove       resource1_resize  subsystem
class                 dma_mask_bits              i2c-3            max_link_speed  rescan       resource1_wc      subsystem_device
config                driver                     i2c-4            max_link_width  reset        resource3         subsystem_vendor

I'm assuming this is checked in order to modify power limits/performance levels? Nvidia does that through nvidia-smi.

wwmm commented 1 year ago

I'm assuming this is checked in order to modify power limits/performance levels?

Yes. I will fix this now.

wwmm commented 1 year ago

Yes. I will fix this now.

Wait. Is fastgame crashing or just showing the warning? As I do not have an nvidia card anymore, adding support for nvidia-smi will be complicated. So if the fix is adding support for nvidia, I am afraid I won't be able to fix the problem now as I originally intended :smile:

wwmm commented 1 year ago

but it seems like you've designed this to only work with AMD GPUs:

Ideally there should be a tab for nvidia just like the one for amd. As long as the amd code is just printing warnings it is not a big deal. But as I do not use nvidia anymore I did not start the code for the nvidia tab.

gardotd426 commented 1 year ago

Wait. Is fastgame crashing or just showing the warning? As I do not have an nvidia card anymore, adding support for nvidia-smi will be complicated. So if the fix is adding support for nvidia, I am afraid I won't be able to fix the problem now as I originally intended

Nothing is launching, it immediately crashes:

./fastgame

(fastgame:2454305): Gtk-WARNING **: 10:19:31.007: Theme parser error: gtk.css:5688:3-9: No property named "height"

(fastgame:2454305): Gtk-WARNING **: 10:19:31.037: Unable to acquire session bus: Error spawning command line “dbus-launch --autolaunch=665d77070e054d8e8b691246099e70f6 --binary-syntax --close-stderr”: Child process exited with code 1
filesystem error: directory iterator cannot open directory: No such file or directory [/sys/class/drm/card0/device/hwmon/]

If you can't add an Nvidia tab, could you not do an if/then where you check for an AMD GPU and if there isn't one present, you just skip it and provide the CPU/renicing/whatever other non-GPU enhancements fastgame provides?

wwmm commented 1 year ago

Is fastgame crashing or just showing the warning?

It probably is crashing. And I think I know where. I will make some changes to the code.

wwmm commented 1 year ago

Nothing is launching, it immediately crashes:

Ok.

If you can't add an Nvidia tab, could you not do an if/then where you check for an AMD GPU and if there isn't one present, you just skip it and provide the CPU/renicing/whatever other non-GPU enhancements fastgame provides?

It should be possible. I will try to do it.

wwmm commented 1 year ago

I have updated the master branch with a fix for the directory iteration crash. Now I have to see how to detect that the card is from AMD.
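For illustration, one common way to avoid the "directory iterator cannot open directory" exception is the non-throwing `std::error_code` overloads of `std::filesystem`. This is a sketch under that assumption, not the actual fix in master; `list_dir_entries` is a hypothetical name.

```cpp
#include <filesystem>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper: lists the entries of a directory without throwing.
// A missing directory (for example hwmon/ on a card that does not provide it)
// simply yields an empty list.
std::vector<fs::path> list_dir_entries(const fs::path& dir) {
  std::vector<fs::path> entries;
  std::error_code ec;
  fs::directory_iterator it(dir, ec);  // sets ec instead of throwing
  while (!ec && it != fs::directory_iterator{}) {
    entries.push_back(it->path());
    it.increment(ec);  // non-throwing advance
  }
  return entries;
}
```

With this shape, the caller can treat an empty result as "no hwmon interface here" and move on to the CPU, memory, and disk settings.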

wwmm commented 1 year ago

Now I have to see how to detect that the card is from AMD.

Actually I had already put a poor check in place to avoid showing the amd tab

https://github.com/wwmm/fastgame/blob/769602d1171c1cc40a54c2f1c6bee42ec940a542/src/application_ui.cpp#L506

But the current test for amd is probably not ideal https://github.com/wwmm/fastgame/blob/769602d1171c1cc40a54c2f1c6bee42ec940a542/src/util.cpp#L134. The only thing I could think of at the time was checking for the presence of the parameter power_dpm_force_performance_level. But I am not 100% sure only amd defines it.

gardotd426 commented 1 year ago

The only thing I could think of at the time was checking for the presence of the parameter power_dpm_force_performance_level. But I am not 100% sure only amd defines it.

I'm pretty sure it's an AMD-specific thing. The kernel documentation page only lists it under amdgpu, it doesn't list it under i915 (Intel). And I know Nvidia doesn't use sysfs for user-control of their GPUs so they don't use it.

gardotd426 commented 1 year ago

This wording from the kernel docs seems to suggest it's an AMD-specific API:

The amdgpu driver provides a sysfs API for adjusting certain power related parameters. The file power_dpm_force_performance_level is used for this.
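As an alternative to probing for power_dpm_force_performance_level, the `vendor` file that appears in the device listing earlier in this thread holds the PCI vendor ID. The sketch below is not what fastgame does; the helper names are hypothetical, and the IDs used are the PCI vendor IDs for AMD (0x1002) and NVIDIA (0x10de).

```cpp
#include <filesystem>
#include <fstream>
#include <stdexcept>
#include <string>

// PCI vendor IDs as they appear in sysfs "vendor" files.
constexpr unsigned long kVendorAmd = 0x1002;
constexpr unsigned long kVendorNvidia = 0x10de;

// Hypothetical helper: parses text such as "0x1002"; returns 0 on failure.
unsigned long parse_vendor_id(const std::string& text) {
  try {
    return std::stoul(text, nullptr, 16);  // base 16 accepts the 0x prefix
  } catch (const std::exception&) {
    return 0;
  }
}

// Hypothetical helper: true when <device_dir>/vendor reports AMD.
// device_dir would be something like /sys/class/drm/card0/device.
bool card_is_amd(const std::filesystem::path& device_dir) {
  std::ifstream file(device_dir / "vendor");
  std::string text;
  if (!std::getline(file, text)) {
    return false;  // file missing or unreadable
  }
  return parse_vendor_id(text) == kVendorAmd;
}
```

Both checks could also be combined: the vendor ID decides which tab to show, and the presence of the performance-level file decides whether that control is usable.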

gardotd426 commented 1 year ago

Success! And there's no GPU tab just like you hoped

[Screenshot: Screenshot_20230519_164920]

gardotd426 commented 1 year ago

All the CPU, Memory and Disk options seem to be available. I don't know what the hell many of them mean since I've never gotten to use fastgame before, but since they're all just different sysfs and other APIs or other hardware standards (like PCIe ASPM) I'm sure I can figure out most of them with the help of google.

gardotd426 commented 1 year ago

You want me to close this or should we wait until you are comfortable with your workaround?

wwmm commented 1 year ago

but since they're all just different sysfs and other APIs or other hardware standards (like PCIe ASPM) I'm sure I can figure out most of them with the help of google.

Yes. Many of them are described in the kernel docs, for example the transparent hugepages that Arch Linux nowadays enables by default. The hardest part is actually figuring out if the game is benefiting from them :smile:. Some options, like the cpu governor and the pcie aspm policy, have noticeable effects. But others, like transparent huge pages, are definitely very situational. On Windows they are called large pages, and in the past I have seen games like Shadow of War that have options directly related to large memory pages. So I imagine it may be useful for more games. But it is hard to notice in practice whether they are benefiting or not.

A pleasant surprise was the disk readahead parameter. I assumed that I did not need to tune it because my games are installed in a fast nvme. And usually you think about readahead in HDD. But it turns out that my nvme is benefiting more from this setting than my HDD and the games load A LOT faster if I set it to something like 16 MB or 32 MB. If you find out any other system setting that may be worth investigating and that it is not in fastgame yet let me know :-).
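For readers who want to try the readahead setting by hand: the kernel exposes it per block device, in KiB, at /sys/block/<device>/queue/read_ahead_kb. The sketch below assumes that interface and treats "16 MB" as 16 MiB. The helper names are hypothetical, and this is not necessarily how fastgame applies the value.

```cpp
#include <filesystem>
#include <fstream>
#include <string>

// Builds the sysfs path of the readahead setting for a block device
// such as "nvme0n1" or "sda".
std::filesystem::path readahead_path(const std::string& device) {
  return std::filesystem::path("/sys/block") / device / "queue" / "read_ahead_kb";
}

// The sysfs file expects KiB, so 16 MiB becomes 16384.
constexpr unsigned long mib_to_kib(unsigned long mib) { return mib * 1024; }

// Hypothetical helper: writes the value and reports whether the write succeeded.
// Writing to the real sysfs file requires root.
bool write_readahead_kib(const std::filesystem::path& file, unsigned long kib) {
  std::ofstream out(file);
  if (!out.is_open()) {
    return false;
  }
  out << kib;
  out.flush();
  return out.good();
}
```

As root, `write_readahead_kib(readahead_path("nvme0n1"), mib_to_kib(16))` would request 16 MiB of readahead for that device; the device name is only an example.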

You want me to close this or should we wait until you are comfortable with your workaround?

We can close this.

gardotd426 commented 1 year ago

A pleasant surprise was the disk readahead parameter. I assumed that I did not need to tune it because my games are installed in a fast nvme. And usually you think about readahead in HDD. But it turns out that my nvme is benefiting more from this setting than my HDD and the games load A LOT faster if I set it to something like 16 MB or 32 MB.

That's interesting. I also have all-flash storage: 1x 2TB PCIe 4.0 NVME (WD Black SN850X), 2x 1TB PCIe 3.0 NVME (TEAMGroup MP34s), 1 1TB SATA Samsung 860 QVO and 1 1TB SATA Samsung 870 QVO. Most of my games are on the NVME drives, do you adjust the 16MB/32MB based on the speed of the NVME (e.g. PCIe 3.0 vs 4.0) or do you kind of just see what performs best by trial and error?

If you find out any other system setting that may be worth investigating and that it is not in fastgame yet let me know :-).

I definitely will. I was actually surprised to find that most of the options I saw (like the PCIe ASPM option) were even available in user-space. I build custom linux-tkg kernels that I try to optimize as best I can, but I'll have to read some of the kernel docs and maybe talk to TKG to see what's editable in-OS vs what needs to be configured at compile-time, or what can be configured per-process vs what is global.

I'm definitely going to start out with the PCIe ASPM stuff and the disk readahead options, since I have fast storage and a 3090. It'd be cool if I could close the gap with Windows on Wine/Proton games.

wwmm commented 1 year ago

That's interesting. I also have all-flash storage: 1x 2TB PCIe 4.0 NVME (WD Black SN850X), 2x 1TB PCIe 3.0 NVME (TEAMGroup MP34s), 1 1TB SATA Samsung 860 QVO and 1 1TB SATA Samsung 870 QVO.

Here I have a kingston kc3000. I have also applied to the kernel nvme module some options that unfortunately can only be done when the module is loaded

options nvme poll_queues=16 write_queues=16

do you adjust the 16MB/32MB based on the speed of the NVME (e.g. PCIe 3.0 vs 4.0) or do you kind of just see what performs best by trial and error?

It has been by trial and error. I have tested larger values like 128 MB and 256 MB, but it feels like the whole thing saturates around 16 MB or 32 MB. The nvme I mentioned above is the only one I have available for games, so I do not know if the setting behaves differently on PCIe 3.0. So far it seems to depend more on the game's access pattern. Horizon Zero Dawn and Wild Hearts seem to benefit more from the readahead than Hogwarts Legacy, for example.

I definitely will, I was actually surprised to find that most of the options I saw (like the PCIe ASPM option) were even available in user-space.

I decided to add the pcie aspm policy after upgrading to a RX 7900 XT and a pcie 5 motherboard. For some reason in my new setup the aspm module seemed to be trying to save power while the GPU was running games.

gardotd426 commented 1 year ago

Here I have a kingston kc3000. I have also applied to the kernel nvme module some options that unfortunately can only be done when the module is loaded

I'm a bit confused about this part, what do you mean that it "unfortunately can only be done when the module is loaded?" I mean, if you're using /etc/modprobe.d/nvme.conf to set options nvme poll_queues=16 write_queues=16, then it obviously gets applied on boot, so I'm not sure what this is referring to.

Is there anything I could do to help you implement nvidia stuff? Gamemode is broken with Proton 8.X (it's a known issue apparently), so it'd be cool to have fastgame fill in the gaps that gamemode leaves. Obviously I know how to compile software, I'm happy to test any ideas you might have, or if there's something non-coding related (still trying to learn all that) just let me know.

wwmm commented 1 year ago

I'm a bit confused about this part, what do you mean that it "unfortunately can only be done when the module is loaded?"

If you take a look at the files inside /sys/module/nvme/parameters/ you will see the module parameters. In the case of the module pcie_aspm you can write to them at runtime. But for the vast majority of modules you can only set the parameters when the system boots. This is annoying because we can't create profiles for different games and switch between them without rebooting.

Is there anything I could do to help you implement nvidia stuff?

I think that some copy and paste of the code already in place for amdgpu can be done. By this I mean the parts that create the gtk widgets, write to the json file, etc. If you are comfortable with c++ you can try to do this. If not, point me to the libraries or whatever has to be used for nvidia nowadays so that I can try to put something in place.

wwmm commented 1 year ago

Now I remember that when I started this project I used nvidia. When I moved to amd I was not sure if I could keep developing the code I had written for nvidia so I dropped it. Fortunately github still has the releases with it https://github.com/wwmm/fastgame/releases/tag/v0.0.9. I think that something you may try is testing if it still compiles with the recent nvidia libraries. After that it would be a matter of using the current boilerplate we have for amdgpu and put the old nvidia code in it.

gardotd426 commented 1 year ago

Ugh. Damn, I was hoping this would be rather easy. So, I checked out v0.0.9, unfortunately it fails to build but not because of anything to do with Nvidia. ninja fails at the very last object, 26/27 Compiling C++ object src/fastgame_apply.p/netlink.cpp.o. It's a bunch of netlink.cpp/netlink.hpp errors.

I was hoping that you hadn't completely reworked fastgame since then, because then I could either a) revert patches from current master back in time til the nvidia stuff existed or b) checkout v0.0.9 and patch it with later commits to get it to compile, but sure enough there's no actual way to do either because of how many ground-up fundamental reworks have been done.

BUT, I would say that considering I got zero errors during compilation related to Nvidia, that would mean it does compile. And I took a brief look at a couple of the nvidia src files and it seems like nothing you put in there has changed, as far as power mizer modes and the like. I think you can just copy and paste.

wwmm commented 1 year ago

It's a bunch of netlink.cpp/netlink.hpp errors.

Oh... Yes. I had to rework this part because of changes in gcc if I am not mistaken. So unless you try on an older compiler do not even bother trying to make it compile. I totally forgot about this...

BUT, I would say that considering I got zero errors during compilation related to Nvidia, that would mean it does compile. And I took a brief look at a couple of the nvidia src files and it seems like nothing you put in there has changed, as far as power mizer modes and the like. I think you can just copy and paste.

Ok. Good to know. In the next days I will try to find some time to bring back that code.

wwmm commented 1 year ago

In the next days I will try to find some time to bring back that code.

Done. Now it is just a matter of creating a graphical interface for nvidia.

gardotd426 commented 1 year ago

Oh... Yes. I had to rework this part because of changes in gcc if I am not mistaken. So unless you try on an older compiler do not even bother trying to make it compile. I totally forgot about this...

Yeah, I actually had already tried compiling it with my backup GCC installation before I even commented, but I don't have GCC 11 cause it's only available in the AUR (gcc12 is available in extra), so I only tried 13 and 12.

Let me know when you get the GUI for NV ready and I'll test it.

wwmm commented 1 year ago

Let me know when you get the GUI for NV ready and I'll test it.

Although I did not add the power limit control yet, the powermize mode and the clock offset control are in place. Now we have to find out if they work :smile:. I can't test them on my computers.

gardotd426 commented 1 year ago

Damn, the powermizer mode doesn't work (haven't tested the clock offset yet).

I went to the CPU tab to change the performance governor to ensure that fastgame was working at all, and sure enough when I launched alacritty (the executable I applied it to just to choose an easy one), it immediately reported that ondemand was my new governor, but nvidia-settings showed it was still "Prefer Maximum Performance" (I had selected both Auto and Adaptive on separate occasions). And yes, I did make sure to kill nvidia-settings and relaunch it each time.

Here is the terminal output for the terminal window I'm running fastgame from when I actually click "Apply":

netlink: socket created
netlink: socket binding succesful!
netlink: sent PROC_CN_MCAST_LISTEN to kernel
fastgame_apply: (125939, alacritty, /usr/bin/alacritty, alacritty)
fastgame_apply: (126782, alacritty, /usr/bin/alacritty, /usr/bin/alacritty)
fastgame_apply: /usr/include/boost/signals2/detail/lwm_pthreads.hpp:60: void boost::signals2::mutex::lock(): Assertion `pthread_mutex_lock(&m_) == 0' failed.

I'll try the clock offset real quick to see if that works. ....Nope, doesn't work. Seems the Nvidia stuff is the only stuff to not apply.

wwmm commented 1 year ago

fastgame_apply: /usr/include/boost/signals2/detail/lwm_pthreads.hpp:60: void boost::signals2::mutex::lock(): Assertion `pthread_mutex_lock(&m_) == 0' failed.

I have never seen this happening. It has been so many years since I used an nvidia gpu I do not remember under what conditions I used this code. But it used to work. The fastgame_apply binary runs as root. Could it be that the nvidia stuff has to be executed as the same user that is using the graphical desktop session? I do not remember... But the root user getting a different display may be the cause.

If trying to apply as root is the problem moving the nvidia calls to the fastgame_launcher may be an easy solution.

gardotd426 commented 1 year ago

Could it be that the nvidia stuff has to be executed as the same user that is using the graphical desktop session? I do not remember... But the root user getting a different display may be the cause.

No, I can open nvidia-settings as root and it works fine. You can have two separate $HOME/.nvidia-settings-rc config files that load your settings in /root/ and in ~/. I have different settings in each, and there are never any errors, and whether I open nvidia-settings as root or as my regular user, it doesn't matter.

gardotd426 commented 1 year ago

Are you familiar with Python at all? GreenWithEnvy is written in Python, you could look at how it sets power limits and clock offsets and see if you can translate it to C++ (though it doesn't offer powermizer mode changing).

https://gitlab.com/leinardi/gwe

Then there's nvclock, it's older but still uses nvidia-smi or some other similar mechanism, it's just CLI-only and it's written in C, which I'm assuming is much closer to C++.

https://github.com/JungleCatSW/nvclock (this is a fork of the original but the original doesn't have a repo here that I could find easily, but it's the same project).

wwmm commented 1 year ago

I'm familiar with Python, but after a quick look at gwe it is probably more straightforward to look at nvclock. The thing is, both of them seem to do something similar at the start, which is to use XOpenDisplay to open a display https://github.com/JungleCatSW/nvclock/blob/e4495ae73a98ee434bd840cb999889f05acd5fc1/src/nvclock.c#L792. That is something that is bothering me in your log output. There should be log lines about the success or failure of this operation in fastgame https://github.com/wwmm/fastgame/blob/70b2bfdf59a5d5ba321c5049218f912d9d637d34/src/nvidia/nvidia.cpp#L6. It is almost like that part of the code wasn't executed. Weird... If you look at the output of sudo journalctl -b | grep -i fastgame, is there any message about our attempt to open the display?

wwmm commented 1 year ago

Oh... Those lines will be printed only if fastgame is executed in debug mode G_MESSAGES_DEBUG=fastgame fastgame. It is probably better to turn the ones that are related to failures into warnings so they will be printed no matter what.

wwmm commented 1 year ago

I have updated the master branch. Some of the nvidia debug messages are now warnings. It should be easier to see now where things are failing.

gardotd426 commented 1 year ago

I'm giving it a go right now, but I'm confused about why you chose to use Hz instead of MHz for the clock offsets? Is it a typo or should I be typing 50000000 just to get a 50MHz clock offset or what, since that's how many Hz it would be.

wwmm commented 1 year ago

Is it a typo

Oops. Yes it is a typo.

gardotd426 commented 1 year ago

Lmao thank god it doesn't work then.

But yeah, nothing even really happened, and I'm confirming via nvidia-settings -q all | grep PowerMizer that the powermizer mode is not being changed. Nor is the clock offset.

Here's the output, with G_MESSAGES_DEBUG=fastgame:

(process:1920916): fastgame-DEBUG: 13:05:06.440:    fastgame.cpp:17 fastgame version: 0.2.0
(process:1920916): fastgame-DEBUG: 13:05:06.441:    fastgame.cpp:27 main: locale directory: /usr/share/locale

(fastgame:1920916): Gtk-WARNING **: 13:05:06.464: Theme parser error: gtk-dark.css:5688:3-9: No property named "height"

(fastgame:1920916): Gtk-WARNING **: 13:05:06.476: Theme directory actions/16-Dark of theme ZafiroCircle has no size field

(fastgame:1920916): Adwaita-WARNING **: 13:05:07.058: Using GtkSettings:gtk-application-prefer-dark-theme with libadwaita is unsupported. Please use AdwStyleManager:color-scheme instead.
(fastgame:1920916): fastgame-DEBUG: 13:05:07.087:   application_ui.cpp:401  application_ui: Icon Theme ZafiroCircle detected
(fastgame:1920916): fastgame-DEBUG: 13:05:07.089:   presets_menu.cpp:47 presets_menu: user presets directory already exists: /home/matt/.config/fastgame
(fastgame:1920916): fastgame-DEBUG: 13:05:07.105:   cpu.cpp:289 cpu: number of cores: 24
(fastgame:1920916): fastgame-DEBUG: 13:05:07.105:   cpu.cpp:186 cpu: The current pcie_aspm policy is: default
(fastgame:1920916): fastgame-DEBUG: 13:05:07.120:   memory.cpp:166  memory: transparent huge pages state: always
(fastgame:1920916): fastgame-DEBUG: 13:05:07.120:   memory.cpp:200  memory: transparent huge pages defrag: defer+madvise
(fastgame:1920916): fastgame-DEBUG: 13:05:07.120:   memory.cpp:234  memory: transparent huge pages state: never
(fastgame:1920916): fastgame-DEBUG: 13:05:07.139:   amdgpu.cpp:315  amdgpu: number of amdgpu cards: 0
(fastgame:1920916): fastgame-DEBUG: 13:05:07.150:   nvidia.cpp:19   Using NV-CONTROL extension 1.29 on display :0
(fastgame:1920916): fastgame-DEBUG: 13:05:07.161:   nvidia.cpp:60   [min, max] values for NV_CTRL_GPU_NVCLOCK_OFFSET: [-1000, 1000]
(fastgame:1920916): fastgame-DEBUG: 13:05:07.161:   nvidia.cpp:73   [min, max] values for NV_CTRL_GPU_MEM_TRANSFER_RATE_OFFSET: [-2000, 6000]

(fastgame:1920916): Gtk-WARNING **: 13:05:07.195: Theme directory actions/16-Dark of theme ZafiroCircle has no size field

(fastgame:1920916): fastgame-DEBUG: 13:05:47.931:   application_ui.cpp:348  application_ui: removed the file: /tmp/fastgame.json
(fastgame:1920916): fastgame-DEBUG: 13:05:51.177:   application_ui.cpp:169  application_ui: saved preset: /tmp/fastgame.json
netlink: socket created
netlink: socket binding succesful!
netlink: sent PROC_CN_MCAST_LISTEN to kernel

That's the end of the output, even after running fastgame_launcher alacritty (alacritty is the executable I chose in fastgame).

Here's the nvidia part of my fastgame.json (and also, I put in a goofy envar in the Environment Variables section to confirm that the other settings are applying, they are):

    "nvidia": {
        "powermize-mode": "0",
        "clock-offset": {
            "gpu": "60",
            "memory": "0"
        }

And yet, my powermizer mode (queried from the very alacritty window I launched with fastgame_launcher) is still 1, not zero, and my offset is 45, which is my normal offset, not 60, like I set it in fastgame.

wwmm commented 1 year ago

nvidia.cpp:19 Using NV-CONTROL extension 1.29 on display :0
nvidia.cpp:60 [min, max] values for NV_CTRL_GPU_NVCLOCK_OFFSET: [-1000, 1000]
nvidia.cpp:73 [min, max] values for NV_CTRL_GPU_MEM_TRANSFER_RATE_OFFSET: [-2000, 6000]

So we are able to initialize the nv-control extension and query the range of the overclock offsets but we can't change the settings... That is odd...

wwmm commented 1 year ago

I have found a very dumb mistake in the file that is supposed to apply the setting. I forgot to call the constructor of the smart pointer that wraps the nvidia class :smile:. Master branch updated again. Maybe now it will work.
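To illustrate the class of mistake described above: a smart pointer that is declared but never given an object holds nullptr, so any call through it cannot work. The sketch assumes `std::unique_ptr`; the thread only says "smart pointer", and `Nvidia` and `apply_powermize` here are hypothetical stand-ins, not fastgame's real types.

```cpp
#include <memory>

// Hypothetical stand-in for the class that wraps the NV-CONTROL calls.
struct Nvidia {
  int powermize_mode = -1;
  void set_powermize_mode(int mode) { powermize_mode = mode; }
};

// Hypothetical helper: applies the mode only if the wrapper object exists.
bool apply_powermize(const std::unique_ptr<Nvidia>& nv, int mode) {
  if (nv == nullptr) {
    return false;  // declared but never constructed: there is nothing to call
  }
  nv->set_powermize_mode(mode);
  return true;
}
```

Declaring `std::unique_ptr<Nvidia> nv;` gives the null case, while `auto nv = std::make_unique<Nvidia>();` actually constructs the wrapped object.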

gardotd426 commented 1 year ago

fastgame_apply: /usr/include/boost/signals2/detail/lwm_pthreads.hpp:60: void boost::signals2::mutex::lock(): Assertion `pthread_mutex_lock(&m_) == 0' failed.

Here's the full debug output:

(process:2048508): fastgame-DEBUG: 13:44:40.271:    fastgame.cpp:17 fastgame version: 0.2.0
(process:2048508): fastgame-DEBUG: 13:44:40.272:    fastgame.cpp:27 main: locale directory: /usr/share/locale

(fastgame:2048508): Gtk-WARNING **: 13:44:40.299: Theme parser error: gtk-dark.css:5688:3-9: No property named "height"

(fastgame:2048508): Gtk-WARNING **: 13:44:40.311: Theme directory actions/16-Dark of theme ZafiroCircle has no size field

(fastgame:2048508): Adwaita-WARNING **: 13:44:40.897: Using GtkSettings:gtk-application-prefer-dark-theme with libadwaita is unsupported. Please use AdwStyleManager:color-scheme instead.
(fastgame:2048508): fastgame-DEBUG: 13:44:40.925:   application_ui.cpp:401  application_ui: Icon Theme ZafiroCircle detected
(fastgame:2048508): fastgame-DEBUG: 13:44:40.926:   presets_menu.cpp:47 presets_menu: user presets directory already exists: /home/matt/.config/fastgame
(fastgame:2048508): fastgame-DEBUG: 13:44:40.944:   cpu.cpp:289 cpu: number of cores: 24
(fastgame:2048508): fastgame-DEBUG: 13:44:40.945:   cpu.cpp:186 cpu: The current pcie_aspm policy is: default
(fastgame:2048508): fastgame-DEBUG: 13:44:40.958:   memory.cpp:166  memory: transparent huge pages state: always
(fastgame:2048508): fastgame-DEBUG: 13:44:40.958:   memory.cpp:200  memory: transparent huge pages defrag: defer+madvise
(fastgame:2048508): fastgame-DEBUG: 13:44:40.958:   memory.cpp:234  memory: transparent huge pages state: never
(fastgame:2048508): fastgame-DEBUG: 13:44:40.982:   amdgpu.cpp:315  amdgpu: number of amdgpu cards: 0
(fastgame:2048508): fastgame-DEBUG: 13:44:40.987:   nvidia.cpp:19   Using NV-CONTROL extension 1.29 on display :0
(fastgame:2048508): fastgame-DEBUG: 13:44:41.004:   nvidia.cpp:60   [min, max] values for NV_CTRL_GPU_NVCLOCK_OFFSET: [-1000, 1000]
(fastgame:2048508): fastgame-DEBUG: 13:44:41.004:   nvidia.cpp:73   [min, max] values for NV_CTRL_GPU_MEM_TRANSFER_RATE_OFFSET: [-2000, 6000]

(fastgame:2048508): Gtk-WARNING **: 13:44:41.050: Theme directory actions/16-Dark of theme ZafiroCircle has no size field

(fastgame:2048508): fastgame-DEBUG: 13:44:59.845:   application_ui.cpp:348  application_ui: removed the file: /tmp/fastgame.json
(fastgame:2048508): fastgame-DEBUG: 13:45:03.177:   application_ui.cpp:169  application_ui: saved preset: /tmp/fastgame.json
fastgame_apply: /usr/include/boost/signals2/detail/lwm_pthreads.hpp:60: void boost::signals2::mutex::lock(): Assertion `pthread_mutex_lock(&m_) == 0' failed.
(fastgame:2048508): fastgame-DEBUG: 13:45:14.178:   application_ui.cpp:169  application_ui: saved preset: /tmp/fastgame.json
netlink: socket created
netlink: socket binding succesful!
netlink: sent PROC_CN_MCAST_LISTEN to kernel
fastgame_apply: (2048787, alacritty, /usr/bin/alacritty, alacritty)

gardotd426 commented 1 year ago

Wait, that's a red herring, my bad.

That's an error from boost, which isn't even needed. I just uninstalled it so it wouldn't throw any more errors, still no change, but now no output.

(process:2051188): fastgame-DEBUG: 13:50:51.550:    fastgame.cpp:17 fastgame version: 0.2.0
(process:2051188): fastgame-DEBUG: 13:50:51.551:    fastgame.cpp:27 main: locale directory: /usr/share/locale

(fastgame:2051188): Gtk-WARNING **: 13:50:51.576: Theme parser error: gtk-dark.css:5688:3-9: No property named "height"

(fastgame:2051188): Gtk-WARNING **: 13:50:51.589: Theme directory actions/16-Dark of theme ZafiroCircle has no size field

(fastgame:2051188): Adwaita-WARNING **: 13:50:52.174: Using GtkSettings:gtk-application-prefer-dark-theme with libadwaita is unsupported. Please use AdwStyleManager:color-scheme instead.
(fastgame:2051188): fastgame-DEBUG: 13:50:52.203:   application_ui.cpp:401  application_ui: Icon Theme ZafiroCircle detected
(fastgame:2051188): fastgame-DEBUG: 13:50:52.208:   presets_menu.cpp:47 presets_menu: user presets directory already exists: /home/matt/.config/fastgame
(fastgame:2051188): fastgame-DEBUG: 13:50:52.226:   cpu.cpp:289 cpu: number of cores: 24
(fastgame:2051188): fastgame-DEBUG: 13:50:52.226:   cpu.cpp:186 cpu: The current pcie_aspm policy is: default
(fastgame:2051188): fastgame-DEBUG: 13:50:52.244:   memory.cpp:166  memory: transparent huge pages state: always
(fastgame:2051188): fastgame-DEBUG: 13:50:52.244:   memory.cpp:200  memory: transparent huge pages defrag: defer+madvise
(fastgame:2051188): fastgame-DEBUG: 13:50:52.244:   memory.cpp:234  memory: transparent huge pages state: never
(fastgame:2051188): fastgame-DEBUG: 13:50:52.265:   amdgpu.cpp:315  amdgpu: number of amdgpu cards: 0
(fastgame:2051188): fastgame-DEBUG: 13:50:52.295:   nvidia.cpp:19   Using NV-CONTROL extension 1.29 on display :0
(fastgame:2051188): fastgame-DEBUG: 13:50:52.295:   nvidia.cpp:60   [min, max] values for NV_CTRL_GPU_NVCLOCK_OFFSET: [-1000, 1000]
(fastgame:2051188): fastgame-DEBUG: 13:50:52.295:   nvidia.cpp:73   [min, max] values for NV_CTRL_GPU_MEM_TRANSFER_RATE_OFFSET: [-2000, 6000]

(fastgame:2051188): Gtk-WARNING **: 13:50:52.344: Theme directory actions/16-Dark of theme ZafiroCircle has no size field

(fastgame:2051188): fastgame-DEBUG: 13:51:11.111:   application_ui.cpp:348  application_ui: removed the file: /tmp/fastgame.json
(fastgame:2051188): fastgame-DEBUG: 13:51:14.178:   application_ui.cpp:169  application_ui: saved preset: /tmp/fastgame.json
netlink: socket created
netlink: socket binding succesful!
netlink: sent PROC_CN_MCAST_LISTEN to kernel
fastgame_apply: (2051374, alacritty, /usr/bin/alacritty, /usr/bin/alacritty)
fastgame_apply: (2052263, alacritty, /usr/bin/alacritty, alacritty)

wwmm commented 1 year ago

/usr/include/boost/signals2/detail/lwm_pthreads.hpp:60: void boost::signals2::mutex::lock(): Assertion `pthread_mutex_lock(&m_) == 0' failed.

What is bizarre in this error is that I do not even try to deal with Boost signals or mutex. I will change the code moving the function that tries to apply the setting to the section that does not run as root just to see what happens. Maybe the way that boost::process::child and pkexec interact does not play nice with these nvidia calls.

wwmm commented 1 year ago

Wait, that's a red herring, my bad.

No problems.

wwmm commented 1 year ago

I have found another mistake at https://github.com/wwmm/fastgame/blob/805863305f17118abece5ebe334625b8012bd3ee/src/fastgame_apply.cpp#L157 that at the very least should be causing a compilation error but it isn't. I am starting to suspect that something is wrong in the Meson files and this function isn't even being put in the executable.

wwmm commented 1 year ago

@gardotd426 I think that at least some error messages should be visible now. I forgot to include a header generated by Meson at configuration time and the compiler was completely ignoring the function that was supposed to apply the nvidia settings. Even if it does not work yet we should at least see something new in the logs.
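The effect described above can be reproduced in a few lines. The sketch assumes the nvidia code is guarded by a preprocessor macro that the generated header defines; `USE_NVIDIA` is a hypothetical macro name, not necessarily the one fastgame uses. When the header is not included, the macro is undefined and the guarded branch disappears without any compiler diagnostic, which would also hide mistakes inside it.

```cpp
#include <string>

// In a real build the macro would come from a configuration header generated
// by the build system. USE_NVIDIA is a hypothetical name for illustration.
// Without the #include of that header the macro is undefined, so the
// guarded branch is dropped by the preprocessor before the compiler sees it.
std::string apply_nvidia_settings() {
#ifdef USE_NVIDIA
  return "nvidia settings applied";
#else
  return "nvidia support compiled out";
#endif
}
```

Compiling the same file with `-DUSE_NVIDIA` switches the function to the other branch, which is a quick way to check whether a guard is active in a given build.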