BartoszCichecki / LenovoLegionToolkit

Lightweight Lenovo Vantage and Hotkeys replacement for Lenovo Legion laptops.
GNU General Public License v3.0

GPU is not waking up unless dGPU mode is selected #876

Closed elitonfilho closed 1 year ago

elitonfilho commented 1 year ago

Version

2.15.1

OS

Windows 11 22621.1848

Device

Legion 5 Pro 16IAH7H

BIOS version

J2CN49WW

What's wrong?

GPU is not waking up after changing the GPU Working mode. It is activated only when selecting dGPU mode and restarting. Already tried switching between Hybrid options and then restart, but without success.

When switching from Hybrid to Hybrid-iGPU, there's a yellow message bar saying 'dGPU is currently in use'.

When switching from Hybrid to Hybrid-auto, there's a yellow message bar saying 'dGPU is currently in use or laptop is not on battery power'.

Similar to #689 but with different logs.

How to reproduce the bug?

No necessary steps to reproduce it

What is the behavior that you expected?

dGPU should be active at least on Hybrid mode

Logs

log_2023_07_12_21_56_25.txt

Additional information

No response

BartoszCichecki commented 1 year ago

When switching from Hybrid to Hybrid-iGPU, there's a yellow message bar saying 'dGPU is currently in use'.

When switching from Hybrid to Hybrid-auto, there's a yellow message bar saying 'dGPU is currently in use or laptop is not on battery power'.

These messages indicate that the EC rejected the mode change. Why? That's a question for Lenovo. Maybe there is a screen connected, maybe you are switching too fast, or there are other reasons.

I cannot reproduce any of the issues and logs don't show anything out of order. I can without problem switch to iGPU and back to Hybrid. I can also see that the EC is not notifying that dGPU becomes available.

No necessary steps to reproduce it

I would say that steps are needed to reproduce it, because I can't make it happen.

BartoszCichecki commented 1 year ago

You can test these 3 builds, and see if there is any difference, but I doubt it.

LenovoLegionToolkitSetup_2.zip LenovoLegionToolkitSetup_3.zip LenovoLegionToolkitSetup_1.zip

ElementSpica commented 1 year ago

I had the same problem as you, and I managed to solve it.

The summary of what I did to solve it:

I uninstalled Lenovo Legion Toolkit, restarted, and reinstalled the latest version. Then I put a 3-second delay between each of the customized actions in the "Actions" tab, so it doesn't change everything too quickly.


Exact steps on how I solved the problem (some of which may not help or be necessary for you):

  1. Uninstalled Lenovo Legion Toolkit.
  2. Restarted, then ran the Hardware Scan in Lenovo Vantage (I customized it and checked all the selections) and checked for updates (there were none).
  3. Left Vantage's GPU working mode on Hybrid-Auto, avoiding Hybrid-iGPU in case its status carried over once I installed Toolkit and had it disable Vantage.
  4. After a PC restart, installed a fresh copy of the latest Toolkit version and re-created my old Action lists for both the plugged and unplugged AC adapter cases.
  5. Made sure not to put "Deactivate GPU" in the plugged AC adapter Action sequence, and put a 3-second "Delay" between each of the Actions.
  6. Set Toolkit to Hybrid-Auto along with my Power Plans. This time it worked properly when running a game and simulating both the connected and disconnected AC adapter scenarios; my dGPU auto-activates now.
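
The idea behind step 5 (a pause between consecutive actions so the hardware is not asked to change several things at once) can be sketched roughly as follows. This is only an illustration in Python, not how Toolkit runs its Actions; `run_actions` and its parameters are hypothetical names, and the 3-second default simply mirrors the delay described above.

```python
import time
from typing import Callable, Iterable


def run_actions(
    actions: Iterable[Callable[[], None]],
    delay_s: float = 3.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Run each action in order, pausing delay_s seconds between actions.

    No pause is added before the first action or after the last one.
    Returns the number of actions that were run.
    """
    count = 0
    for action in actions:
        if count > 0:
            sleep(delay_s)  # give the previous change time to settle
        action()
        count += 1
    return count
```

Passing `sleep` as a parameter keeps the helper easy to try out without actually waiting.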

My situation for your reference which may or may not be exactly the same as yours:

-Nvidia video card driver version: 536.40

-Laptop model: Legion Pro 5i 16IRX8

-Toolkit and the Nvidia dGPU switched automatically when running a game specified to use it in Windows Graphics Settings. It was working properly until a day ago (about the same time as your report here).

-Toolkit GPU-Working Mode: Hybrid-iGPU

-One day after opening my laptop and running a game, I noticed it was stuttering; the Nvidia dGPU is not activated automatically as it used to.

-I also tried connecting and disconnecting the AC adapter but my dGPU remains asleep in both scenarios

-I switched to Hybrid-Auto, restarted, ran a game again and still nothing. I noticed there is no "Discrete GPU" and "Overclock GPU" options in Toolkit's Dashboard, which it used to have if I set it to Hybrid-Auto.

-Along with that, Nvidia "Manage Display Mode", "Nvidia GPU Activity" and "Nvidia Settings" in the Tray of my taskbar are also gone; I couldn't switch to dGPU through it, Device Manager also couldn't detect my GPU unless I select "Show Hidden Devices".

-I can't remember if I put "Deactivate GPU" in the Actions tab of Toolkit under "When AC power adapter is connected".

-If I select dGPU in Toolkit's GPU Working Mode, my GPU runs fine as it should. Meaning my GPU should be ok and is not physically damaged or anything.

-Uninstalling Toolkit, the automatic switching of iGPU and dGPU when I run and quit a game is working on Lenovo Vantage, as long as I am on Hybrid-Auto.

-Hybrid iGPU seems to be too aggressive and causes my dGPU to not wake up at all even if an app that I specified in the Windows Graphics Settings to run it is currently running, so I avoided it this time (my dGPU used to wake up if I run a game even in this mode, but not anymore).


A new problem:

-I tried to modify and optimize the Global Settings of my Nvidia card, restarted and it doesn't automatically switch from iGPU to dGPU when running a game again.

-I tried to restore all to default settings and it worked (i.e. switch automatically from iGPU to dGPU through the "Automatic Select" function) on just one game: Genshin Impact. But when I tried Honkai Impact and other older games, it doesn't switch anymore.

-However, the Nvidia-related icons in my Taskbar Tray and the dGPU-related sections in the Toolkit dashboard are there and working fine. If I run a game like Honkai Impact 3rd, even if it doesn't switch to the dGPU from the "Manage Display Mode" of Nvidia, the dGPU still displays as "On" in the Toolkit and still shows as being "active" in the "Nvidia GPU Activity" from my Tray.

-Changing toolkit's GPU Working Mode to Hybrid-iGPU, the old problems persist, but switching to Hybrid-Auto or Hybrid restores the automatic switching function (along with the visibility of Nvidia-related icons in the Taskbar Tray).

I conclude for now that Nvidia video card driver version 536.40 might contribute to this problem (even if it was working fine just a few days ago). I am going to try Display Driver Uninstaller and install the recommended drivers as taught in the "Lenovo Series" Discord server's "Knowledge Base" and see if it works out this time.

If you have the very latest Nvidia video card driver too, try uninstalling it and downgrading it for now to the recommended Version of around 531.68.

BartoszCichecki commented 1 year ago

@ElementSpica, good investigation, probably with some extra unnecessary steps.

First a word of comment about what you wrote:

-Toolkit GPU-Working Mode: Hybrid-iGPU

-One day after opening my laptop and running a game, I noticed it was stuttering; the Nvidia dGPU is not activated automatically as it used to.

That seems natural. In Hybrid iGPU-only mode, if switched properly, the dGPU is not connected to the system, so it won't magically turn on when you start a game. Settings in the nVidia control panel are also completely separate from the GPU Working Mode feature.

How the GPU Working Mode works does not seem to be related to the nVidia driver version, as these modes are not driver features but rather hardware features built by Lenovo. A specific driver version might have an impact on how it works, but so far that has not been proven.

The issue effectively boils down to a relatively "simple" process. Vantage, LegionZone or LLT requests a mode change from the EC, and the EC either accepts or rejects it. Below you can find the relevant part of the ACPI code extracted from the K1CN40WW BIOS; other BIOSes are pretty much exactly the same. As you can see, there are quite a few moving pieces in there. Simply speaking, whenever you see Return (One) in the code below, the request to switch between hybrid modes was successful, while Return (Zero) means it was rejected.

I have observed that to have a good experience with GPU Working Modes, you can't switch them too quickly and, frankly, too often. If you switch to Hybrid iGPU-only, the dGPU will not be disconnected while it's in use. If that happens, I make sure that no processes are running on it and no external screen is connected, and I try again. Later, when I want to come back to using Hybrid mode, it sometimes takes a couple of seconds before Windows notices that the dGPU is back online.
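
The "wait a couple of seconds and check again" pattern described above can be sketched as a small polling helper. This is only an illustration in Python: `wait_until` and its parameters are hypothetical names, and the check for "dGPU is visible" is left to the caller, because nothing here says how that should be detected.

```python
import time
from typing import Callable


def wait_until(
    predicate: Callable[[], bool],
    timeout_s: float = 10.0,
    interval_s: float = 0.5,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll predicate until it returns True or timeout_s elapses.

    Returns True as soon as the predicate succeeds, False on timeout.
    The predicate is always checked at least once.
    """
    deadline = clock() + timeout_s
    while True:
        if predicate():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

A caller would pass its own check, for example a function that reports whether the dGPU currently shows up in the system, and only retry the mode change if `wait_until` returns False.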

From the software point of view, it doesn't matter if you use Vantage or LegionZone or LLT - they all use the same methods of controlling this. I can make dGPU "stuck" where nothing can bring dGPU back without a restart - that happens after 20-30 very quick switches between Hybrid and Hybrid iGPU-only. If that happens, it's enough that I set mode to Hybrid and restart my laptop and all goes back to normal.

It might be that there are better and worse implementations of the code below in different BIOS versions, but it's impossible to tell which one is good or bad without any documentation. Regardless of that, OS-exposed ways of controlling this process are limited. LLT implements both the "LegionZone way" (by default) and the "Vantage way" (with the optional legacy argument, check readme for details).

While not ideal, there are only so many things I can do. Lenovo definitely should improve this process in the firmware, for example by returning more meaningful status codes in case something goes wrong. It should also expose a way of checking whether the EC is ready for the next mode change.

I updated the readme a little bit, and the next version will include slightly updated messages and an artificial 5 second timeout for changing these modes, to mitigate monkey-clicking that makes the situation even worse.
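
As a rough illustration of what such a timeout could look like (this is a sketch in Python, not LLT's actual code; `ModeSwitchThrottle` and `try_switch` are hypothetical names), a guard can refuse any mode change requested within 5 seconds of the previous accepted one:

```python
import time
from typing import Callable, Optional


class ModeSwitchThrottle:
    """Reject mode changes requested too soon after the previous one."""

    def __init__(
        self,
        cooldown_s: float = 5.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._cooldown_s = cooldown_s
        self._clock = clock
        self._last_switch: Optional[float] = None  # time of last accepted request

    def try_switch(self, apply_mode: Callable[[], None]) -> bool:
        """Call apply_mode() unless the cooldown is still running.

        Returns True if the change was attempted, False if it was ignored.
        """
        now = self._clock()
        if self._last_switch is not None and now - self._last_switch < self._cooldown_s:
            return False  # too soon; ignore the click
        self._last_switch = now
        apply_mode()
        return True
```

Ignored requests do not restart the cooldown in this sketch, so repeated clicking cannot postpone the next allowed switch indefinitely.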

I am closing this issue as there is no issue in LLT code.

Relevant code from ACPI.

```
Method (HTPL, 0, NotSerialized)
{
    If (((DBFL & One) == One))
    {
        ADBG (Concatenate ("IGPM= ", ToHexString (IGPM)))
        ADBG (Concatenate ("GSTA= ", ToHexString (GSTA)))
        ADBG (Concatenate ("CUMA= ", ToHexString (^^PC00.PEG1.PEGP.CUMA)))
        ADBG (Concatenate ("M239= ", ToHexString (^^PC00.PEG1.PXP.M239)))
        ADBG (Concatenate ("PSW2= ", ToHexString (GGIV (PSW2))))
    }

    If ((GGIV (HPD1) == Zero))
    {
        If (((DBFL & One) == One))
        {
            ADBG (Concatenate ("HTPL HPD1= ", ToHexString (GGIV (HPD1))))
        }

        CAGS (HPD1)
        SHPO (HPD1, Zero)
    }

    If ((GGIV (HPD2) == Zero))
    {
        If (((DBFL & One) == One))
        {
            ADBG (Concatenate ("HTPL HPD2= ", ToHexString (GGIV (HPD2))))
        }

        CAGS (HPD2)
        SHPO (HPD2, Zero)
    }

    If ((GGIV (HPD3) == Zero))
    {
        If (((DBFL & One) == One))
        {
            ADBG (Concatenate ("HTPL HPD3= ", ToHexString (GGIV (HPD3))))
        }

        CAGS (HPD3)
        SHPO (HPD3, Zero)
    }

    If ((GGIV (HPD4) == Zero))
    {
        If (((DBFL & One) == One))
        {
            ADBG (Concatenate ("HTPL HPD4= ", ToHexString (GGIV (HPD4))))
        }

        CAGS (HPD4)
        SHPO (HPD4, Zero)
    }

    If ((GSTA == One))
    {
        If ((((GGIV (HPD3) == One) || (GGIV (HPD4) == One)) || ((GGIV (HPD1) == One) || (GGIV (HPD2) == One))))
        {
            If ((^^PC00.PEG1.PXP.M239 == Zero))
            {
                Notify (^^PC00.PEG1.PEGP, 0xC0) // Hardware-Specific
            }

            Return (Zero)
        }

        If ((TBTP < 0x03))
        {
            If (((IGPM == One) && (^^PC00.PEG1.PXP.M239 == Zero)))
            {
                If (((DBFL & One) == One))
                {
                    ADBG (Concatenate ("Notify plug out , IGPM= ", ToHexString (IGPM)))
                    ADBG (Concatenate ("Notify plug out , PLIT= ", ToHexString (PLIT)))
                }

                ^^PC00.PEG1.PEGP.CUMA = One
                Notify (^^PC00.PEG1.PEGP, 0x03) // Eject Request
                Return (One)
            }

            If ((((IGPM == 0x02) && (^^PC00.PEG1.PXP.M239 == Zero)) && (^^PC00.LPCB.EC0.RPWR == Zero)))
            {
                ^^PC00.PEG1.PEGP.CUMA = One
                Notify (^^PC00.PEG1.PEGP, 0x03) // Eject Request
                Return (One)
            }

            TBTP += One
            If ((TBTP >= 0x03))
            {
                TBTP = 0x03
            }

            ADBG (Concatenate ("Notify plug out , TBTP= ", ToHexString (TBTP)))
        }
        Else
        {
            If (((IGPM == One) && (^^PC00.LPCB.EC0.GPUW <= 0x1A)))
            {
                If (((DBFL & One) == One))
                {
                    ADBG (Concatenate ("Notify plug out 6, IGPM= ", ToHexString (IGPM)))
                    ADBG (Concatenate ("Notify plug out 6, PLIT= ", ToHexString (PLIT)))
                }

                ^^PC00.PEG1.PEGP.CUMA = One
                Notify (^^PC00.PEG1.PEGP, 0x03) // Eject Request
                Return (One)
            }

            If ((((IGPM == 0x02) && (^^PC00.LPCB.EC0.GPUW <= 0x1A)) && (^^PC00.LPCB.EC0.RPWR == Zero)))
            {
                ^^PC00.PEG1.PEGP.CUMA = One
                Notify (^^PC00.PEG1.PEGP, 0x03) // Eject Request
                Return (One)
            }
        }

        Return (Zero)
    }
    ElseIf ((GSTA == Zero))
    {
        If ((((GGIV (HPD3) == One) || (GGIV (HPD4) == One)) || ((GGIV (HPD1) == One) || (GGIV (HPD2) == One))))
        {
            ADBG (Concatenate ("Notify plug in HPD, PLIT= ", ToHexString (PLIT)))
            If ((IGPM == One))
            {
                If ((PLIT == Zero))
                {
                    Notify (^^PC00.PEG1.PEGP, Zero) // Bus Check
                }
            }
        }

        If (((IGPM == Zero) && (^^PC00.PEG1.PXP.M239 == Zero)))
        {
            If ((PLIT == Zero))
            {
                If (((DBFL & One) == One))
                {
                    ADBG (Concatenate ("Notify plug in , PLIT= ", ToHexString (PLIT)))
                }

                Notify (^^PC00.PEG1.PEGP, Zero) // Bus Check
            }

            ^^PC00.PEG1.PEGP.CUMA = Zero
            Return (One)
        }

        If ((((IGPM == 0x02) && (^^PC00.PEG1.PXP.M239 == Zero)) && (^^PC00.LPCB.EC0.MPWR == One)))
        {
            If ((PLIT == Zero))
            {
                ADBG (Concatenate ("Notify plug in , PLIT= ", ToHexString (PLIT)))
                If (((DBFL & One) == One))
                {
                    ADBG (Concatenate ("Notify plug in , PLIT= ", ToHexString (PLIT)))
                }

                Notify (^^PC00.PEG1.PEGP, Zero) // Bus Check
            }

            ^^PC00.PEG1.PEGP.CUMA = Zero
            Return (One)
        }

        Return (Zero)
    }
    Else
    {
        Return (Zero)
    }
}
```

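
For readers less used to ASL, the accept/reject control flow of the HTPL method above can be paraphrased as a small Python function. This is a simplified reading, not firmware documentation: it keeps only the conditions that decide between Return (One) and Return (Zero), and drops the debug output, the Notify calls, the CUMA writes and the HPD clearing at the top of the method. `EcState` and `htpl` are hypothetical names; the fields mirror the ASL identifiers, whose exact meanings are not documented in this thread.

```python
from dataclasses import dataclass


@dataclass
class EcState:
    """Values HTPL reads; names follow the ASL, meanings are undocumented."""
    gsta: int                 # GSTA
    igpm: int                 # IGPM
    hpd_asserted: bool = False  # True if any of HPD1..HPD4 reads One
    m239: int = 0             # ^^PC00.PEG1.PXP.M239
    tbtp: int = 0             # TBTP, a counter capped at 3
    gpuw: int = 0             # ^^PC00.LPCB.EC0.GPUW
    rpwr: int = 0             # ^^PC00.LPCB.EC0.RPWR
    mpwr: int = 0             # ^^PC00.LPCB.EC0.MPWR


def htpl(s: EcState) -> int:
    """Return 1 where the ASL returns One (accepted), else 0 (rejected)."""
    if s.gsta == 1:
        if s.hpd_asserted:
            return 0
        if s.tbtp < 3:
            if s.igpm == 1 and s.m239 == 0:
                return 1
            if s.igpm == 2 and s.m239 == 0 and s.rpwr == 0:
                return 1
            s.tbtp = min(s.tbtp + 1, 3)  # TBTP += One, capped at 0x03
        else:
            if s.igpm == 1 and s.gpuw <= 0x1A:
                return 1
            if s.igpm == 2 and s.gpuw <= 0x1A and s.rpwr == 0:
                return 1
        return 0
    if s.gsta == 0:
        if s.igpm == 0 and s.m239 == 0:
            return 1
        if s.igpm == 2 and s.m239 == 0 and s.mpwr == 1:
            return 1
        return 0
    return 0
```

Read this way, any asserted HPD line in the GSTA == One branch is enough for a rejection, which fits the earlier remark that a connected screen may be one reason a switch is refused.
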
ElementSpica commented 1 year ago

@BartoszCichecki Thank you for the response and clarifications. And indeed, switching too fast seems to have caused the issue I experienced.

Aside from me realizing I should try and put 3 second delays for each action on the AC Adapter plugged and unplugged scenarios and avoiding Hybrid-iGPU altogether and just sticking to Hybrid-Auto, I also realized today that putting "Deactivate GPU" in the AC Adapter unplugged is no longer necessary.

I realized that after simulating a sudden power outage: if I turned off the laptop while it was in the AC unplugged scenario from Toolkit, then the next time I opened the laptop (without the AC adapter) and plugged in the AC adapter, the dGPU was still not visible and the system couldn't detect it.

To remedy that, I had to turn off Toolkit and prevent it from autorunning and auto-executing the Actions again, and I tried to be more gentle (e.g. doing computer restarts after every change) with the sudden changes between those two scenarios; after I confirmed that the Actions on AC plugged and unplugged scenarios are doing well, I reverted it back to autorunning and automatically executing actions, and now it's working as intended.

As for the Deactivate GPU in the AC unplugged scenario, I realized that if I am currently running a game and a power outage occurs and Toolkit restarts the GPU, it might mess up my system, so there's that. It's in Hybrid-Auto mode anyway so dGPU should eventually deactivate once I close the app that's using it, it seems much safer that way.

In any case, all seems to be working well on my end now in terms of Toolkit, and I know what to do and what not to do with it now.

The only issue I have is that Nvidia and/or Windows 11 doesn't seem to let me automatically change my Display through Advanced Optimus if I set a program like Krita or GIMP, or other games, to run at "High Performance" in the Windows Graphics Settings. It still does this for Genshin Impact, though, and I don't know why only that game does it when before it did so for my other games, and even GIMP.

It seems to be an issue on Nvidia now, and not really with Toolkit as I have initially suspected. I reverted back to an older Nvidia driver version as suggested in Discord, hopefully it becomes fixed. That or I refresh my Windows 11.

Anyway, lastly, thank you for making such amazing software!

elitonfilho commented 1 year ago

Thanks for the replies! @BartoszCichecki The toolkit is running on default configs and I haven't updated the Nvidia driver in a few weeks (531.18 here), so there are indeed no extra steps needed to reproduce this error (everything worked fine on version 2.14; the bug appeared after upgrading to 2.15). Btw, thanks for the clarification on how the GPU switching works.

@ElementSpica Thanks for the tips! I also noticed that I would need to use Lenovo Vantage to somehow re-activate the hybrid mode. I'll try to reproduce your steps.

BartoszCichecki commented 1 year ago

@elitonfilho @ElementSpica

I just released 2.15.3, which aims to improve the situation. I added a few additional fallbacks that should help the EC in case it gets confused. Let me know if that helps, either here or by starting a Discussion.

ElementSpica commented 1 year ago

@BartoszCichecki Hello, I received the update just now.

It did make the dGPU get kind of "confused" after the update at first, such as with the refresh rate and dGPU detection problem.

But a quick Toolkit autorun disabling, restarting, re-running Vantage, restarting and running Toolkit again returned my system's and Toolkit's functions back as it should.

Thank you for the update that can help Toolkit in case a confusion arises, hope it helps the others too having similar problems.

As of now, my issues are largely solved and I don't really have a problem with Toolkit anymore, especially since I seem to be getting the hang of using it too. So far, so good.

(The only issue left is resolving this mystery of Nvidia's Advanced Optimus only running on Genshin Impact and not on other games/apps as it used to, hopefully in a future driver update, for now I'll just manually switch Display Modes when gaming or not.)

BartoszCichecki commented 1 year ago

@ElementSpica thanks for feedback. Could you also give this one a quick spin? LenovoLegionToolkitSetup.zip

ElementSpica commented 1 year ago

@BartoszCichecki It's good, the installer worked fine.