irusanov / SMUDebugTool

A dedicated tool to help write/read various parameters of Ryzen-based systems, such as manual overclock, SMU, PCI, CPUID, MSR and Power Table.
GNU General Public License v3.0

windows build #6

Open trotol opened 3 years ago

trotol commented 3 years ago

How do I compile the recent 1.3.0.0 Windows build?

irusanov commented 3 years ago

You need to check out the project (or download the source zip), then open the project solution (.sln file) in Visual Studio. I use the free Community 2019 edition, but keep in mind that the installation "takes forever". You need to install it and configure it for use with C#. After installing the IDE, it is as simple as clicking Build -> Build Solution.

https://visualstudio.microsoft.com/downloads/

If you need Vermeer support, I still need to fix it (sync with ZenTimings). If you don't want to go through the whole process to compile it yourself, let me know and I can attach a compiled binary for you. Will update the code today and might even release the 1.3.0 version.

roman-orekhov commented 2 years ago

@irusanov Can you please make a release with Vermeer support, at least as it is now, w/o sync to ZT?

irusanov commented 2 years ago

@roman-orekhov I can probably sync and release it after work today. I update it for myself, but haven't bothered to upload a new build.

irusanov commented 2 years ago

@roman-orekhov Build is up, contains all the latest changes. https://github.com/irusanov/SMUDebugTool/releases/tag/v1.3.0-beta Let me know if something doesn't work as expected.

PJVol commented 2 years ago

@irusanov Hi, Ivan! You know what, for almost a week I've been searching for a way to sniff the SMN ports used in a Veii/Asus tool, and just realized the answer was staring at me from the desktop all along. Thanks for the useful monitoring feature. I'm just about to finish compiling the MP1 command ID list for Vermeer (the better part).

irusanov commented 2 years ago

@PJVol That feature isn't perfect, since it's only reading the SMU addresses at a given interval in an attempt to detect a change, but the readings might be out of sync, especially when an application is spamming the mailbox several times per second as the Asus (work) tool does. However, it helped me pick some of the commands (e.g. GetDramBaseAddress and TransferTableToSmu) from hwinfo for CPUs I had no info before.
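The out-of-sync problem can be illustrated with a small simulation. This is only a sketch: it does not touch hardware, `poll_changes` is a hypothetical helper, and the command ID 0x2A is made up for the example (0x5 and 0x6 are the IDs mentioned in this thread).

```python
def poll_changes(writes, poll_every):
    """Simulate sampling a mailbox register once every `poll_every` writes.

    `writes` is the full sequence of command IDs another application sent.
    Returns the command IDs the poller notices, i.e. only the values that
    differ from the previously sampled value.
    """
    seen = []
    last = None
    for i, cmd in enumerate(writes):
        if i % poll_every != 0:
            continue  # the poller is asleep, so this write goes unobserved
        if cmd != last:
            seen.append(cmd)
            last = cmd
    return seen


# Hypothetical traffic: a tool spamming 0x5/0x6 with one rare command (0x2A).
traffic = [0x5, 0x6, 0x5, 0x2A, 0x5, 0x6, 0x5, 0x6]

print([hex(c) for c in poll_changes(traffic, 1)])  # every write is sampled
print([hex(c) for c in poll_changes(traffic, 2)])  # every second write is sampled
```

When only every second write is sampled, the poller sees 0x5 each time and never notices 0x6 or the rare command, which is the kind of miss described above.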

The Asus tool only spams the SMU with the 0x5 and 0x6 commands when you open the PM monitor. The rest of the commands used in CPU -> AMD can be picked quite easily by trying them one by one. MP1 command labels also contain the ID, so no need to try anything for them.

I should mention the debug tool wasn't really meant for the public, so use it at your own risk. The code isn't really good, it's a mess, but the app is still somewhat useful.

PJVol commented 2 years ago

The rest of the commands used in CPU -> AMD can be picked quite easily by trying them one by one. MP1 command labels also contain the ID, so no need to try anything for them.

Yeah, but understanding the bit fields for some commands is not that easy. I figured out the arg bit layout for setting psm margins per core dLDO. If you're interested, I've posted my findings here: https://www.overclock.net/threads/official-amd-ryzen-ddr4-24-7-memory-stability-thread.1628751/post-28916314

I've installed VC (it took ~5-10 min) and built your 1.3 beta successfully. In general all is fine, except for some weird errors at the beginning regarding "embedded" or whatever attributes for the resource files. I had to change the filesystem attributes for the whole folder and upgrade the project to .NET 4.8, since 4.0 is not supported anymore. And btw, the inpoutx dll binary was not copied into the bin folder during the "make install" for some reason.

irusanov commented 2 years ago

Yeah, but understanding the bit fields for some commands is not that easy. I figured out the arg bit layout for setting psm margins per core dLDO. If you're interested, I've posted my findings here: https://www.overclock.net/threads/official-amd-ryzen-ddr4-24-7-memory-stability-thread.1628751/post-28916314

Thanks for the info, that would help me, since I'm working on a new ZenStates and want to support these commands. There's a way to get the cores enabled/disabled map, but I don't understand why it is wrong for the second CCD. Setting OC frequency to a single core works the same way - you have to specify the physical core index and the CCD index. Easiest way to get the core map is to read power of each core from the PM table, however, this means I have to define the offset and maximum cores for each table version (in the Core DLL).

And btw, the inpoutx dll binary was not copied into the bin folder during the "make install" for some reason.

Yes, it's not added as a resource. I will have to add the sources and the prebuilt binaries. It's a manual copy at the moment. WinRing is now integrated in the DLL though (for which I still need to add the sources, to be entirely compliant).

PJVol commented 2 years ago

There's a way to get the cores enabled/disabled map, but I don't understand why it is wrong for the second CCD

Do you get fused topology data from SMN:: 0x3008(3208)xxxx ? I just checked, the data is correct for my single CCD sample, so I'm curious what exactly is wrong with the 2nd CCD data. And btw, have you got an idea what CCA refers to in the pm_table?

PJVol commented 2 years ago

@irusanov

Ok. I think I now see what may be wrong with the code where you obtain the fuse data in Zenstates-core. Actually, I don't see where you even read the 2nd CCD data.

irusanov commented 2 years ago

@PJVol How so? Reading the address on my 3900X gives me 0b0000001110001000, where 1 means a "disabled" core. Or, if we divide it into CCDs: [00000011] [10001000]. This means, if I read it right: 15 - 14 - 13 - 12 - 11 - 10 - [9] - [8] - [7] - 6 - 5 - 4 - [3] - 2 - 1 - 0, while the power table shows the correct data: disabled cores 3, 7, 8, 14.

Maybe I should read the same address + 0x200000 for the second CCD which gives me 00000011 01000001 and that seems to be about right (lowest 8 bits). So it seems I've overlooked this and the rest of the bits are something else.
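A minimal sketch of that per-CCD decoding, assuming one raw value is read per CCD (base address + CCD index * 0x200000) and that only the lowest 8 bits are the core map, with a set bit meaning "disabled". The function and constant names are hypothetical and this is not the actual core DLL code.

```python
CORES_PER_CCD = 8      # assumption: 8 physical core slots per CCD
CCD_STRIDE = 0x200000  # per-CCD address offset mentioned above


def disabled_cores(fuse_words):
    """Return the global indices of fused-off cores.

    fuse_words[i] is the raw value read for CCD i. Only the lowest 8 bits
    are treated as the core map; the upper bits are ignored here because
    their meaning is not fully known.
    """
    disabled = []
    for ccd, word in enumerate(fuse_words):
        core_map = word & 0xFF
        for core in range(CORES_PER_CCD):
            if core_map & (1 << core):
                disabled.append(ccd * CORES_PER_CCD + core)
    return disabled


# Values quoted for the 3900X in this thread.
ccd0 = 0b0000001110001000
ccd1 = 0b0000001101000001

print(disabled_cores([ccd0, ccd1]))  # [3, 7, 8, 14]
```

With both reads decoded this way, the result agrees with the disabled cores reported by the power table (3, 7, 8, 14).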

As for CCA, I don't have a definition. It's something throttling- and temperature-related, I think. I don't know what the abbreviation means.

irusanov commented 2 years ago

@PJVol Thanks for the heads up, should be now fixed in core DLL.

roman-orekhov commented 2 years ago

@irusanov If you disable one of the CCDs in BIOS, what would that higher byte read? Still 0b00000011 for both?

irusanov commented 2 years ago

@irusanov If you disable one of the CCDs in BIOS, what would that higher byte read? Still 0b00000011 for both?

Thought about that and will have to check, it might show the enabled CCDs. Too much work lately on my regular job and don't really have time for other projects :/

PS: Tried and they read the same. It makes sense, since these are fuses after all.

PJVol commented 2 years ago

@irusanov Maybe I should read the same address + 0x200000 for the second CCD which gives me

Yep. The original algorithm was not right. But don't forget to check for a disabled CCD, since in 5600/5800 parts that are 2-CCD downbins usually the 1st CCD is disabled.

@roman-orekhov Still 0b00000011 for both?

Its lowest bit is the SMT status. Got no idea, though, what the 2nd bit means.

patrickschur commented 2 years ago

Don't know what the abbreviation means.

@PJVol: CCA = CCX CAC accumulator :wink:

roman-orekhov commented 2 years ago

CCA = CCX CAC accumulator

Cool! It only raises more questions though :) What is CAC? What would CCA_THRESHOLD, CCA_ACTIVATION and especially CCA_CAC mean for the L3 part of the pmtable? One of the frequency limiters for me is directly correlated with L3_CCA_CAC and L3_FIT, and I struggle to get my head around what it means so that I can overcome the frequency limit.

patrickschur commented 2 years ago

@roman-orekhov Unfortunately I don't know what the abbreviation stands for... but CAC weights are used for power calculation.

Here are two older slides I found about this topic: Energy Efficient High Performance Computing Working Group (Slide 6); Energystar (Slide 5)

irusanov commented 2 years ago

@roman-orekhov, @patrickschur

It's been referenced as "cac counters" in the uProf User Guide and the IOMMU specification: AMDuProf User Guide (page 53); AMD I/O Virtualization Technology (IOMMU) Specification, 48882

The IOMMU document has an explanation for the CAC bit on page 235:

CAC: Counter source architectural or custom. RW. Reset 0. Selects architectural counter input group (Table 72) or custom input group.
0 = architectural counters as defined in Table 72.
1 = implementation-defined counters.

roman-orekhov commented 2 years ago

@irusanov

c:\Program Files\!sys\AMDuProf3.4\bin>AMDuProfCLI.exe timechart -e cac -d 60 path\to\exe
ERROR: Could not enable the counters.

And timechart --list doesn't show cac either :( (version 3.4 is the last to have that "cac counters" string in its guide). It seems the term is being obsoleted/hidden by AMD.

timedrapery commented 2 years ago

And btw, have you got an idea what CCA refers to in the pm_table?

@PJVol

I've seen "CCA" expanded to "Coherent Cache Architecture" in various documents

patrickschur commented 2 years ago

I actually found the meaning of CAC in a recent article from Hardwareluxx (German). CAC stands for capacitance.

From the article:

To achieve an IPC increase of 19 % and a frequency increase of 6 %, AMD had to increase the CCX effective switched capacitance (CAC) by 15%.

timedrapery commented 2 years ago

I actually found the meaning of CAC in a recent article from Hardwareluxx (German). CAC stands for capacitance.

From the article:

To achieve an IPC increase of 19 % and a frequency increase of 6 %, AMD had to increase the CCX effective switched capacitance (CAC) by 15%.

Righteous @patrickschur! That's some good stuff

I see that @1usmus HYDRA tool now reports CAC as a percentage amount in real-time and I often wonder how it's calculated and what it's "used for" by the CPU "management"

PJVol commented 2 years ago

@PJVol: CCA = CCX CAC accumulator :wink:

@patrickschur Thanks! Being an accumulator, it makes more sense, but something still doesn't add up, sorry.

Cac is indeed the switching capacitance of a certain IP/functional block, which, along with the frequency, drives its dynamic power. But afaik Cac itself is a value derived from process node characteristics and architecture implementation specifics of that block, and is not supposed to change (in the short term at least). So what then is the point of accumulating its values? Don't you think there's no logic behind it? ;)
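For reference, the usual textbook approximation of CMOS dynamic power is shown below. The symbols are assumptions made for this note and are not taken from any AMD document: $\alpha$ is the switching activity factor, $C$ the total switchable capacitance of the block, $V$ the supply voltage and $f$ the clock frequency.

```latex
P_{\text{dyn}} \approx \alpha \, C \, V^{2} f ,
\qquad
C_{\text{eff}} = \alpha \, C
```

In this form $C$ is fixed by the design and process, while the product $\alpha C$ (often called the effective switched capacitance) varies with the workload. Which of the two the pm_table "CAC" fields track is not established in this thread.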

I have two possible explanations for the "CAC" metric so far:

  1. It actually designates not the Cac itself but Ceff, which is the switching activity capacitance per cycle, accumulated over a time threshold on a per-IP basis. But that would mean some throttling functionality exists in addition to EDC, based on Ceff monitoring.

  2. It designates the "Activity Currents" accumulator. This, for example, would perfectly fit AMD's own description of the EDC throttling manager (US 2019/0146567 A1) - FIG.4 ;), where the Activity Current accumulator provides combined Activity Current values for the subsequent averaging and comparison to the threshold, upon which it may activate clock and/or power gating for that block.
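The accumulate, average and compare flow from the second explanation can be sketched roughly as follows. This is a toy model only: the class name, the window length, the threshold and the sample values are all hypothetical, and it is not derived from the patent text or from any SMU firmware.

```python
from collections import deque


class ActivityThrottle:
    """Toy model: keep recent activity samples, average them, compare to a threshold."""

    def __init__(self, window, threshold):
        self.samples = deque(maxlen=window)  # the oldest sample drops out automatically
        self.threshold = threshold

    def update(self, activity):
        """Add one activity sample and return True if gating would be requested."""
        self.samples.append(activity)
        average = sum(self.samples) / len(self.samples)
        return average > self.threshold


throttle = ActivityThrottle(window=4, threshold=10.0)
decisions = [throttle.update(a) for a in [4, 8, 12, 20, 24, 2]]
print(decisions)  # [False, False, False, True, True, True]
```

Note that the last low sample does not release the gate immediately, because the windowed average is still above the threshold.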

PS: I'm more inclined to think it's a CCX activity current accumulator value.

PJVol commented 2 years ago

@irusanov btw, I've corrected labels for the 400005 pmtable version as much as possible (if you ever need it):

ptable.400005.txt or in a ZenStatesDebugTool fork in my repo (just pushed).

PS: Tested it on my 5700G - all is fine.

timedrapery commented 2 years ago

@irusanov btw, I've corrected labels for the 400005 pmtable version as much as possible (if you ever need it):

ptable.400005.txt or in a ZenStatesDebugTool fork in my repo (just pushed).

PS: Tested it on my 5700G - all is fine.

@PJVol

You're a righteous fella, thank you for all your efforts and for your time!

Now, really stupid-seeming question inbound...

How do I change what's available here on GitHub, either a release or a source code download, to show labels such as what @PJVol authored rather than indexes and offsets when I'm running the power table monitoring bit of the SMUDebugTool?

irusanov commented 2 years ago

You can compare my PowerTableMonitor.cs with his modified file, @PJVol added 2 arrays with labels there. You also need to add a new column TLabel from the designer or rename the Index to TLabel. The other option would be to fork his version of the app or to get just the files starting with PowerTable and replace in this repo.
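The idea behind the label arrays is just a lookup by table index with a fallback to the index itself. A minimal sketch in Python (the tool itself is C#; `row_label` is a hypothetical helper and the positions of the labels below are made up, only the label names appear in this thread):

```python
def row_label(index, labels):
    """Return the label for a power table entry, falling back to its index."""
    if 0 <= index < len(labels) and labels[index]:
        return labels[index]
    return f"{index:04d}"


# Hypothetical label array: an empty string means "no label known yet".
labels = ["", "", "L3_CCA_CAC", "L3_FIT"]

for i in range(5):
    print(f"{i:04d}  {row_label(i, labels)}")
```

Entries without a known label keep showing their index, so a partially labelled table stays readable.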

@PJVol thanks, I would need to increase the table size in the dll for that version.

timedrapery commented 2 years ago

You can compare my PowerTableMonitor.cs with his modified file, @PJVol added 2 arrays with labels there. You also need to add a new column TLabel from the designer or rename the Index to TLabel. The other option would be to fork his version of the app or to get just the files starting with PowerTable and replace in this repo.

@PJVol thanks, I would need to increase the table size in the dll for that version.

Thanks @irusanov!!! I'll get on with doing this ASAP!

timedrapery commented 2 years ago

@irusanov I did as you've talked about and that worked out great for me! Thank you large amounts for all of your efforts and time that you've put forth towards advancing work on these projects!!!

@PJVol thank you as well, sir! You guys are doing great things!

patrickschur commented 2 years ago

@PJVol Cac itself is a value derived from process node characteristics and architecture implementation specifics of that block, and is not supposed to change

CAC weights are often updated with firmware updates.

PJVol commented 2 years ago

@patrickschur CAC weights are often updated with firmware updates.

Ahh... you mean those used by the Cac interface (mentioned in the 16h BKDG)? They are referred to as "power credits" there ) It would be nice if someone sent me the BKDG for 17h or 19h :)