Dasharo / dasharo-issues

The Dasharo issue tracker
https://dasharo.com/

XMP on MSI PRO Z690-A WIFI DDR4 first impressions #605

Open zirblazer opened 11 months ago

zirblazer commented 11 months ago

Dasharo version Dasharo v1.1.3 (RC? Not dated) for MSI PRO Z690-A WIFI DDR4

Hardware compatibility
Pretty much the same system as here: https://github.com/Dasharo/dasharo-issues/issues/158
Processor: Intel Core i5 12600K
Memory Modules: 4 x Kingston Fury Beast Black KF432C16BB1/16
Datasheet: https://www.kingston.com/dataSheets/KF432C16BB1_16.pdf
• Default (JEDEC): DDR4-2400 CL17-17-17 @ 1.2V
• XMP Profile 1: DDR4-3200 CL16-18-18 @ 1.35V
• XMP Profile 2: DDR4-3000 CL15-17-17 @ 1.35V
PCIe Cards: 2 x Radeon 5600XT

Software used:
• Windows 11 21H2 + all Windows Update updates
• MSI Dragon Power 1.0.0.19 / 20231127
• MSI Dragon Ball 1.0.0.12 / 20231127
• ASRock Timing Configurator 4.0.14
• ASUS MemTweakIt 2.3.18.0
• TestMem5 0.12 with Extreme1 @ anta777.cfg profile

First of all: loading XMP profiles in Dasharo is extremely experimental and may be prone to bricking, so having a USB Flash Drive ready to use FlashBIOS is almost mandatory if you want to play with XMP. You have been warned.

The Memory Configuration interface lets you choose between a total of four profiles: JEDEC (Default), XMP1, XMP2, XMP3. I'll begin with XMP3 because it was a horrible experience.

Choosing XMP3, which on my modules is essentially a nonexistent profile (it is supposed to be for DDR5), had catastrophic results: the system entered an infinite POST loop, with the Case front panel HD status light blinking occasionally, as if it had a memory training failure. After a long timeout (somewhere around 6 to 10 minutes) it powered off by itself, then powered on again with the same behavior. When I decided to turn it off to use FlashBIOS, I held the Power Button for 4 seconds as always; the system powered off, then about 2 or 3 seconds later powered on completely on its own. I tried again 2 or 3 times until I decided to switch off the Power Supply instead, wait a bit, then switch it on again, and it finally stayed off. Note that in the Firmware settings I had Power off on power loss (Default), but I would assume that if the system were set to power on after power loss, you could enter an infinite loop where you can't keep the computer powered off while still having the standby power needed to use FlashBIOS.

Essentially, XMP3 on DDR4 is pretty much a bricking sentence, because it doesn't seem that you can recover in any way other than FlashBIOS. Moreover, I don't have JEDEC-only modules to check whether this behavior would also happen when there are no XMP1/XMP2 profiles. So, this feature needs A LOT of sanity checks, as otherwise it will very likely leave a lot of unhappy users.

XMP2 (3000 MHz) worked, but took about 3 minutes doing memory training before displaying the Dasharo splash screen, whereas standard JEDEC does so in about 40-50 seconds (With 64 GiB RAM).

XMP1 (3200 MHz) also worked, but the wait was even longer. It took about 4 minutes of training and I actually believed that it had bricked, so I powered off, then decided to try again by powering it on, and after about 4 or 5 minutes it finally POSTed. Reboots were fast as usual, but the training time is what kills it, because you're not even sure whether you actually bricked it or whether it is going to POST in the end. I got baited around 3 times: I mentioned that it appeared to be bricked after waiting several minutes, and then it immediately decided to POST in my face, heh. I think the best approach would be to set the XMP profile, save settings, power off, power on, then measure how long it takes to POST.

I tested first with the XMP1 profile and had occasional BSODs or strange behavior, like Chrome just displaying a white screen or some Windows applications hanging, which ended with me downloading a bunch of overclocking-related tools to verify what settings the RAM was running at and to stress test stability. As expected, the BSODs were related to RAM issues:

[Screenshot: testing_dasharo-1-1-3_ram-xmp1-tm5-fail]

There was a previous run where I opened it, counted a single error 3 seconds in, and Windows BSODed immediately. I also ran TM5 overnight with the XMP2 profile and there were no errors reported nor any abnormal behavior. So what is wrong with XMP1? Well, that is what the other tools are for.

JEDEC: [Screenshots: testing_dasharo-1-1-3_ram-jedec, testing_dasharo-1-1-3_ram-jedec-memtweakit]

XMP2 (3000 MHz): [Screenshots: testing_dasharo-1-1-3_ram-xmp2, testing_dasharo-1-1-3_ram-xmp2-memtweakit]

XMP1 (3200 MHz): [Screenshots: testing_dasharo-1-1-3_ram-xmp1, testing_dasharo-1-1-3_ram-xmp1-memtweakit]

While I haven't yet tested what the MSI Firmware defaults to when loading XMP profiles, these are the things I noticed for this comparison:
• DRAM Voltage is reported as 1.2V. There are very few utilities that can actually read this value, but both MSI Dragon Power and HWinfo64 agree on it (I think Intel XTU should be able to do so, but the Memory tab doesn't appear, most likely due to Real Time Memory Timings not being enabled). Both XMP profiles ask for 1.35V, so it is likely that 3200 MHz is not stable because 1.2V (which is standard for DDR4 JEDEC) is not enough.
• Gear 2 is active for both XMP Profiles, causing the Memory Controller Frequency to drop by half and making the performance gains over JEDEC from the higher memory clock speed rather questionable.
• Command Rate when using the XMP profiles is 1T, whereas it is most likely intended to be 2T. I don't recall seeing anything above 2666 MHz or so at 1T. It doesn't increase performance by any significant margin but makes RAM stability much harder to achieve, again adding to the 3200 MHz issues.
• The XMP2 profile seems to use a 133 MHz reference clock and rounds down to 2933 MHz instead of using a 100 MHz clock for a rounded 3000 MHz. Ironically, the 133 MHz base is supposed to be considered more stable, but in this particular case it is rounding down (for 3200 MT/s, 133 x 12 is better than 100 x 16, but for 3000 MT/s it is either 100 x 15 or rounding down to 133 x 11 for 2933 MT/s).
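The reference clock arithmetic above can be checked with a short sketch. It assumes the data rate is twice the memory clock (reference clock times an integer ratio), that the "133 MHz" clock is nominally 400/3 MHz, and that Gear N simply divides the memory clock by N for the Memory Controller. The function names are made up for illustration and are not taken from any firmware.

```python
from fractions import Fraction

# Reference clocks in MHz; the "133" clock is assumed to be exactly 400/3 MHz.
REF_CLOCKS = {"100": Fraction(100), "133": Fraction(400, 3)}


def data_rate(ref_clock_mhz, ratio):
    """DDR data rate in MT/s: two transfers per memory clock cycle."""
    return 2 * ref_clock_mhz * ratio


def closest_ratio(ref_clock_mhz, target_mts):
    """Largest integer ratio whose data rate does not exceed the target."""
    return int(Fraction(target_mts) / (2 * ref_clock_mhz))


def controller_clock(data_rate_mts, gear):
    """Memory Controller clock in MHz, assuming Gear N divides the memory clock by N."""
    return Fraction(data_rate_mts) / 2 / gear


for target in (3200, 3000):
    for name, ref in REF_CLOCKS.items():
        ratio = closest_ratio(ref, target)
        rate = data_rate(ref, ratio)
        print(f"target {target}: {name} MHz x {ratio} -> {float(rate):.0f} MT/s")
```

Under these assumptions, 3200 MT/s is reachable exactly from either reference clock, while 3000 MT/s is exact only with the 100 MHz clock; with the 133 MHz clock the nearest lower ratio is 11, which gives roughly 2933 MT/s.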

So, with XMP on Dasharo lacking any manual fine tuning and being rather risky to use, I wouldn't recommend messing with it yet. It is good to finally be able to test it, but it requires A LOT of work in terms of recovery: for example, some OC Watchdog that resets the memory configuration to JEDEC if the system fails to POST, or even maintaining multiple training caches so that it can reset to JEDEC in seconds instead of retraining. The good thing is that FlashBIOS works, because otherwise you would need external reprogramming.
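To make the watchdog idea concrete, here is a rough boot-counter sketch. It is not how Dasharo or the MSI firmware behaves; `NvState`, `MAX_FAILED_BOOTS`, and both functions are hypothetical names, and real firmware would keep this state in non-volatile storage rather than in a Python object.

```python
from dataclasses import dataclass

JEDEC = "JEDEC"
MAX_FAILED_BOOTS = 2  # hypothetical threshold before falling back


@dataclass
class NvState:
    """Stand-in for settings kept in non-volatile storage (hypothetical)."""
    profile: str = JEDEC
    failed_boots: int = 0


def select_profile_at_power_on(state):
    """Pick the profile to train with, falling back to JEDEC after repeated failures."""
    if state.profile != JEDEC and state.failed_boots >= MAX_FAILED_BOOTS:
        state.profile = JEDEC
        state.failed_boots = 0
        return state.profile
    # Count this attempt as failed until POST completes and clears the counter.
    state.failed_boots += 1
    return state.profile


def mark_post_complete(state):
    """Called once memory training and POST have succeeded."""
    state.failed_boots = 0


# Simulate a profile that never trains: the third power-on falls back to JEDEC.
state = NvState(profile="XMP3")
attempts = [select_profile_at_power_on(state) for _ in range(3)]
print(attempts)
```

A counter of failed attempts is used here instead of a timer because, as described above, successful training can itself take several minutes, so a short timeout would misfire.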

SergiiDmytruk commented 11 months ago

Maybe RAM voltage needs to be set separately from profile via VddVoltage, that would explain why it's not applied.

Doing this automatically, however, implies ability to parse memory profiles and there is no public specification of XMP 2.0 or XMP 3.0. Looks like https://github.com/integralfx/DDR4XMPEditor does really parse/generate XMP 2.0 (based on this thread), but nothing similar for XMP 3.0 (don't know how hard it would be to reverse-engineer the format based on SPD dumps).

zirblazer commented 11 months ago

> Maybe RAM voltage needs to be set separately from profile via VddVoltage, that would explain why it's not applied.

The XMP Profiles do include DRAM Voltage, so it should be possible to apply it automatically. However, manual voltage control is good because it shifts the responsibility of knowing what is happening onto the user. Note that the 1435 mV maximum voltage limit on that link seems very low; see below...

Also, one thing to keep in mind is that increasing DRAM Voltage to match the XMP profile may not be enough in certain cases. For Alder Lake and Raptor Lake, Intel guarantees DDR4 3200 support across any possible configuration (including 2 DIMMs Per Channel, 2 Ranks per DIMM). This means that my XMP 3200 modules should very likely work as intended by simply increasing DRAM Voltage to 1.35V as per the profile, since the Processor integrated Memory Controller is running completely within spec.

However, if you are using XMP profiles higher than 3200, it is very possible that you also have to feed higher voltage to the Processor Memory Controller. This means that, for example, these modules:
• GSkill Ripjaws V - 16GB (2 x 8 GiB) - DDR4-5333 - CL22-32-32-52 - 1.60V - F4-5333C22D-16GVK
• GSkill Ripjaws V - 16GB (2 x 8 GiB) - DDR4-4266 - CL16-19-19-39 - 1.45V - F4-4266C16D-16GVK
• Kingston Fury Renegade - 16GB (2 x 8 GiB) - DDR4-5333 - CL20-30-30 - 1.60V - KF453C20RBK2/16 (also DDR4-4000 - CL19-23-23 - 1.35V)

...would very likely be completely unusable in XMP mode, because the Processor Memory Controller would not be stable at the required speeds without overvolting. (Also note that these modules require 1.45V and 1.6V DRAM Voltage! Those are the manufacturers' own specifications, so you are not technically overvolting the DIMMs. Pretty much all mainstream 3200 MHz XMP modules are 1.35V; the modules above are edge cases.) Overvolting the Memory Controller is of course a bad idea to do automatically because it is warranty voiding (some Motherboard vendors used to do that, and they got criticized for it), but without some kind of manual control you would never get such modules to be stable at their rated speeds. Gear 2 mode (or Gear 4 for DDR5) is supposed to help here, as it is intended to halve the Memory Controller clock speed when you're using modules faster than 3200 MHz, so as not to overclock it. (This is what Motherboard vendors seem to do currently: default to Gear 2 if the XMP profile wants higher than officially supported clock speeds. This literally kills performance since it is barely better than running Gear 1, but it is less warranty voiding and makes unaware users happy that they're running their RAM at rated speeds...)

> Doing this automatically, however, implies ability to parse memory profiles and there is no public specification of XMP 2.0 or XMP 3.0. Looks like https://github.com/integralfx/DDR4XMPEditor does really parse/generate XMP 2.0 (based on this thread), but nothing similar for XMP 3.0 (don't know how hard it would be to reverse-engineer the format based on SPD dumps).

Several Windows applications are capable of parsing DDR4/DDR5 XMP profiles. Check here for two DDR5 examples, CPU-Z and Thaiphoon Burner. HWinfo64 should also be capable. Also, there is the AMD EXPO specification, which coexists on the DDR5 DIMM SPD along with XMP profiles. I have no idea how Coreboot/Intel FSP currently parses it. For example, in the Kingston KF552C36BBE-8 datasheet you can see it has two EXPO profiles, two XMP profiles, and the standard JEDEC profile, all in one. EXPO is supposed to be license and royalty free but I can't find the specification...

Also, precisely because I don't know what the XMP 3.0 specification for DDR5 says, I don't even know whether the XMP profiles have Voltages beyond DRAM Voltage. For example, if you check the four slides near the bottom of this DDR5 review, I suspect that XMP profiles have VDD, VDDQ and VPP, and thus not only the classic DRAM Voltage. Oh, also, besides the three static XMP profiles, there are two user profiles that you're supposed to be able to program into the SPD with specialized tools.

Also, from the DDR4XMPEditor tool you linked...
https://github.com/integralfx/DDR4XMPEditor/issues/23
https://gist.github.com/integralfx/bfaa68a39b63ad44184f426bb6bfc9e4
You will find the second link interesting.

miczyg1 commented 11 months ago

From what I see in the FSP code, the DRAM voltages (VDD, VDDQ, VPP) are taken from the XMP profile, so no idea why it ends up undervolted for given profile. Also FSP has checks whether DIMM supports XMP and if it should ignore the values stored on XMP SPD offsets.

zirblazer commented 11 months ago

> From what I see in the FSP code, the DRAM voltages (VDD, VDDQ, VPP) are taken from the XMP profile, so no idea why it ends up undervolted for given profile. Also FSP has checks whether DIMM supports XMP and if it should ignore the values stored on XMP SPD offsets.

Note that VDD, VDDQ and VPP are for DDR5; DDR4 should only have VDD in its XMP profile. It could be the case that it actually works for DDR5, which would make this issue DDR4 only, but I can't test that for obvious reasons. Another possibility is that the DRAM Voltage read from Software is wrong (I don't know where it sources it from), which could potentially be reproduced in the MSI BIOS if manually inputting the same Timings and Voltage also causes a flood of memory errors. Multimeter testing... urgh, better not.

miczyg1 commented 10 months ago

I suspect MSI Dragon Power reads the values from the Super I/O or something like that. From debugging the FSP I can see that VDD is picked up from the SPD profile. However, VDDQ remains unchanged for DDR4 (DDR5 picks it up from SPD too).

VDDQ is the voltage applied directly to DRAM modules; per the FSP header it is "DRAM voltage (Vddq) (supply voltage for DQ/DQS of the DRAM chips) in millivolts". No idea why FSP does not raise it automatically, even though VDD is raised to 1.35V. VDD is, according to the FSP headers, "DRAM voltage (Vdd) (supply voltage for input buffers and core logic of the DRAM chips)".

I guess I could add an override for it, e.g. if XMP1 or XMP2 is selected, then set the VDDQ to VDD for DDR4. But that will require XMP profile parsing.
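A minimal sketch of the override described above, written as plain Python rather than firmware code. `MemConfig` and `apply_ddr4_vddq_override` are hypothetical names, not FSP or coreboot identifiers, and the sketch assumes VDD has already been read from the selected profile; whether the firmware can actually program the resulting value is a separate question.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class MemConfig:
    """Hypothetical view of the memory settings relevant to the override."""
    ddr_type: str   # "DDR4" or "DDR5"
    profile: str    # "JEDEC", "XMP1", "XMP2" or "XMP3"
    vdd_mv: int     # VDD taken from the selected profile, in millivolts
    vddq_mv: int    # VDDQ currently requested, in millivolts


def apply_ddr4_vddq_override(cfg):
    """For DDR4 with XMP1/XMP2 selected, request VDDQ equal to the profile's VDD."""
    if cfg.ddr_type == "DDR4" and cfg.profile in ("XMP1", "XMP2"):
        return replace(cfg, vddq_mv=cfg.vdd_mv)
    # JEDEC, XMP3 and DDR5 configurations are left untouched.
    return cfg


before = MemConfig(ddr_type="DDR4", profile="XMP1", vdd_mv=1350, vddq_mv=1200)
print(apply_ddr4_vddq_override(before))
```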

Regarding DDR5, I will update on the XMP later.

miczyg1 commented 10 months ago

The DIMM voltage in DMI is not the voltage of the physical I/O pins. I just learned it by studying FSP code.

It is simply the VddVoltage value not VddqVoltage value (per FSP headers).

So basically what FSP does is set the voltage for input buffers and core logic and expose it as a DIMM VDD voltage in DMI/SMBIOS. Kinda misleading...

VDDQ for DDR4 is impossible to change in the public FSP according to the source code I have access to... DDR5 is allowed to change VDDQ by setting the parameter in FSP. VDDQ, or rather the "real" DRAM voltage, can be measured by the MSI utilities or the nct6687d Linux module. It shows the value measured by the Super I/O directly on the DIMMs.
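On Linux, hwmon drivers expose their voltage readings under /sys/class/hwmon as `inN_input` files (in millivolts), with optional `inN_label` files. The sketch below lists every voltage channel of one chip. The chip name `nct6687` is an assumption about how the nct6687d module registers itself, and which channel corresponds to the DRAM voltage depends on the driver's labels, so check `/sys/class/hwmon/*/name` and the labels on the actual system.

```python
from pathlib import Path


def read_hwmon_voltages(chip_name, root="/sys/class/hwmon"):
    """Return {label: volts} for every inN_input of the matching hwmon chip."""
    readings = {}
    for hwmon in sorted(Path(root).glob("hwmon*")):
        name_file = hwmon / "name"
        if not name_file.is_file() or name_file.read_text().strip() != chip_name:
            continue
        for input_file in sorted(hwmon.glob("in*_input")):
            channel = input_file.name[: -len("_input")]
            label_file = hwmon / f"{channel}_label"
            # Fall back to the channel name when the driver provides no label.
            label = label_file.read_text().strip() if label_file.is_file() else channel
            # hwmon reports voltages in millivolts.
            readings[label] = int(input_file.read_text().strip()) / 1000.0
    return readings


if __name__ == "__main__":
    # "nct6687" is an assumed chip name; adjust it to what the system reports.
    for label, volts in read_hwmon_voltages("nct6687").items():
        print(f"{label}: {volts:.3f} V")
```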

miczyg1 commented 10 months ago

One cannot set VDDQ to anything other than 1.2V on the DDR4 path. In FSP before Alder Lake it was possible regardless of DDR type. So basically DDR4 XMP is half broken, and without fixing it on the FSP side the voltage will not be set properly.