frankcrawford / it87

Gigabyte A320M-S2H V2 board test with IT8686E rev.2 chip #6

Closed SimonLitt closed 1 year ago

SimonLitt commented 1 year ago

Hello! Thanks for your work! Unfortunately, the number of differences between your driver and the driver in the hwmon-next repository is so large that it will be a long time before your code is included in the kernel. However, I am following this process with great interest. I have a Gigabyte motherboard with an IT8686E chip:

# dmesg | grep Gig
[ 0.000000] DMI: Gigabyte Technology Co., Ltd. A320M-S2H V2/A320M-S2H V2-CF, BIOS F55a 07/29/2022

If you need help testing the driver on my motherboard, you can contact me. In the meantime, I'm attaching a formatting diff to this post that will reduce the diff just a little bit.

Attachment: 0001-it87-format.zip

frankcrawford commented 1 year ago

@SimonLitt thanks for your comment and your patch. I'm not sure I'll apply the patch as is, because some of the formatting in hwmon-next version is inconsistent and I want to gradually clean that up too, but I can use that patch to highlight some of those issues and clean it up, so it is just that little different.

As you say, there are lots of differences, and it will take a while, but I'll keep chugging away, and hopefully much of it will get covered this year (and yes, I know it is only January!). So far I've been learning the submission process, but now that I understand it better, I can hopefully churn through the patches quicker.

As for help, thanks for the offer, and I may take you up on it. Out of interest, does your board have the ACPI conflict issue, and if so, do you want me to exclude it automatically in my git version?

Any other issues you see, feel free to let me know.

Thanks Frank

SimonLitt commented 1 year ago

Hi, @frankcrawford! My board does have the conflict issue:

[ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009ffff] usable

[ 8.719689] it87: it87 driver version
[ 8.719770] it87: Found IT8686E chip at 0xa40, revision 2
[ 8.719825] it87: Beeping is supported
[ 8.719844] ACPI Warning: SystemIO range 0x0000000000000A45-0x0000000000000A46 conflicts with OpRegion 0x0000000000000A45-0x0000000000000A46 (\GSA1.SIO1) (20220331/utaddress-204)

The full dmesg log is in the attachment dmesg.zip.

> do you want me to exclude it automatically in my git version

It doesn't bother me much. I don't know how to solve this problem; I have never studied driver development. I also don't quite understand the logic of choosing indents for memory allocation in this module, and I do not know how much work this would require. So do whatever is more convenient for you.

I use a kernel that has been locally patched to use your module.

frankcrawford commented 1 year ago

@SimonLitt unfortunately, fixing the ACPI warning is not simple. In reality it is probably caused by a bug in the Gigabyte BIOS: a couple of boards actually do use that region, but most don't, yet the BIOS still leaves it blocked on those boards too.

Currently the simplest "fix" is to add ignore_resource_conflict=1 as a module option if you load it87 as a module, or it87.ignore_resource_conflict=1 on the kernel command line if you build it into the kernel.

A fix that means you don't need to add any command line options is to add your board into the DMI table named it87_dmi_table. You will need to include the option &it87_acpi_ignore as well, so just duplicate some other single chip board, like the 'Z490 AORUS ELITE AC'.

In fact, if you can confirm that using the ignore_resource_conflict option works, I can look at adding it into the current git repo.

One other thing you should try, although I suspect it will fail, is:

sudo modprobe gigabyte-wmi force_load=1

This will show whether it is one of the few motherboards that actually support WMI via ACPI. I suspect you will get a message that gigabyte-wmi did not find a matching sensor. If it does find one, we should get it added to that module as well.

Regards Frank

SimonLitt commented 1 year ago

Hi, @frankcrawford! The gigabyte-wmi module was not enabled in my kernel config. I rebuilt the kernel and loaded the module successfully, but after reloading I couldn't load it. It took me a long time to work out that the module, although compiled separately, is loaded automatically and needs to be unloaded first:

modprobe -r it87
modprobe -r gigabyte-wmi
modprobe gigabyte-wmi force_load=1
modprobe it87
modprobe -r it87
modprobe it87 ignore_resource_conflict=1
[ 3906.815654] gigabyte-wmi DEADBEEF-2001-0000-00A0-C90629100000: Forcing load on unknown platform
[ 4103.516489] it87: it87 driver version <not provided>
[ 4103.516584] it87: Found IT8686E chip at 0xa40, revision 2
[ 4103.516640] it87: Beeping is supported
[ 4103.516664] ACPI Warning: SystemIO range 0x0000000000000A45-0x0000000000000A46 conflicts with OpRegion 0x0000000000000A45-0x0000000000000A46 (\GSA1.SIO1) (20220331/utaddress-204)
[ 4103.516671] ACPI: OSL: Resource conflict; ACPI support missing from driver?
[ 4103.516672] ACPI: OSL: Resource conflict: System may be unstable or behave erratically
[ 4603.908688] it87: it87 driver version <not provided>
[ 4603.908772] it87: Found IT8686E chip at 0xa40, revision 2
[ 4603.908827] it87: Beeping is supported
[ 4603.908850] ACPI Warning: SystemIO range 0x0000000000000A45-0x0000000000000A46 conflicts with OpRegion 0x0000000000000A45-0x0000000000000A46 (\GSA1.SIO1) (20220331/utaddress-204)
[ 4603.908856] ACPI: OSL: Resource conflict; ACPI support missing from driver?
[ 4603.908857] ACPI: OSL: Resource conflict: System may be unstable or behave erratically

sensors:

gigabyte_wmi-virtual-0
Adapter: Virtual device
temp1:        +30.0°C
temp2:        +52.0°C
temp3:        +31.0°C
temp4:        +16.0°C
temp5:        +33.0°C
temp6:        +41.0°C

it8686-isa-0a40
Adapter: ISA adapter
Vcore:       732.00 mV (min =  +0.00 V, max =  +3.06 V)
+3.3V:         3.39 V  (min =  +0.00 V, max =  +5.05 V)
+12.0V:       12.31 V  (min =  +0.00 V, max = +18.36 V)
+5.0V:         5.13 V  (min =  +0.00 V, max =  +7.65 V)
VSOC:          1.12 V  (min =  +0.00 V, max =  +3.06 V)
VDDP:        924.00 mV (min =  +0.00 V, max =  +3.06 V)
DRAM:          1.25 V  (min =  +0.00 V, max =  +3.06 V)
3VSB:          3.34 V  (min =  +0.00 V, max =  +6.12 V)
VBAT:          3.31 V
CPU Fan:      798 RPM  (min =   10 RPM)
SYS Fan 1:      0 RPM  (min =    0 RPM)
System:       +30.0°C  (low  = -128.0°C, high = +80.0°C)  sensor = thermistor
Chipset:      +52.0°C  (low  = -128.0°C, high = +80.0°C)  sensor = thermistor
CPU:          +31.0°C  (low  =  +0.0°C, high = +80.0°C)  sensor = AMD AMDSI
PCIe x16:     +16.0°C  (low  =  +0.0°C, high = +127.0°C)  sensor = thermistor
VRM:          +33.0°C  (low  =  +0.0°C, high = +80.0°C)  sensor = thermistor
VSoC:         +41.0°C  (low  =  +0.0°C, high = +80.0°C)  sensor = thermistor
Intrusion:   ALARM

But I had also forgotten to update your module to the latest version. So I updated to your latest version and tested again, but the conflict warning remains.

modprobe  gigabyte-wmi force_load=1
insmod /lib/modules/6.1.8-gentoo-x86_64/kernel/drivers/hwmon/hwmon-vid.ko.gz
insmod /home/_simon/_prg/it87/it87.ko ignore_resource_conflict=1
[ 7229.516679] gigabyte-wmi DEADBEEF-2001-0000-00A0-C90629100000: Forcing load on unknown platform
[ 7252.820234] it87: it87 driver version .20230129
[ 7252.820308] it87: Found IT8686E chip at 0xa40, revision 2
[ 7252.820365] it87: Beeping is supported
[ 7252.820388] ACPI Warning: SystemIO range 0x0000000000000A45-0x0000000000000A46 conflicts with OpRegion 0x0000000000000A45-0x0000000000000A46 (\GSA1.SIO1) (20220331/utaddress-204)
[ 7252.820395] ACPI: OSL: Resource conflict; ACPI support missing from driver?
[ 7252.820396] ACPI: OSL: Resource conflict: System may be unstable or behave erratically

And then I made the following change to it87_dmi_table:

IT87_DMI_MATCH_GBT("A320M-S2H V2", it87_dmi_cb,
               &it87_acpi_ignore),
        /* IT8686E */

And of course nothing has changed, the conflict remains.

> If it does find one, we should get it added to that module as well.

This module finds the chip. How else can I help?

I also have the following warning:

# dmesg | grep 8686
[ 1.506594] gpio_it87: Unknown Chip found, Chip 8686 Revision 2

I'm going to try tonight to see if that makes any difference.

Regards Simon

frankcrawford commented 1 year ago

@SimonLitt That conflict message will always remain, but it is only a warning. However, if you don't specify ignore_resource_conflict=1, the conflict will cause the it87 module (or the built-in driver) to fail to load.

There is really no way to suppress that message; the option just makes sure it does not affect anything.

However, more interesting is that your motherboard does support the WMI method and hence the ACPI setup is correct. You can see this in the sensor output for gigabyte_wmi-virtual-0. With your permission, I'll let the maintainer of that module know he can add your motherboard as a match.

Personally, I don't believe there is an issue with running both (the WMI method only supports temperatures, not fans, etc.). Technically you do run the risk of a deadlock/conflict, but I've never seen one on any of my systems.

Regards Frank

frankcrawford commented 1 year ago

P.S. the gpio_it87 message is from a totally different module, not related to sensors, but one I will look at to see what it does.

SimonLitt commented 1 year ago

> With your permission, I'll let the maintainer of that module know he can add your motherboard as a match.

Yes, sure. It will be easier for you, you already know how to do it right.

SimonLitt commented 1 year ago

@frankcrawford

> That conflict message will always remain, but it is a warning only, however, if you don't specify ignore_resource_conflict=1 it will cause the it87 module or kernel option to fail.
>
> There is really no way to suppress that message, just to make sure it does not affect anything.

I found this place in the code: it87_device_add just doesn't return an error and prints the "Ignoring..." message, whereas ignore_resource_conflict=1 doesn't print anything. How long, and how, should I test the system with the it87_acpi_ignore parameter?

SimonLitt commented 1 year ago

I tried to get the message "Ignoring expected ACPI resource conflict" in the logs, but I never succeeded, even when I patched the kernel again. But I found something strange. There is a conflict message, but the check does not return an error:

    pr_info("before check!\n");
    err = acpi_check_resource_conflict(&res);
    pr_info("after check(%d)\n", err);
    if (err) {

[ 5160.872860] it87: Beeping is supported
[ 5160.872881] it87: before check!
[ 5160.872884] ACPI Warning: SystemIO range 0x0000000000000A45-0x0000000000000A46 conflicts with OpRegion 0x0000000000000A45-0x0000000000000A46 (\GSA1.SIO1) (20220331/utaddress-204)
[ 5160.872890] ACPI: OSL: Resource conflict; ACPI support missing from driver?
[ 5160.872891] ACPI: OSL: Resource conflict: System may be unstable or behave erratically
[ 5160.872891] it87: after check(0)

Update: I figured it out. It was because acpi_enforce_resources=lax was set on the kernel command line; I added it many years ago and had already forgotten about it. Without it, acpi_check_resource_conflict now returns -16.
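For anyone following along, the behaviour described above (a return value of 0 with acpi_enforce_resources=lax, -16 without it) can be pictured with a small user-space model. This is only a sketch under stated assumptions, not the kernel's code: the enum, the function name and the warning text are invented for illustration. What comes from the thread is that lax mode still prints the warning but returns 0, and that the other case returns -16, which is -EBUSY on Linux.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical names: a user-space model, not the kernel's own types. */
enum enforce_mode { ENFORCE_STRICT, ENFORCE_LAX, ENFORCE_NO };

/*
 * Model of the value a resource-conflict check hands back to a driver.
 * 'clash' says whether the requested I/O range overlaps an ACPI OpRegion.
 */
static int model_check_resource_conflict(enum enforce_mode mode, bool clash)
{
    if (mode == ENFORCE_NO || !clash)
        return 0;            /* checking disabled, or nothing overlaps */

    /* Both strict and lax report the overlap. */
    fprintf(stderr, "warning: I/O range conflicts with an ACPI OpRegion\n");

    if (mode == ENFORCE_STRICT)
        return -EBUSY;       /* -16 on Linux: the driver sees a failure */

    return 0;                /* lax: warn only, the driver carries on */
}
```

Calling model_check_resource_conflict(ENFORCE_LAX, true) gives 0 even though the warning is printed, which is why the "after check(0)" line in the log looked so strange at first.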

SimonLitt commented 1 year ago

> And then I made the following changes at the it87_dmi_table:
>
> IT87_DMI_MATCH_GBT("A320M-S2H V2", it87_dmi_cb,
>              &it87_acpi_ignore),
>       /* IT8686E */

My board failed the match with the DMI_EXACT_MATCH(DMI_BOARD_NAME, name) macro, so dmi_data is NULL. I tried different pieces of the string "DMI: Gigabyte Technology Co., Ltd. A320M-S2H V2/A320M-S2H V2-CF, BIOS F55a 07/29/2022", but nothing worked until I replaced DMI_EXACT_MATCH(DMI_BOARD_NAME, name) with DMI_MATCH(DMI_BOARD_NAME, name). That solved the problem, but it seems to me that this is not the right solution.

P.S. For the vendor, the DMI_EXACT_MATCH(DMI_BOARD_VENDOR, vendor) macro continues to work normally, so my working macro is this:

#define IT87_DMI_MATCH_VND(vendor, name, cb, data) \
    { \
        .callback = cb, \
        .matches = { \
            DMI_EXACT_MATCH(DMI_BOARD_VENDOR, vendor), \
            DMI_MATCH(DMI_BOARD_NAME, name), \
        }, \
        .driver_data = data, \
    }

But still, there may be another correct solution.

Update: using DMI_MATCH, I tried pieces of the string from A320M up to A320M-S2H V2; anything longer fails the match again.

SimonLitt commented 1 year ago

> .matches = { \
>     DMI_EXACT_MATCH(DMI_BOARD_VENDOR, vendor), \
>     DMI_MATCH(DMI_BOARD_NAME, name), \
> }, \

Again, some oddities, but I could not establish the reason for them. I looked at other drivers to see how the DMI_EXACT_MATCH and DMI_MATCH macros are used. For example the gigabyte-wmi driver only uses DMI_EXACT_MATCH macros.

#define DMI_EXACT_MATCH_GIGABYTE_BOARD_NAME(name) \
    { .matches = { \
        DMI_EXACT_MATCH(DMI_BOARD_VENDOR, "Gigabyte Technology Co., Ltd."), \
        DMI_EXACT_MATCH(DMI_BOARD_NAME, name), \
    }}

And after adding the following line:

DMI_EXACT_MATCH_GIGABYTE_BOARD_NAME("A320M-S2H V2"),

This driver correctly matches the board when it calls dmi_check_system(gigabyte_wmi_known_working_platforms). In the it87 driver, with the same code, the board does not pass the check. I expanded the macros using the -save-temps flag and made sure that the board name strings passed in are absolutely identical, as are the matches structures.

gigabyte-wmi:

{ .matches = {
    { .slot = DMI_BOARD_VENDOR, .substr = "Gigabyte Technology Co., Ltd.", .exact_match = 1 },
    { .slot = DMI_BOARD_NAME, .substr = "A320M-S2H V2", .exact_match = 1 },
}},

it87:

{
    .callback = it87_dmi_cb,
    .matches = {
        { .slot = DMI_BOARD_VENDOR, .substr = "Gigabyte Technology Co., Ltd.", .exact_match = 1 },
        { .slot = DMI_BOARD_NAME, .substr = "A320M-S2H V2", .exact_match = 1 },
    },
    .driver_data = &it87_acpi_ignore,
}

it87 with the DMI_MATCH macro:

{
    .callback = it87_dmi_cb,
    .matches = {
        { .slot = DMI_BOARD_VENDOR, .substr = "Gigabyte Technology Co., Ltd.", .exact_match = 1 },
        { .slot = DMI_BOARD_NAME, .substr = "A320M-S2H V2" },
    },
    .driver_data = &it87_acpi_ignore,
}

If dmi_check_system is called not in module_init but in probe (as in gigabyte-wmi), the behavior is exactly the same: the check passes with the DMI_MATCH macro and does not pass with the DMI_EXACT_MATCH macro.

frankcrawford commented 1 year ago

@SimonLitt Try a board name of A320M-S2H V2-CF and see what happens. The first part is probably DMI_SYSTEM_NAME and the second the DMI_BOARD_NAME
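To make the suggestion above concrete, here is a minimal user-space sketch that pulls the part after the slash out of a kernel "DMI:" line. It assumes the line has the layout "<vendor> <product>/<board>, BIOS <version> <date>", which fits the line quoted at the top of this issue; the function name is hypothetical and none of this is part of it87 or gigabyte-wmi.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy the board name out of a kernel "DMI:" line, assuming the layout
 *   "<vendor> <product>/<board>, BIOS <version> <date>".
 * Returns 0 on success, -1 if the line does not have that shape or the
 * output buffer is too small. The function name is made up for this sketch.
 */
static int dmi_line_board_name(const char *line, char *out, size_t outlen)
{
    const char *slash = strchr(line, '/');   /* first '/' ends the product name */
    const char *end;
    size_t len;

    if (!slash)
        return -1;

    end = strstr(slash + 1, ", BIOS ");      /* board name stops before ", BIOS " */
    if (!end)
        return -1;

    len = (size_t)(end - (slash + 1));
    if (len == 0 || len >= outlen)
        return -1;

    memcpy(out, slash + 1, len);
    out[len] = '\0';
    return 0;
}
```

Fed the line from the first post, this yields A320M-S2H V2-CF, the string suggested above. On a running system, reading /sys/class/dmi/id/board_name should give the same field without any parsing.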

SimonLitt commented 1 year ago

> @SimonLitt Try a board name of A320M-S2H V2-CF and see what happens. The first part is probably DMI_SYSTEM_NAME and the second the DMI_BOARD_NAME

@frankcrawford Thank you, this fix helped solve the problem!

Update: As for gigabyte-wmi, more careful testing revealed that the string A320M-S2H V2-CF should be used there too. The test output both through the pr_info function and through the dev_info function for some reason hides subsequent messages.
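In hindsight, the earlier results fit the difference between the two macros: as far as I understand the kernel's DMI matching, an exact match compares the whole field, while DMI_MATCH only looks for the given string somewhere inside it. A small user-space sketch of that difference follows; the struct and function names are illustrative and are not the kernel's.

```c
#include <stdbool.h>
#include <string.h>

/* User-space model of one DMI match entry; the names are illustrative. */
struct match_model {
    const char *substr;   /* string given in the table entry */
    bool exact_match;     /* true models DMI_EXACT_MATCH, false models DMI_MATCH */
};

/* Returns true if 'field' (e.g. the board name) satisfies the entry. */
static bool match_model_matches(const struct match_model *m, const char *field)
{
    if (m->exact_match)
        return strcmp(field, m->substr) == 0;   /* whole string must be equal */
    return strstr(field, m->substr) != NULL;    /* substring is enough */
}
```

With a board name of "A320M-S2H V2-CF", an exact entry for "A320M-S2H V2" fails, a substring entry for "A320M-S2H V2" passes, and an exact entry for "A320M-S2H V2-CF" passes, which is the pattern seen in the tests above.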

SimonLitt commented 1 year ago

> With your permission, I'll let the maintainer of that module know he can add your motherboard as a match.
>
> Yes, sure. It will be easier for you, you already know how to do it right.

@frankcrawford The attachment (wmi-patch_sensors-conf.zip) contains a patch for the gigabyte-wmi driver.

P.S. I also included my configuration for sensors in the attachment. The sensors output with this config:

it8686-isa-0a40
Adapter: ISA adapter
CPU Vcore:   732.00 mV (min =  +0.00 V, max =  +3.06 V)
+3.3V:         3.39 V  (min =  +0.00 V, max =  +5.05 V)
+12.0V:       12.31 V  (min =  +0.00 V, max = +18.36 V)
+5.0V:         5.13 V  (min =  +0.00 V, max =  +7.65 V)
VSOC:          1.12 V  (min =  +0.00 V, max =  +3.06 V)
VDDP:        936.00 mV (min =  +0.00 V, max =  +3.06 V)
DRAM:          1.24 V  (min =  +0.00 V, max =  +3.06 V)
3VSB:          3.34 V  (min =  +0.00 V, max =  +6.12 V)
VBAT:          3.31 V  
CPU Fan:      988 RPM  (min =   10 RPM)
SYS Fan 1:      0 RPM  (min =    0 RPM)
System:       +27.0°C  (low  = -128.0°C, high = +60.0°C)  sensor = thermistor
Chipset:      +49.0°C  (low  = -128.0°C, high = +70.0°C)  sensor = thermistor
CPU:          +37.0°C  (low  =  +0.0°C, high = +80.0°C)  sensor = AMD AMDSI
PCIe x16:     +16.0°C  (low  =  +0.0°C, high = +127.0°C)  sensor = thermistor
VRM MOS:      +34.0°C  (low  =  +0.0°C, high = +90.0°C)  sensor = thermistor
VSoC MOS:     +42.0°C  (low  =  +0.0°C, high = +80.0°C)  sensor = thermistor
Intrusion:   ALARM

gigabyte_wmi-virtual-0
Adapter: Virtual device
System:       +27.0°C  
Chipset:      +49.0°C  
CPU:          +35.0°C  
PCIe x16:     +16.0°C  
VRM MOS:      +34.0°C  
VSoC MOS:     +42.0°C  

I found the initial config for the 8686 chip on the Internet when I started using your driver. I am now convinced that the temperature labels are correct, but I couldn't verify the voltage labels. It's hard for me to identify the CPU VTT and CPU VRIN voltages that are displayed in the BIOS: all these voltages are close to each other and constantly changing, so I left those labels as they are.

Respectfully Simon

frankcrawford commented 1 year ago

@SimonLitt FYI, I've added your board to the DMI list in my latest it87.c driver and pushed it, and I have also submitted a patch to add it to the gigabyte-wmi driver in the Linux kernel.