MEN-Mikro-Elektronik / 13MD05-90

MDIS5 System Package for Linux (including drivers)

SMBus multiplied and losing addresses #235

Open duagon-rvw opened 2 years ago

duagon-rvw commented 2 years ago

In newer Linux kernels (first noticed on Ubuntu 18.04 with kernel 4.15), one of the two AMD SMBus controllers is split into four adapters. Kernel 4.4 exposes just two, see below. This causes problems with the MDIS system scan script as well as with the Linux and MDIS tools.

The system.dsc has to be configured manually because the scan always picks the wrong bus. If i2c-3 is accessed, there is a chance that every address on the four buses at 0b00 will be removed. When the MDIS drivers are loaded, "oss_smb2_detect" looks up all addresses on all buses, receives timeouts on i2c-3 and in the end removes all of the clients again. Sporadically we find "i2c i2c-0: Failed! (01)" in dmesg, and afterwards the clients do not recover until a reboot. There is a similar issue with the BL72 (MAIN_PR007753). The difference is that on the BL72 the BMC disappears only for a short time (several seconds). On the SC24 platforms it does not come back until a reboot.

The first output in i2c-0.txt was taken on kernel 5.4 right before the drivers were loaded, and the second right after they were loaded and "sudo xm01bc_ctrl -f xm01bc_1" was executed.

These issues are reproducible on Ubuntu and Debian.

Ubuntu Kernel 4.4.0-31-generic:

dua@dua-Persimmon:~$ sudo i2cdetect -l
i2c-0   i2c             Radeon i2c bit bus 0x90                 I2C adapter
i2c-1   i2c             Radeon i2c bit bus 0x91                 I2C adapter
i2c-2   i2c             Radeon i2c bit bus 0x92                 I2C adapter
i2c-3   i2c             Radeon i2c bit bus 0x93                 I2C adapter
i2c-4   i2c             Radeon i2c bit bus 0x94                 I2C adapter
i2c-5   i2c             Radeon i2c bit bus 0x95                 I2C adapter
i2c-6   i2c             Radeon i2c bit bus 0x96                 I2C adapter
i2c-7   i2c             Radeon i2c bit bus 0x97                 I2C adapter
i2c-8   i2c             card0-DP-1                              I2C adapter
i2c-9   i2c             card0-DP-2                              I2C adapter
i2c-10  smbus           SMBus PIIX4 adapter at 0b00             SMBus adapter
i2c-11  smbus           SMBus PIIX4 adapter at 0b20             SMBus adapter

Ubuntu Kernel 4.15.0-87-generic:

dua@dua:~$ sudo i2cdetect -l
i2c-3    smbus       SMBus PIIX4 adapter port 4 at 0b00      SMBus adapter
i2c-10  i2c             Radeon i2c bit bus 0x95                         I2C adapter
i2c-1    smbus       SMBus PIIX4 adapter port 2 at 0b00      SMBus adapter
i2c-8    i2c             Radeon i2c bit bus 0x93                         I2C adapter
i2c-6    i2c             Radeon i2c bit bus 0x91                         I2C adapter
i2c-13  i2c             card0-DP-1                                             I2C adapter
i2c-4    smbus       SMBus PIIX4 adapter port 1 at 0b20      SMBus adapter
i2c-11  i2c             Radeon i2c bit bus 0x96                         I2C adapter
i2c-2    smbus       SMBus PIIX4 adapter port 3 at 0b00      SMBus adapter
i2c-0    smbus       SMBus PIIX4 adapter port 0 at 0b00      SMBus adapter
i2c-9    i2c             Radeon i2c bit bus 0x94                         I2C adapter
i2c-7    i2c             Radeon i2c bit bus 0x92                         I2C adapter
i2c-14  i2c             card0-DP-2                                             I2C adapter
i2c-5    i2c             Radeon i2c bit bus 0x90                         I2C adapter
i2c-12  i2c             Radeon i2c bit bus 0x97                         I2C adapter
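To make the difference between the two listings above easier to see, here is a minimal Python sketch that groups the PIIX4 adapters by base offset and port. The function name group_piix4_adapters and the regular expression are illustrative assumptions based only on the adapter names shown in this issue; they are not part of MDIS or i2c-tools.

```python
import re
from collections import defaultdict

# Matches adapter names as they appear in the listings of this issue:
#   "SMBus PIIX4 adapter at 0b00"         (kernel 4.4)
#   "SMBus PIIX4 adapter port 4 at 0b00"  (kernel 4.15)
PIIX4_RE = re.compile(
    r"^i2c-(?P<bus>\d+)\s+smbus\s+SMBus PIIX4 adapter"
    r"(?: port (?P<port>\d+))? at (?P<base>[0-9a-fA-F]+)\b"
)


def group_piix4_adapters(listing):
    """Group PIIX4 adapters from `i2cdetect -l` text by base offset.

    Returns {base: [(port, bus), ...]}, where port is None when the
    adapter name carries no port number. Entries are sorted by port.
    """
    groups = defaultdict(list)
    for line in listing.splitlines():
        match = PIIX4_RE.match(line.strip())
        if match is None:
            continue  # Radeon, DP and any other adapters are skipped
        port = match.group("port")
        groups[match.group("base").lower()].append(
            (int(port) if port is not None else None, int(match.group("bus")))
        )
    return {
        base: sorted(entries, key=lambda e: (e[0] is None, e[0] or 0))
        for base, entries in groups.items()
    }
```

A base offset that maps to more than one entry is one of the split controllers, so a scan script could use that as a hint that it must not simply take the first adapter it finds.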

5.4.0-87-genericdmesg.txt i2c-0.txt Ubuntu1604dmesg.txt

M-Gerner commented 2 years ago

In the file 5.4.0-53-generic-i2cdetect.txt you can see how the SMBus addresses are distributed across the i2c buses. The SMBus at offset 0b00 is split into four, and the only important one of these is i2c-0. The rest are either empty or, in the case of i2c-3, empty all the others with the same offset. The scansystem script just picks the first one it finds in the list, which is most of the time i2c-3. It would be better to start with i2c-0 and check whether addresses "4d" and "50" are available. i2c-2 also shows a "4d", but it will not react to any command when it is also at i2c-0.

4.4.0-210-generic-i2cdetect.txt 5.4.0-53-generic-i2cdetect.txt

For the BL72E it is simpler (see i2c_BL72+10k_Resistor.txt). One of the split buses is completely empty (i2c-1), so in this example i2c-0 would be the correct one at 0b00. Here it is only necessary to use the non-empty one.

i2c_BL72+10k_Resistors.txt
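The selection rule proposed above (start with i2c-0 and accept a bus only if "4d" and "50" both answer) could be sketched in Python roughly as follows. The names pick_smbus, probe and REQUIRED_ADDRESSES are hypothetical, and how a bus is actually probed is deliberately left to the caller. The sketch walks the candidates in ascending order and stops at the first match, so that i2c-3 is not touched when i2c-0 already qualifies.

```python
# Addresses "4d" and "50" are taken from the comment above.
REQUIRED_ADDRESSES = frozenset({0x4D, 0x50})


def pick_smbus(candidates, probe, required=REQUIRED_ADDRESSES):
    """Return the lowest-numbered bus on which all required addresses answer.

    candidates: bus numbers that share one base offset (for example 0b00)
    probe:      caller-supplied function, bus number -> iterable of the
                7-bit addresses that responded on that bus
    Returns None if no candidate qualifies. Buses after the first match
    are never probed.
    """
    for bus in sorted(candidates):
        if required <= set(probe(bus)):
            return bus
    return None
```

For the BL72E case, where the rule is only "use the non-empty bus", a different check would be needed; this sketch covers only the address-based rule.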

duagon-rvw commented 2 years ago

Helpful Information: https://www.kernel.org/doc/html/latest/i2c/busses/i2c-piix4.html

The AMD SB700, SB800, SP5100 and Hudson-2 chipsets implement two PIIX4-compatible SMBus controllers. If your BIOS initializes the secondary controller, it will be detected by this driver as an “Auxiliary SMBus Host Controller”.

Driver file in Linux kernel version 4.4: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/i2c/busses/i2c-piix4.c?h=v4.4

Driver file in Linux kernel version 4.15: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/i2c/busses/i2c-piix4.c?h=v4.15

Git log between 4.4 and 4.15 (~ commit at 2015-06-16): https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/log/drivers/i2c/busses/i2c-piix4.c?h=v4.15

Using git bisect on this file will take ~3 steps to pinpoint the commit.

 orodruin > ~/s/linux > ➦ afd2ff9b7e1b  >>> git bisect start -- drivers/i2c/busses/i2c-piix4.c
Updating files: 100% (88154/88154), done.
Previous HEAD position was afd2ff9b7e1b Linux 4.4
Switched to branch 'master'
Your branch is up to date with 'origin/master'.
 orodruin > ~/s/linux >  master  >>> git bisect good v4.4
 orodruin > ~/s/linux >  master >>> git bisect bad v4.15
Bisecting: 8 revisions left to test after this (roughly 3 steps)
[62194e869a56bf9d6fc10b6bdf8f11b1c4386249] i2c: piix4: Always use the same type for port
 orodruin > ~/s/linux > ➦ 62194e869a56  >>>

Looking at the code, I think this is what causes our changed behaviour: https://lore.kernel.org/all/1447960429-19256-1-git-send-email-fetzer.ch@gmail.com/#t

with the corresponding pull request into v4.5: https://lore.kernel.org/all/20160114191829.GA1990@katana/

So it should be enough to quickly check whether 4.5 already behaves like this, and then adapt MDIS and scan_system.sh to handle it properly.
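If the 4.5 assumption holds, a scan helper could branch on the running kernel release. The following Python sketch only compares the major and minor numbers of a release string such as "4.15.0-87-generic". The function name kernel_at_least is hypothetical, and the default threshold of 4.5 is the unverified guess from this comment, not a confirmed boundary. Distribution kernels may also backport driver changes, so checking the adapter names for a port number is likely the more robust test.

```python
import re


def kernel_at_least(release, minimum=(4, 5)):
    """Return True if a release string like '4.15.0-87-generic' is >= minimum.

    Only the leading 'major.minor' part is compared. The default of
    (4, 5) is an assumption taken from this thread and still needs
    to be verified on a real 4.5 kernel.
    """
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    return (int(match.group(1)), int(match.group(2))) >= tuple(minimum)
```

The release string itself can be obtained with platform.release() from the Python standard library or with "uname -r".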