xenserver / status-report

Program that gathers data for xenserver host diagnostics
GNU Lesser General Public License v2.1

CP-49659: Test `modinfo` returning non-ASCII characters and fix the triggered error #112

Closed bernhardkaindl closed 3 months ago

bernhardkaindl commented 3 months ago

Fix CP-49659:

XS8: If these kernel modules are loaded, the bugtool file modinfo.out is broken:

bq2415x_charger bq2415x charger driver
cb710           ENE CB710 memory card reader driver
cb710-mmc       ENE CB710 memory card reader driver
cdc_mbim        USB CDC MBIM host driver              (for MBIM Mobile broadband modems)
dell-smm-hwmon  Dell laptop SMM BIOS hwmon driver     (for Dell Laptops)
gl520sm         GL520SM driver                        (I2C temperature/voltage Sensor)
i2c-via         i2c for Via vt82c586b southbridge     (Via Chipsets)
nf_dup_ipv4     nf_dup_ipv4: Duplicate IPv4 packet
nf_dup_ipv6     nf_dup_ipv6: IPv6 packet duplication
nft_socket      nf_tables socket match module
nft_tproxy      nf_tables tproxy support module
qmi_wwan        Qualcomm MSM Interface (QMI) WWAN driver
ums-isd200      Driver for In-System Design, Inc. ISD200 ASIC
via686a         VIA 686A Sensor device               (I2C temperature/voltage Sensor)
xt_TEE          Xtables: Reroute packet copy

That means it should only happen if the server uses one of the two I2C sensors or the cb710, or the customer sets up very specific network filtering rules manually in Dom0.

In these rare cases, the bugtool file modinfo.out contains this traceback instead of the module information:

Traceback (most recent call last):
  File "./xen-bugtool", line 733, in collect_data
    s = no_unicode(v["func"](cap))
  File "./xen-bugtool", line 1640, in module_info
    return output.getvalue().decode()
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 48009: ordinal not in range(128)
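
The failure mode in the traceback can be illustrated with a minimal sketch. It is not the code of this PR: the sample bytes and both helper names are hypothetical, and the tolerant variant assumes the `modinfo` output is UTF-8. Under Python 2, a bare `.decode()` uses the ASCII codec, which is spelled out explicitly here so the sketch behaves the same under Python 3.

```python
# Hypothetical sample: one modinfo-style line with a UTF-8 encoded "é" (bytes 0xc3 0xa9).
raw = b"description:    Example driver by Andr\xc3\xa9\n"


def decode_strict(data):
    # Same effect as a bare .decode() under Python 2: ASCII only.
    return data.decode("ascii")


def decode_tolerant(data):
    # Assumption: the output is UTF-8. Invalid bytes become U+FFFD instead of raising.
    return data.decode("utf-8", errors="replace")


try:
    decode_strict(raw)
    strict_failed = False
except UnicodeDecodeError as exc:
    strict_failed = True
    print(exc)  # 'ascii' codec can't decode byte 0xc3 ...

text = decode_tolerant(raw)
```

With `errors="replace"`, a single unexpected byte can no longer turn the whole of modinfo.out into a traceback; whether replacing or escaping such bytes is preferable depends on how the file is consumed afterwards.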

Reproduced by loading one of the kernel modules using:

modprobe cb710
xen-bugtool -y --entries=kernel-info

On an affected host, this command gathers the data needed to identify the triggering module:

modinfo `cut -d' ' -f1 /proc/modules` >modinfo.out

Script to produce or confirm the list of kernel modules that trigger this issue on a given kernel:

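# Collect modinfo output for every installed module (glob quoted so the shell does not expand it)
find /usr/lib -name '*.ko' | sed -n 's|.*/||;s/\.ko$//p' | xargs modinfo >modinfo.out
# Print name and description of each module whose modinfo output contains non-ASCII characters
for i in $(grep -P '([^\x00-\x7F]|filename)' modinfo.out | grep -B1 -P '[^\x00-\x7F]' | sed -n 's|.*/||;s/\.ko$//p' | sort); do printf '%-16.16s' "$i"; modinfo -d "$i"; done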
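
The same scan can be sketched in Python, which may be handier inside a test. This is an illustration only: the function name and the sample data are hypothetical, and it assumes each module's block in the `modinfo` output starts with a `filename:` line, as the shell pipeline above also does.

```python
import re


def modules_with_non_ascii(modinfo_output):
    """Return sorted names of modules whose modinfo block has non-ASCII bytes.

    modinfo_output: raw bytes of `modinfo` output covering several modules.
    """
    found = set()
    current = None
    for line in modinfo_output.splitlines():
        if line.startswith(b"filename:"):
            # A new module block starts: derive the module name from the path.
            path = line.split(b":", 1)[1].strip()
            name = path.rsplit(b"/", 1)[-1]
            name = re.sub(br"\.ko(\.\w+)?$", b"", name)
            current = name.decode("ascii", "replace")
        elif current is not None and re.search(br"[^\x00-\x7f]", line):
            found.add(current)
    return sorted(found)


# Hypothetical sample data, not taken from a real host.
sample = (
    b"filename:       /lib/modules/hypothetical/plain.ko\n"
    b"description:    Plain ASCII driver\n"
    b"filename:       /lib/modules/hypothetical/accented.ko\n"
    b"description:    Driver with caf\xc3\xa9 in its description\n"
)
print(modules_with_non_ascii(sample))
```

Working on bytes rather than decoded text avoids the very decoding error being investigated.
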
coveralls commented 3 months ago

coverage: 93.042% (+0.007%) from 93.035% when pulling fafb78446a800a771f5af871e54b39ff083c6e40 on xenserver-next:CP-49659-fix-modinfo-unicode-error into c8aae6e678cb5384343eca7392d8124542c40639 on xenserver:master.