fwupd / firmware-lenovo

Missing firmware for Lenovo Thinkpad hardware

ThinkPad Thunderbolt 4 Docking Station - Ethernet crashes since fwupd to 1.7.4 #191

Open fbatschi opened 2 years ago

fbatschi commented 2 years ago

Describe the bug

Since updating fwupd to 1.7.4, it is apparently possible to apply firmware updates to the Lenovo ThinkPad Thunderbolt 4 docking station. After applying the docking station firmware update on my X1 Gen9, there is a kernel crash and the system freezes as soon as it tries to activate the Ethernet port of the docking station.

The following can be seen in journalctl

Jan 21 22:28:02 fedora kernel: igc: Failed to read reg 0xc030!
Jan 21 22:28:02 fedora kernel: WARNING: CPU: 7 PID: 2885 at drivers/net/ethernet/intel/igc/igc_main.c:6165 igc_rd32+0x7c/0x80 [igc]
Jan 21 22:28:02 fedora kernel: Modules linked in: uinput rfcomm snd_seq_dummy snd_hrtimer xt_conntrack xt_MASQUERADE nf_conntrack_netlink nft_counter xt_addrtype nft_compat br_netfilter bridge stp llc nls_utf8 cifs cifs_arc4 rdma_cm iw_cm ib_cm ib_core cifs_md4 dns_resolver fscache netfs nft_objref nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 overlay ip_set nf_tables nfnetlink qrtr ns bnep snd_ctl_led snd_soc_skl_hda_dsp snd_soc_intel_hda_dsp_common snd_soc_hdac_hdmi sunrpc snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic snd_soc_dmic iTCO_wdt snd_sof_pci_intel_tgl intel_pmc_bxt pmt_telemetry mei_wdt mei_hdcp iTCO_vendor_support pmt_class snd_sof_intel_hda_common intel_rapl_msr soundwire_intel soundwire_generic_allocation soundwire_cadence snd_sof_intel_hda snd_sof_pci snd_sof_xtensa_dsp snd_sof iwlmvm
Jan 21 22:28:02 fedora kernel:  snd_soc_hdac_hda snd_hda_ext_core snd_soc_acpi_intel_match snd_soc_acpi soundwire_bus mac80211 snd_soc_core snd_compress ac97_bus snd_pcm_dmaengine intel_tcc_cooling x86_pkg_temp_thermal intel_powerclamp snd_hda_intel coretemp snd_intel_dspcfg libarc4 intel_cstate snd_intel_sdw_acpi intel_uncore pcspkr snd_usb_audio snd_hda_codec iwlwifi snd_usbmidi_lib think_lmi snd_hda_core firmware_attributes_class snd_rawmidi vfat i2c_i801 wmi_bmof i2c_smbus uvcvideo snd_hwdep snd_seq fat btusb mei_me snd_seq_device cfg80211 mei videobuf2_vmalloc btrtl snd_pcm squashfs idma64 btbcm videobuf2_memops snd_timer loop videobuf2_v4l2 btintel videobuf2_common bluetooth joydev videodev processor_thermal_device_pci_legacy processor_thermal_device mc ecdh_generic thunderbolt intel_pmt processor_thermal_rfim processor_thermal_mbox processor_thermal_rapl intel_rapl_common intel_soc_dts_iosf igen6_edac thinkpad_acpi nxp_nci_i2c nxp_nci nci ledtrig_audio platform_profile nfc snd rfkill soundcore
Jan 21 22:28:02 fedora kernel:  int3403_thermal int340x_thermal_zone soc_button_array acpi_pad intel_hid int3400_thermal acpi_tad acpi_thermal_rel sparse_keymap zram ip_tables dm_crypt trusted asn1_encoder hid_logitech_hidpp hid_logitech_dj hid_sensor_hub intel_ishtp_loader intel_ishtp_hid i915 hid_multitouch i2c_algo_bit ttm drm_kms_helper cec crct10dif_pclmul crc32_pclmul crc32c_intel drm nvme igc serio_raw intel_ish_ipc ghash_clmulni_intel intel_ishtp nvme_core ucsi_acpi typec_ucsi typec i2c_hid_acpi wmi i2c_hid video pinctrl_tigerlake ipmi_devintf ipmi_msghandler fuse
Jan 21 22:28:02 fedora kernel: CPU: 7 PID: 2885 Comm: gnome-shell Not tainted 5.15.15-200.fc35.x86_64 #1
Jan 21 22:28:02 fedora kernel: Hardware name: LENOVO 20XWCTO1WW/20XWCTO1WW, BIOS N32ET75W (1.51 ) 12/02/2021
Jan 21 22:28:02 fedora kernel: RIP: 0010:igc_rd32+0x7c/0x80 [igc]
Jan 21 22:28:02 fedora kernel: Code: 48 c7 c6 58 e3 40 c0 e8 3a 20 85 c7 48 8b bb 30 ff ff ff e8 66 76 2e c7 84 c0 74 b2 89 ee 48 c7 c7 80 e3 40 c0 e8 a2 67 7f c7 <0f> 0b eb a0 0f 1f 44 00 00 41 56 41 55 41 54 55 48 89 f5 53 80 7e
Jan 21 22:28:02 fedora kernel: RSP: 0018:ffffa25cc5897c70 EFLAGS: 00010282
Jan 21 22:28:02 fedora kernel: RAX: 000000000000001f RBX: ffff9008cb518c10 RCX: 0000000000000027
Jan 21 22:28:02 fedora kernel: RDX: ffff900fff7e0a08 RSI: 0000000000000001 RDI: ffff900fff7e0a00
Jan 21 22:28:02 fedora kernel: RBP: 000000000000c030 R08: 0000000000000000 R09: ffffa25cc5897aa8
Jan 21 22:28:02 fedora kernel: R10: ffffa25cc5897aa0 R11: ffffffff88f46028 R12: 00000000ffffffff
Jan 21 22:28:02 fedora kernel: R13: ffff9008cb518000 R14: ffff9008cb3cdd40 R15: 000000000000c030
Jan 21 22:28:02 fedora kernel: FS:  00007f3e74c2bd80(0000) GS:ffff900fff7c0000(0000) knlGS:0000000000000000
Jan 21 22:28:02 fedora kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 21 22:28:02 fedora kernel: CR2: 000006ee3d0f4018 CR3: 0000000166320003 CR4: 0000000000770ee0
Jan 21 22:28:02 fedora kernel: PKRU: 55555554
Jan 21 22:28:02 fedora kernel: Call Trace:
Jan 21 22:28:02 fedora kernel:  <TASK>
Jan 21 22:28:02 fedora kernel:  igc_update_stats+0x72/0x690 [igc]
Jan 21 22:28:02 fedora kernel:  igc_get_stats64+0x7f/0x90 [igc]
Jan 21 22:28:02 fedora kernel:  dev_get_stats+0x59/0xc0
Jan 21 22:28:02 fedora kernel:  dev_seq_printf_stats+0x20/0xb0
Jan 21 22:28:02 fedora kernel:  dev_seq_show+0x10/0x30
Jan 21 22:28:02 fedora kernel:  seq_read_iter+0x2bf/0x4b0
Jan 21 22:28:02 fedora kernel:  seq_read+0xed/0x120
Jan 21 22:28:02 fedora kernel:  proc_reg_read+0x52/0xa0
Jan 21 22:28:02 fedora kernel:  vfs_read+0x92/0x190
Jan 21 22:28:02 fedora kernel:  ksys_read+0x4f/0xc0
Jan 21 22:28:02 fedora kernel:  do_syscall_64+0x38/0x90
Jan 21 22:28:02 fedora kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xae
Jan 21 22:28:02 fedora kernel: RIP: 0033:0x7f3e7b0d978c
Jan 21 22:28:02 fedora kernel: Code: ec 28 48 89 54 24 18 48 89 74 24 10 89 7c 24 08 e8 39 89 f8 ff 48 8b 54 24 18 48 8b 74 24 10 41 89 c0 8b 7c 24 08 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 34 44 89 c7 48 89 44 24 08 e8 7f 89 f8 ff 48
Jan 21 22:28:02 fedora kernel: RSP: 002b:00007fffbb0ee1e0 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
Jan 21 22:28:02 fedora kernel: RAX: ffffffffffffffda RBX: 000055ff98059220 RCX: 00007f3e7b0d978c
Jan 21 22:28:02 fedora kernel: RDX: 0000000000001000 RSI: 00007fffbb0ee2f0 RDI: 0000000000000042
Jan 21 22:28:02 fedora kernel: RBP: 00007f3e7b1d43a0 R08: 0000000000000000 R09: 0000000000000000
Jan 21 22:28:02 fedora kernel: R10: 0000000000001000 R11: 0000000000000246 R12: 00007fffbb0ee2f0
Jan 21 22:28:02 fedora kernel: R13: 0000000000000d68 R14: 00007f3e7b1d37a0 R15: 0000000000001000
Jan 21 22:28:02 fedora kernel:  </TASK>
Jan 21 22:28:02 fedora kernel: ---[ end trace 3bf280f689b0d891 ]---
Jan 21 22:28:02 fedora kernel: BUG: unable to handle page fault for address: 000000000000c030
Jan 21 22:28:02 fedora kernel: #PF: supervisor write access in kernel mode
Jan 21 22:28:02 fedora kernel: #PF: error_code(0x0002) - not-present page
Jan 21 22:28:02 fedora kernel: PGD 0 P4D 0 
Jan 21 22:28:03 fedora kernel: Oops: 0002 [#1] SMP NOPTI
Jan 21 22:28:03 fedora kernel: CPU: 7 PID: 2885 Comm: gnome-shell Tainted: G        W         5.15.15-200.fc35.x86_64 #1
Jan 21 22:28:03 fedora kernel: Hardware name: LENOVO 20XWCTO1WW/20XWCTO1WW, BIOS N32ET75W (1.51 ) 12/02/2021
Jan 21 22:28:03 fedora kernel: RIP: 0010:igc_update_stats+0x8c/0x690 [igc]
Jan 21 22:28:03 fedora kernel: Code: 89 0c 24 4e 8b b4 eb 88 00 00 00 e8 ae f1 ff ff 8b 93 fc 02 00 00 48 8b 0c 24 85 d2 74 0e 48 8b b3 98 02 00 00 31 d2 4c 01 fe <89> 16 85 c0 74 10 89 c0 49 01 86 80 00 00 00 48 01 83 40 02 00 00
Jan 21 22:28:03 fedora kernel: RSP: 0018:ffffa25cc5897c98 EFLAGS: 00010206
Jan 21 22:28:03 fedora kernel: RAX: 00000000ffffffff RBX: ffff9008cb518980 RCX: 0000000000000000
Jan 21 22:28:03 fedora kernel: RDX: 0000000000000000 RSI: 000000000000c030 RDI: ffff900fff7e0a00
Jan 21 22:28:03 fedora kernel: RBP: ffff9008cb518c10 R08: 0000000000000000 R09: ffffa25cc5897aa8
Jan 21 22:28:03 fedora kernel: R10: ffffa25cc5897aa0 R11: ffffffff88f46028 R12: 0000000000000000
Jan 21 22:28:03 fedora kernel: R13: 0000000000000000 R14: ffff9008cb3cdd40 R15: 000000000000c030
Jan 21 22:28:03 fedora kernel: FS:  00007f3e74c2bd80(0000) GS:ffff900fff7c0000(0000) knlGS:0000000000000000
Jan 21 22:28:03 fedora kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 21 22:28:03 fedora kernel: CR2: 000000000000c030 CR3: 0000000166320003 CR4: 0000000000770ee0
Jan 21 22:28:03 fedora kernel: PKRU: 55555554
Jan 21 22:28:03 fedora kernel: Call Trace:
Jan 21 22:28:03 fedora kernel:  <TASK>
Jan 21 22:28:03 fedora kernel:  igc_get_stats64+0x7f/0x90 [igc]
Jan 21 22:28:03 fedora kernel:  dev_get_stats+0x59/0xc0
Jan 21 22:28:03 fedora kernel:  dev_seq_printf_stats+0x20/0xb0
Jan 21 22:28:03 fedora kernel:  dev_seq_show+0x10/0x30
Jan 21 22:28:03 fedora kernel:  seq_read_iter+0x2bf/0x4b0
Jan 21 22:28:03 fedora kernel:  seq_read+0xed/0x120
Jan 21 22:28:03 fedora kernel:  proc_reg_read+0x52/0xa0
Jan 21 22:28:03 fedora kernel:  vfs_read+0x92/0x190
Jan 21 22:28:03 fedora kernel:  ksys_read+0x4f/0xc0
Jan 21 22:28:03 fedora kernel:  do_syscall_64+0x38/0x90
Jan 21 22:28:03 fedora kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xae
Jan 21 22:28:03 fedora kernel: RIP: 0033:0x7f3e7b0d978c
Jan 21 22:28:03 fedora kernel: Code: ec 28 48 89 54 24 18 48 89 74 24 10 89 7c 24 08 e8 39 89 f8 ff 48 8b 54 24 18 48 8b 74 24 10 41 89 c0 8b 7c 24 08 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 34 44 89 c7 48 89 44 24 08 e8 7f 89 f8 ff 48
Jan 21 22:28:03 fedora kernel: RSP: 002b:00007fffbb0ee1e0 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
Jan 21 22:28:03 fedora kernel: RAX: ffffffffffffffda RBX: 000055ff98059220 RCX: 00007f3e7b0d978c
Jan 21 22:28:03 fedora kernel: RDX: 0000000000001000 RSI: 00007fffbb0ee2f0 RDI: 0000000000000042
Jan 21 22:28:03 fedora kernel: RBP: 00007f3e7b1d43a0 R08: 0000000000000000 R09: 0000000000000000
Jan 21 22:28:03 fedora kernel: R10: 0000000000001000 R11: 0000000000000246 R12: 00007fffbb0ee2f0
Jan 21 22:28:03 fedora kernel: R13: 0000000000000d68 R14: 00007f3e7b1d37a0 R15: 0000000000001000
Jan 21 22:28:03 fedora kernel:  </TASK>
Jan 21 22:28:03 fedora kernel: Modules linked in: uinput rfcomm snd_seq_dummy snd_hrtimer xt_conntrack xt_MASQUERADE nf_conntrack_netlink nft_counter xt_addrtype nft_compat br_netfilter bridge stp llc nls_utf8 cifs cifs_arc4 rdma_cm iw_cm ib_cm ib_core cifs_md4 dns_resolver fscache netfs nft_objref nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 overlay ip_set nf_tables nfnetlink qrtr ns bnep snd_ctl_led snd_soc_skl_hda_dsp snd_soc_intel_hda_dsp_common snd_soc_hdac_hdmi sunrpc snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic snd_soc_dmic iTCO_wdt snd_sof_pci_intel_tgl intel_pmc_bxt pmt_telemetry mei_wdt mei_hdcp iTCO_vendor_support pmt_class snd_sof_intel_hda_common intel_rapl_msr soundwire_intel soundwire_generic_allocation soundwire_cadence snd_sof_intel_hda snd_sof_pci snd_sof_xtensa_dsp snd_sof iwlmvm
Jan 21 22:28:03 fedora kernel:  snd_soc_hdac_hda snd_hda_ext_core snd_soc_acpi_intel_match snd_soc_acpi soundwire_bus mac80211 snd_soc_core snd_compress ac97_bus snd_pcm_dmaengine intel_tcc_cooling x86_pkg_temp_thermal intel_powerclamp snd_hda_intel coretemp snd_intel_dspcfg libarc4 intel_cstate snd_intel_sdw_acpi intel_uncore pcspkr snd_usb_audio snd_hda_codec iwlwifi snd_usbmidi_lib think_lmi snd_hda_core firmware_attributes_class snd_rawmidi vfat i2c_i801 wmi_bmof i2c_smbus uvcvideo snd_hwdep snd_seq fat btusb mei_me snd_seq_device cfg80211 mei videobuf2_vmalloc btrtl snd_pcm squashfs idma64 btbcm videobuf2_memops snd_timer loop videobuf2_v4l2 btintel videobuf2_common bluetooth joydev videodev processor_thermal_device_pci_legacy processor_thermal_device mc ecdh_generic thunderbolt intel_pmt processor_thermal_rfim processor_thermal_mbox processor_thermal_rapl intel_rapl_common intel_soc_dts_iosf igen6_edac thinkpad_acpi nxp_nci_i2c nxp_nci nci ledtrig_audio platform_profile nfc snd rfkill soundcore
Jan 21 22:28:03 fedora kernel:  int3403_thermal int340x_thermal_zone soc_button_array acpi_pad intel_hid int3400_thermal acpi_tad acpi_thermal_rel sparse_keymap zram ip_tables dm_crypt trusted asn1_encoder hid_logitech_hidpp hid_logitech_dj hid_sensor_hub intel_ishtp_loader intel_ishtp_hid i915 hid_multitouch i2c_algo_bit ttm drm_kms_helper cec crct10dif_pclmul crc32_pclmul crc32c_intel drm nvme igc serio_raw intel_ish_ipc ghash_clmulni_intel intel_ishtp nvme_core ucsi_acpi typec_ucsi typec i2c_hid_acpi wmi i2c_hid video pinctrl_tigerlake ipmi_devintf ipmi_msghandler fuse
Jan 21 22:28:03 fedora kernel: CR2: 000000000000c030
Jan 21 22:28:03 fedora kernel: ---[ end trace 3bf280f689b0d892 ]---
Jan 21 22:28:03 fedora kernel: RIP: 0010:igc_update_stats+0x8c/0x690 [igc]
Jan 21 22:28:03 fedora kernel: Code: 89 0c 24 4e 8b b4 eb 88 00 00 00 e8 ae f1 ff ff 8b 93 fc 02 00 00 48 8b 0c 24 85 d2 74 0e 48 8b b3 98 02 00 00 31 d2 4c 01 fe <89> 16 85 c0 74 10 89 c0 49 01 86 80 00 00 00 48 01 83 40 02 00 00
Jan 21 22:28:03 fedora kernel: RSP: 0018:ffffa25cc5897c98 EFLAGS: 00010206
Jan 21 22:28:03 fedora kernel: RAX: 00000000ffffffff RBX: ffff9008cb518980 RCX: 0000000000000000
Jan 21 22:28:03 fedora kernel: RDX: 0000000000000000 RSI: 000000000000c030 RDI: ffff900fff7e0a00
Jan 21 22:28:03 fedora kernel: RBP: ffff9008cb518c10 R08: 0000000000000000 R09: ffffa25cc5897aa8
Jan 21 22:28:03 fedora kernel: R10: ffffa25cc5897aa0 R11: ffffffff88f46028 R12: 0000000000000000
Jan 21 22:28:03 fedora kernel: R13: 0000000000000000 R14: ffff9008cb3cdd40 R15: 000000000000c030
Jan 21 22:28:03 fedora kernel: FS:  00007f3e74c2bd80(0000) GS:ffff900fff7c0000(0000) knlGS:0000000000000000
Jan 21 22:28:03 fedora kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 21 22:28:03 fedora kernel: CR2: 000000000000c030 CR3: 0000000166320003 CR4: 0000000000770ee0
Jan 21 22:28:03 fedora kernel: PKRU: 55555554
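
One detail in the trace above is that the faulting address (000000000000c030) is the same value as the register offset in the preceding "igc: Failed to read reg 0xc030!" warning. That would be consistent with the driver using a register offset against a null base address after the device stopped responding, but that reading is an assumption and is not confirmed anywhere in this thread. A minimal Python sketch for checking a saved journal for the same pattern (the function name is hypothetical; the two message formats are taken from the log above):

```python
import re

# Patterns for the two message shapes quoted in the log above.
REG_RE = re.compile(r"igc: Failed to read reg (0x[0-9a-fA-F]+)!")
FAULT_RE = re.compile(r"unable to handle page fault for address: ([0-9a-fA-F]+)")


def fault_matches_register(log_text: str) -> bool:
    """Return True if any page-fault address equals a register offset igc failed to read."""
    regs = {int(value, 16) for value in REG_RE.findall(log_text)}
    faults = {int(value, 16) for value in FAULT_RE.findall(log_text)}
    return bool(regs & faults)


# Two lines copied from the journal output above.
sample = """\
Jan 21 22:28:02 fedora kernel: igc: Failed to read reg 0xc030!
Jan 21 22:28:02 fedora kernel: BUG: unable to handle page fault for address: 000000000000c030
"""
print(fault_matches_register(sample))  # True
```

To run it against a real log, the text could come from a file saved with journalctl -k; how the log is captured is left to the reader.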

After rebooting the system (without resetting the dock), the dock's Ethernet device is no longer detected, but my machine is otherwise fully usable.

The Ethernet device is detected again only after cutting power to the docking station. But the next time my system boots, it freezes as described above.

fwupdmgr get-devices gives this information (full output further below):

Update Error: firmware update inhibited by [usi_dock] plugin

So I am not sure whether this is a problem with the firmware itself or with the fwupd process.

boltctl shows this

$ boltctl
 ● Lenovo ThinkPad Thunderbolt 4 Dock
   ├─ type:          peripheral
   ├─ name:          ThinkPad Thunderbolt 4 Dock
   ├─ vendor:        Lenovo
   ├─ uuid:          001251fc-d4c2-8780-ffff-ffffffffffff
   ├─ generation:    USB4
   ├─ status:        authorized
   │  ├─ domain:     51042edf-d90f-8780-ffff-ffffffffffff
   │  ├─ rx speed:   40 Gb/s = 2 lanes * 20 Gb/s
   │  ├─ tx speed:   40 Gb/s = 2 lanes * 20 Gb/s
   │  └─ authflags:  none
   ├─ authorized:    Fri 21 Jan 2022 21:30:37 UTC
   ├─ connected:     Fri 21 Jan 2022 21:30:37 UTC
   └─ stored:        Tue 18 Jan 2022 18:48:19 UTC
      ├─ policy:     iommu
      └─ key:        no

Expected behavior

Firmware update does not have an error about being inhibited by the usi_dock plugin.

fwupd version information

Please provide the version of the daemon and client.

client version: 1.7.4
compile-time dependency versions
    gusb:   0.3.9

daemon version: 1.7.4

Please note how you installed it (apt, dnf, pacman, source, etc):

fwupd device information

Please provide the output of the fwupd devices recognized in your system.

fwupdmgr get-devices --show-all-devices

├─ThinkPad Thunderbolt 4 Dock:
│     Device ID:          b28166ffadecd8d55b9c6d34e057ae848191deaa
│     Current version:    10.6
│     Vendor:             Lenovo (USB:0x17EF)
│     GUIDs:              2e4ffb60-b2e2-5f2f-9ea6-60931eed758b
│                         8d30b09f-bcc5-5379-bc65-9ccceeece1f4
│                         275e4695-9b4e-5263-835e-8681bec8cd1a
│     Device Flags:       • Updatable
│                         • System requires external power source
│                         • Device stages updates
│   
├─ThinkPad Thunderbolt 4 Dock:
│     Device ID:          b927c17354471ef8df740c693f2fbba8092a9baa
│     Current version:    30.00
│     Vendor:             Lenovo (THUNDERBOLT:0x0108, TBT:0x0108)
│     Update Error:       firmware update inhibited by [usi_dock] plugin
│     GUIDs:              f1bdfa07-026c-5973-89f9-b7bd8e42ff9d
│                         dc3a1b4e-176f-589a-949c-d058324bb6a7
│                         350ec5bb-25c0-543e-8d4c-72c30d2b8df0
│                         2e927863-03be-5488-aeba-b0b9b3895fb5
│     Device Flags:       • System requires external power source
│                         • Device stages updates
│                         • Updatable

Additional questions

hughsie commented 2 years ago

What firmware update version did you update to please. Was the package from the LVFS?

fbatschi commented 2 years ago

I updated via LVFS, yes.

I have now updated the dock on a Windows machine to the latest available firmware, which is 1.0.07 for the dock, 35.00.06 for the TB4 and 1.73.1 for the I225 Ethernet chip. The problem persists. As it only started after doing the update, I assume it must be related to one of these firmwares.

Apparently it is not related to fwupd itself, although I am still not sure what the update error means. How is the firmware update "inhibited" by the usi_dock plugin?

─ThinkPad Thunderbolt 4 Dock:
│     Device ID:          b927c17354471ef8df740c693f2fbba8092a9b11
│     Current version:    35.00
│     Vendor:             Lenovo (THUNDERBOLT:0x0108, TBT:0x0108)
│     Update Error:       firmware update inhibited by [usi_dock] plugin

hughsie commented 2 years ago

How is the firmware update "inhibited" by the usi_dock plugin

Ohh, this just means that the device isn't to be updated using the Thunderbolt mechanism; it should be updated using the USI protocol instead. That bit is working as designed, but I think the TBT4 firmware has the issue. I'll reassign and poke some people.

hughsie commented 2 years ago

@victor-cheng I think USI needs to be aware of this. @mrhpearson is this something you want to track too?

taisph commented 2 years ago

I am experiencing a similar issue. I've had to stop using the ethernet port in the dock as I get "BUG: soft lockup" and spontaneous reboots after the update.

mrhpearson commented 2 years ago

I've flagged to the dock team - but I don't think I'm seeing the issue on my dock which is interesting. Are there any special steps to reproduce the problem? Does it only happen on hotplug etc?

taisph commented 2 years ago

I've flagged to the dock team - but I don't think I'm seeing the issue on my dock which is interesting. Are there any special steps to reproduce the problem? Does it only happen on hotplug etc?

For me, it is reproducible shortly after a cold boot. The dock ethernet port is oddly dead/non-responsive after the initial laptop crash.

rpurdie commented 2 years ago

I'm having the same issue. I have the same ThinkPad Thunderbolt 4 Dock, and whilst it works fine with my X280, with my new P1 Gen 4 I get the oops in the kernel logs as above: an "igc: Failed to read reg" message, followed by various oopses, and the machine ends up locking up and requiring a hard reset. Firmware on the dock is as above. I did try with and without ethernet connected on the dock, no difference. If I reset the machine and don't reconnect the dock, it will at least not crash, but the dock doesn't work, and the machine crashes as soon as I unplug/re-plug the dock. Booting with the dock attached makes no difference, it still crashes. Ubuntu 21.10 on both systems, if that makes any difference.

taisph commented 2 years ago

It seems using fwupdmgr can crash the ethernet device and/or the igc driver on my Lenovo laptop (Ubuntu 20.04.4 LTS, HWE kernel 5.13)? I did a fwupdmgr get-devices which was immediately followed by an igc "PCIe link lost" error in the syslog.

Apr  5 08:47:08 laptop dbus-daemon[2740]: [system] Activating via systemd: service name='org.freedesktop.fwupd' unit='fwupd.service' requested by ':1.164' (uid=0 pid=32908 comm="fwupdmgr get-devices " label="unconfined")
Apr  5 08:47:08 laptop systemd[1]: Starting Firmware update daemon...
Apr  5 08:47:09 laptop kernel: [  302.210443] hid-generic 0003:17EF:30B4.000A: hiddev0,hidraw0: USB HID v1.11 Device [Lenovo ThinkPad Thunderbolt 4 Dock MCU Contoller] on usb-0000:00:14.0-5.1/input0
Apr  5 08:47:09 laptop fwupd[32922]: 06:47:09:0260 FuPluginDfu          13d3:56fb is missing download capability
Apr  5 08:47:09 laptop fwupd[32922]: 06:47:09:0816 FuEngine             failed to update history database: device ID fc06b502b715a313bd69fcc014c4935e3ea6aca9 was not found
Apr  5 08:47:09 laptop dbus-daemon[2740]: [system] Successfully activated service 'org.freedesktop.fwupd'
Apr  5 08:47:09 laptop systemd[1]: Started Firmware update daemon.
Apr  5 08:47:09 laptop fwupd[32922]: 06:47:09:0863 FuEngine             failed to record HSI attributes: failed to get historical attr: json-glib version too old
Apr  5 08:47:10 laptop kernel: [  303.318854] igc 0000:49:00.0 enp73s0: PCIe link lost, device now detached
Apr  5 08:47:10 laptop kernel: [  303.318872] ------------[ cut here ]------------
Apr  5 08:47:10 laptop kernel: [  303.318874] igc: Failed to read reg 0x5b50!
Apr  5 08:47:10 laptop kernel: [  303.318961] WARNING: CPU: 12 PID: 3924 at drivers/net/ethernet/intel/igc/igc_main.c:5317 igc_rd32+0x85/0x90 [igc]
[...]
Apr  5 08:47:10 laptop kernel: [  303.319372] ---[ end trace ea162ab866ed960b ]---
Apr  5 08:47:10 laptop kernel: [  303.319988] BUG: unable to handle page fault for address: 0000000000005b50
Apr  5 08:47:10 laptop kernel: [  303.319993] #PF: supervisor read access in kernel mode
Apr  5 08:47:10 laptop kernel: [  303.319998] #PF: error_code(0x0000) - not-present page
Apr  5 08:47:10 laptop kernel: [  303.320011] PGD 0 P4D 0 
Apr  5 08:47:10 laptop kernel: [  303.320014] Oops: 0000 [#1] SMP NOPTI
Apr  5 08:47:10 laptop kernel: [  303.320019] CPU: 12 PID: 3924 Comm: kworker/12:3 Tainted: P        W  O      5.13.0-39-generic #44~20.04.1-Ubuntu
Apr  5 08:47:10 laptop kernel: [  303.320023] Hardware name: LENOVO 20YQCTO1WW/20YQCTO1WW, BIOS N37ET37W (1.18 ) 12/24/2021
[...]

rpurdie commented 2 years ago

For what it is worth I've reproduced the lock up with the dock with 5.13, 5.16.18, 5.17.1 and 5.18rc1 kernels.

rpurdie commented 2 years ago

I noticed a lot of "BAR 13: no space for [io size 0x1000]" type messages in dmesg. It looked like the ethernet driver wasn't getting a window for its IO ports but was trying to access them anyway in update_stats(), hence crashing the system. I found that adding pci=hpiosize=8192 to the kernel command line allowed things to work (there were 5 windows of 0x1000 being requested). I suspect the new firmware has different IO port requirements, which is why it causes issues. Whether it is correct and really needs those ports, I don't know.
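
To see whether a machine shows the same symptom, the messages described above can be counted in a saved dmesg dump. The following Python sketch is only an illustration: the function name is hypothetical, the regular expression is built from the wording quoted in the comment, and the sample device addresses are invented. How the total relates to a suitable hpiosize value is not established in this thread.

```python
import re

# Built from the message quoted above; \s+ tolerates extra spacing inside the brackets.
NO_SPACE_RE = re.compile(r"BAR \d+: no space for \[io\s+size (0x[0-9a-fA-F]+)\]")


def unassigned_io_windows(dmesg_text: str) -> list:
    """Return the size in bytes of every IO window the kernel reported no space for."""
    return [int(size, 16) for size in NO_SPACE_RE.findall(dmesg_text)]


# Illustrative input only: the PCI addresses are made up, the message text
# follows the wording quoted in the comment above.
sample = "\n".join(
    f"pci 0000:0{i}:00.0: BAR 13: no space for [io size 0x1000]" for i in range(5)
)
sizes = unassigned_io_windows(sample)
print(len(sizes), sum(sizes))  # 5 20480
```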

fcrozat commented 2 years ago

I noticed a lot of "BAR 13: no space for [io size 0x1000]" type messages in dmesg. It looked like the ethernet driver wasn't getting a window for its IO ports but was trying to access them anyway in update_stats(), hence crashing the system. I found that adding pci=hpiosize=8192 to the kernel command line allowed things to work (there were 5 windows of 0x1000 being requested). I suspect the new firmware has different IO port requirements, which is why it causes issues. Whether it is correct and really needs those ports, I don't know.

I can confirm using pci=hpiosize=8192 allows the dock ethernet to be recognized properly on 5.17.1 openSUSE Tumbleweed kernel

mark-beeby commented 2 years ago

This is an issue for me too, seeing much the same as the above users, but on PopOS 21.10, which uses LVFS. I've added pci=hpiosize=8192 to my kernel parameters but that hasn't seemingly changed the behaviour for me. That said Pop uses kernelstub rather than grub, and it's not entirely clear if I'm doing it right.
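
Whichever boot loader tool is used, one way to tell whether the parameter actually reached the running kernel is to look at /proc/cmdline after a reboot. A small Python sketch of that check follows; the helper names and the sample command line are hypothetical. It splits comma-separated pci= options, so a combined form such as pci=nocrs,hpiosize=8192 is also recognised.

```python
def pci_options(cmdline: str) -> set:
    """Collect the comma-separated options from every pci=... token on a kernel command line."""
    options = set()
    for token in cmdline.split():
        if token.startswith("pci="):
            options.update(opt for opt in token[len("pci="):].split(",") if opt)
    return options


def has_hpiosize(cmdline: str, value: str = "8192") -> bool:
    """Return True if pci=hpiosize=<value> is present on the given command line."""
    return f"hpiosize={value}" in pci_options(cmdline)


# Hypothetical command line for illustration. On a real system the string
# would be read from /proc/cmdline, e.g. open("/proc/cmdline").read().
sample = "BOOT_IMAGE=/vmlinuz root=/dev/sda1 quiet pci=hpiosize=8192"
print(has_hpiosize(sample))  # True
```

If this reports False on the booted system, the parameter was not applied and the boot loader configuration is the first thing to revisit.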

firmware update inhibited by [usi_dock] plugin is present, as are very similar logs in journalctl per the reporter. When I boot with it all plugged in, the ethernet adapter is inactive and I may or may not get a fairly locked up machine thereafter. If I boot fully and then plug the dock in, ethernet shows up but is not functional, and the machine again likely locks up shortly thereafter. If I boot without ethernet it's a game of chance over whether everything else will work (USB peripherals, DP etc).

Any other suggestions here? I've had to purchase an Anker dock and completely remove this from my setup for now...

mark-beeby commented 2 years ago

@mrhpearson I don't suppose you've had any update here?

fcrozat commented 2 years ago

I don't have the issue anymore with 5.18.2 kernel (but there was also a firmware update on P15 Gen2 laptop, might be related)

irishgordo commented 2 years ago

This is an issue for me too, seeing much the same as the above users, but on PopOS 21.10, which uses LVFS. I've added pci=hpiosize=8192 to my kernel parameters but that hasn't seemingly changed the behaviour for me. That said Pop uses kernelstub rather than grub, and it's not entirely clear if I'm doing it right.

firmware update inhibited by [usi_dock] plugin is present, as are very similar logs in journalctl per the reporter. When I boot with it all plugged in, the ethernet adapter is inactive and I may or may not get a fairly locked up machine thereafter. If I boot fully and then plug the dock in, ethernet shows up but is not functional, and the machine again likely locks up shortly thereafter. If I boot without ethernet it's a game of chance over whether everything else will work (USB peripherals, DP etc).

Any other suggestions here? I've had to purchase an Anker dock and completely remove this from my setup for now...

I'm currently running 22.04 Pop! OS, seeing that present with the 5.17.5-7*-generic kernel. The:

firmware update inhibited by [usi_dock] plugin

and the:

│ │     Update Error:     Use the MCU to update the DMC device

from the Dock Management Controller Information.

I do note that while my Ethernet via the dock isn't currently working I can mostly always (sometimes with a bit of a dance) get a fix out for running 2 4k displays off the dock via: https://github.com/pop-os/pop/issues/2417#issuecomment-1128163461 (within Pop! OS)

RaphaelJenni commented 2 years ago

Same errors for me on Ubuntu 22.04 with Kernel 5.15.0-39.42-generic 5.15.35. Errors: firmware update inhibited by [usi_dock] plugin and Update Error: Use the MCU to update the DMC device.

I have a Thinkpad X1 Extreme Gen4 and I'm running fwupd version 1.7.5.

The problems I'm experiencing are:

  1. the system hangs on startup when connected to the dock. Can't execute any commands, and can't close windows, only a forceful shutdown, and a reboot without the dock connected makes it usable again.
  2. A phantom monitor appears when no monitor is connected to the dock.

Not sure how much of this is related to fwupd and how much is related to the actual driver itself.

Related issues:

boxallw commented 1 year ago

How do I find what FW version the dock is running?

Still no workaround to get the dock working with Fedora on a Thinkpad X1 Carbon Gen 9?
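
The fwupdmgr get-devices --show-all-devices output quoted at the top of this issue lists a "Current version" field for each dock entry. As a rough sketch, assuming the text layout shown there (it may differ between fwupd versions), the following Python pulls those values out of captured output. The function name is hypothetical and the sample is abbreviated from the original report.

```python
TREE_CHARS = " \t│├└─"  # tree-drawing characters fwupdmgr prints before each line


def dock_versions(output: str, name: str = "ThinkPad Thunderbolt 4 Dock") -> list:
    """Return the 'Current version' value of every device entry called `name`."""
    versions = []
    current = None
    for raw in output.splitlines():
        line = raw.lstrip(TREE_CHARS).rstrip()
        if line.endswith(":"):
            # A line with nothing after the colon starts a new entry.
            current = line[:-1]
        elif line.startswith("Current version:") and current == name:
            versions.append(line.split(":", 1)[1].strip())
    return versions


# Abbreviated from the output in the original report.
sample = """\
├─ThinkPad Thunderbolt 4 Dock:
│     Device ID:          b28166ffadecd8d55b9c6d34e057ae848191deaa
│     Current version:    10.6
│     Vendor:             Lenovo (USB:0x17EF)
│
├─ThinkPad Thunderbolt 4 Dock:
│     Device ID:          b927c17354471ef8df740c693f2fbba8092a9baa
│     Current version:    30.00
│     Vendor:             Lenovo (THUNDERBOLT:0x0108, TBT:0x0108)
"""
print(dock_versions(sample))  # ['10.6', '30.00']
```

The dock shows up as two entries in that output (one with a USB vendor ID, one with a Thunderbolt vendor ID), so two versions are returned.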

mark-beeby commented 1 year ago

Latest kernel 5.18.10-76051810-generic (Pop-OS) has not fixed the issues with this dock, now using fwupd 1.8.0. The following remains present in fwupdmgr get-devices --show-all-devices:

Use the MCU to update the DMC device

I can't seem to extract much as the machine locks up very swiftly.

boxallw commented 1 year ago

New firmware for the dock came out. No change. Still a full lockup after logging in.

mark-beeby commented 1 year ago

Finally managed to upgrade the firmware via a Windows 10 machine using the Lenovo Dock Manager, sadly it still hasn't fixed the locking up of my Thinkpad X1 Extreme Gen4 running Pop-OS. Is there any chance of getting some help with this issue @hughsie @mrhpearson?

boxallw commented 1 year ago

As above. Same result.

boxallw commented 1 year ago

Tried again today after a dnf update and all is working perfectly. Except when I fire up a VM and the whole machine shuts down like a massive power failure.

Lenovo is just rubbish these days.

mark-beeby commented 1 year ago

Had a breakthrough here. Whilst the driver update has been left utterly broken and remains unsupported by Lenovo since this thing was released, I have got the dock working (including ethernet) by changing a bios setting:-

BIOS -> Config -> Thunderbolt 4 -> PCIe Tunneling (disable this)

RaphaelJenni commented 1 year ago

Had a breakthrough here. Whilst the driver update has been left utterly broken and remains unsupported by Lenovo since this thing was released, I have got the dock working (including ethernet) by changing a bios setting:-

BIOS -> Config -> Thunderbolt 4 -> PCIe Tunneling (disable this)

Can confirm. No crashes. But for me no screen is detected (via HDMI), does that work for you @mark-beeby ?

Edit: After a reboot, the screen was detected with no issues.

Edit 2: Switching to a different screen (ultrawide) while the computer is running does not detect the screen correctly. A reboot fixed the detection part again, but only the top corner is visible (this one could also be an X server/Ubuntu issue).

thattolleyguy commented 1 year ago

Worked for me as well

mark-beeby commented 1 year ago

@RaphaelJenni saw you'd gotten screens mostly working, HDMI and DisplayPort (which i normally use) both now work fine here, just to confirm!

fbatschi commented 1 year ago

Can confirm as well. No more crashes; ethernet on the dock and HDMI work fine with PCIe Tunneling switched off.

superiorpyro commented 1 year ago

Unfortunately no luck here. Found the BIOS setting and set it to off, shut down and powered the dock off too... Upon reboot, still no live ethernet port. LENOVO PLEASE ADDRESS THIS ISSUE.

MatthiasLohr commented 1 year ago

Can confirm as well. No more crashes; ethernet on the dock and HDMI work fine with PCIe Tunneling switched off.

Same here.

irishgordo commented 1 year ago

Had a breakthrough here. Whilst the driver update has been left utterly broken and remains unsupported by Lenovo since this thing was released, I have got the dock working (including ethernet) by changing a bios setting:-

BIOS -> Config -> Thunderbolt 4 -> PCIe Tunneling (disable this)

This definitely helped - ethernet on dock now works again and can run two 4k & integrated display on p15 thinkpad much more easily than before

taisph commented 1 year ago

Had a breakthrough here. Whilst the driver update has been left utterly broken and remains unsupported by Lenovo since this thing was released, I have got the dock working (including ethernet) by changing a bios setting:-

BIOS -> Config -> Thunderbolt 4 -> PCIe Tunneling (disable this)

Disabling this does seem to work around the kernel hang I was experiencing when running fwupdmgr. The dock ethernet port also works reliably now. But I cannot run with 3 displays any more. It will cold boot with all 3 displays, but after a while the display connected to the dock DP port above the HDMI port will start to drop out and eventually shut off completely. Disabling the laptop display/closing the lid seems to make the two external displays work reliably again. Could be an Nvidia driver issue though.

taisph commented 1 year ago

Disabling this does seem to work around the kernel hang I was experiencing when running fwupdmgr. The dock ethernet port also works reliably now. But I cannot run with 3 displays any more. It will cold boot with all 3 displays, but after a while the display connected to the dock DP port above the HDMI port will start to drop out and eventually shut off completely. Disabling the laptop display/closing the lid seems to make the two external displays work reliably again. Could be an Nvidia driver issue though.

Scratch that. I'm getting that drop out and shut off even with just the two monitors. 😞

MichalMaler commented 1 year ago

Hello. If I may ask, I think I may have bricked my dock:

• ~/ fwupdmgr get-updates

Devices with no available firmware updates:
 • ThinkPad Thunderbolt 4 Dock
 • (null)
 • Integrated Camera
 • Thunderbolt host controller
 • UEFI Device Firmware
 • UEFI Device Firmware
 • UEFI Device Firmware
Devices with the latest available firmware version:
 • Embedded Controller
 • Intel Management Engine
 • KXG6AZNV512G TOSHIBA
 • Prometheus
 • Prometheus IOTA Config
 • System Firmware
 • UEFI dbx
Idle… [***] Successfully uploaded 1 report
LENOVO 20TJS2F45T
│
└─ThinkPad Thunderbolt 4 Dock:
  │   Device ID:          691357701be9715ae7179d965f4fbfefdeb6e828
  │   Current version:    10.7
  │   Vendor:             Lenovo (USB:0x17EF)
  │   Serial Number:      1S40B0ZDZE0FMC
  │   Update State:       Failed
  │   Update Error:       failed to write chunk 0x1c11transfer failed
  │   Last modified:      2022-11-30 09:09
  │   GUIDs:              8d30b09f-bcc5-5379-bc65-9ccceeece1f4 ← USB\VID_17EF&PID_30B4
  │                       2e4ffb60-b2e2-5f2f-9ea6-60931eed758b ← USB\VID_17EF&PID_30B4&REV_0100
  │                       275e4695-9b4e-5263-835e-8681bec8cd1a ← USB\VID_17EF&PID_30B4&CID_40B0
  │   Device Flags:       • System requires external power source
  │                       • Supported on remote server
  │                       • Device stages updates
  │                       • Updatable
  │                       • Signed Payload
  │
  └─ThinkPad Thunderbolt 4 Dock:
        New version:      10.13
        Remote ID:        lvfs
        Release ID:       16438
        Summary:          Firmware for ThinkPad Thunderbolt 4 Dock
        License:          Proprietary
        Size:             6.5 MB
        Created:          2022-10-28
        Urgency:          High
        Vendor:           Lenovo
        Release Flags:    • Is upgrade
        Description:
        Before continuing, ensure the following:

When I run:

• ~/ sudo fwupdmgr update --force
Devices with no available firmware updates:
 • ThinkPad Thunderbolt 4 Dock
 • (null)
 • Integrated Camera
ThinkPad Thunderbolt 4 Dock is not currently updatable: failed to write chunk 0x1c11transfer failed

I broke it because I thought the update was stuck (after waiting 20 minutes with no progress), so I interrupted the command with CTRL+C and wanted to try again.

Is there a way to reset/restore it and then do the new FW upgrade? Thank you so much.

Edit: I tried turning off the dock and unplugging it from power, then ran the commands again. This is the result:

• ~/ fwupdmgr update --force
Devices with no available firmware updates:
 • ThinkPad Thunderbolt 4 Dock
 • Integrated Camera
 • ThinkPad Thunderbolt 4 Dock
 • Thunderbolt host controller
 • UEFI Device Firmware
 • UEFI Device Firmware
 • UEFI Device Firmware
Devices with the latest available firmware version:
 • Embedded Controller
 • Intel Management Engine
 • KXG6AZNV512G TOSHIBA
 • Prometheus
 • Prometheus IOTA Config
 • System Firmware
 • UEFI dbx
• ~/

So maybe it is ok now?

mrhpearson commented 1 year ago

As a note - I did not know they had finally released that FW update. I've been chasing them for months on this and then they don't tell me it has been delivered...grumble.

I'll check with the FW team on that specific error - but what happens if you do 'fwupdmgr get-devices' - does it report the updated FW version? And is everything working correctly?

Mark
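
To answer the "does it report the updated FW version?" question quickly, the text printed by `fwupdmgr get-devices` can be scanned for its "Current version" fields. The following is only a minimal sketch: it assumes the tree-style layout seen in the excerpts quoted in this thread, the helper name `current_versions` is made up for illustration, and the sample text is a shortened stand-in rather than real output.

```python
import re

# Shortened stand-in shaped like the `fwupdmgr get-devices` excerpts in this thread.
SAMPLE = """\
├─ThinkPad Thunderbolt 4 Dock:
│ │   Device ID:          xxx
│ │   Current version:    10.15
│ │   Vendor:             Lenovo (USB:0x17EF)
│ │ 
│ └─Dock Management Controller Information:
│       Device ID:        xxx
│       Current version:  10.15
"""


def current_versions(text):
    """Return (device name, current version) pairs found in tree-style output."""
    results = []
    name = None
    for line in text.splitlines():
        # Drop tree-drawing characters and indentation before matching.
        stripped = line.lstrip("│├└─ \t").rstrip()
        field = re.match(r"Current version:\s*(\S+)", stripped)
        header = re.fullmatch(r"(.+):", stripped)
        if field and name is not None:
            results.append((name, field.group(1)))
        elif header:
            # A line that ends with a colon and carries no value starts a new device.
            name = header.group(1)
    return results
```

On a real system the input could come from `subprocess.run(["fwupdmgr", "get-devices"], capture_output=True, text=True).stdout`. The layout may differ between fwupd versions, so treat the parsing as a best-effort check.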

pfactum commented 1 year ago

I've switched off PCI-E passthrough, enabled the LVFS testing repo, and I think I've managed to update the dock firmware to v10.13:

├─ThinkPad Thunderbolt 4 Dock:
│ │   Device ID:          xxx
│ │   Previous version:   10.7
│ │   Update Error:       Device has been removed, Device requires AC power to be connected
│ │   GUID:               xxx
│ │   Device Flags:       • System requires external power source
│ │                       • Supported on remote server
│ │                       • Device stages updates
│ │                       • Updatable
│ │                       • Signed Payload
│ │ 
│ └─ThinkPad Thunderbolt 4:
│       New version:      10.13
│       Remote ID:        lvfs
│       Release ID:       16438
│       Summary:          Firmware for ThinkPad Thunderbolt 4 Dock
│       Licence:          Proprietary
│       Size:             6,5 MB
│       Created:          2022-10-28
│       Urgency:          High
│       Vendor:           Lenovo
│       Description:      
│       Before continuing, ensure the following:
│       
│       • Your computer battery life if over 25%.
│       • Do not unplug the dock during update.
│       • Do not put your computer into sleep or hibernate mode during update.
│       • This update typically may take up to 15 minutes depending on your notebook and dock firmware version.
│       • Please ensure fwupd version is 1.8.6 or above.

When I switch PCI-E passthrough back on, however, the issue with the igc NIC persists.

Is this the latest FW available? Anything else that can be tried?

Currently, for me, fwupdmgr get-updates says that I've already updated everything I could:

Devices with no available firmware updates: 
 • 0000:00:1f.5
 • Integrated Camera
 • Intel Management Engine
 • UEFI Device Firmware
 • UEFI Device Firmware
 • UEFI Device Firmware
 • UEFI Device Firmware
 • UEFI Device Firmware
 • UEFI Device Firmware
 • USB3.0 Hub
Devices with the latest available firmware version:
 • Embedded Controller
 • Intel Management Engine
 • MZVL2512HCJQ-00BL7
 • Prometheus
 • Prometheus IOTA Config
 • System Firmware
 • ThinkPad Thunderbolt 4 Dock
 • UEFI dbx
No updates available

mrhpearson commented 1 year ago

Yep - that's the latest version.

For the multiple displays - are you using DP on more than one of them? If you are there are some patches Intel have landed in 6.3 that enable DSC support and improve things. I've not had a chance to try them myself yet but I've heard it makes things better (but not perfect).

I'd completely lost track of this thread. We have a team focusing on Linux dock issues - I'll flag the IGC + PCIe tunnel issue to get their feedback. I'm assuming it works well on Windows?

pfactum commented 1 year ago

Yep - that's the latest version.

OK, thanks for confirming this.

We have a team focusing on Linux dock issues - I'll flag the IGC + PCIe tunnel issue to get their feedback. I'm assuming it works well on Windows?

I cannot verify if it works on Windows, sorry.

rpurdie commented 1 year ago

Yep - that's the latest version.

For the multiple displays - are you using DP on more than one of them? If you are there are some patches Intel have landed in 6.3 that enable DSC support and improve things. I've not had a chance to try them myself yet but I've heard it makes things better (but not perfect).

I'd completely lost track of this thread. We have a team focusing on Linux dock issues - I'll flag the IGC + PCIe tunnel issue to get their feedback.

In case it helps, I can summarise where I got to debugging things. When I connect the dock, I was seeing a lot of "BAR 13: no space for [io size 0x1000]" type messages in dmesg. It looked like the ethernet driver wasn't getting a window for its io ports but was trying to access them anyway in update_stats(), hence crashing the system.

I found adding pci=hpiosize=8192 to the kernel commandline meant it could allocate the io ports and then things have worked ok.

I suspect in a firmware update, the io port requirements were increased which is why it used to work but stopped.

There are two issues, a) why is the kernel driver accessing io ports it failed to allocate and b) does it really need this size of io ports in the first place?

The latter could be fixed in dock firmware, the former is a bug in the kernel.
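
To check whether a machine shows the same symptom, the two observations above (failed I/O window allocations in dmesg, and whether `pci=hpiosize=` is already on the kernel command line) can be tested with a short script. This is a sketch under stated assumptions: the function names are hypothetical, the regular expression is built only from the message fragment quoted above, and it tolerates extra whitespace after `io` because dmesg spacing can vary.

```python
import re

# Built from the fragment quoted in this thread: "BAR 13: no space for [io size 0x1000]".
BAR_RE = re.compile(r"BAR (\d+): no space for \[io\s+size (0x[0-9a-fA-F]+)\]")


def io_window_failures(dmesg_text):
    """Return (BAR number, requested size in bytes) for each failed I/O window."""
    return [(int(bar), int(size, 16)) for bar, size in BAR_RE.findall(dmesg_text)]


def hpiosize_from_cmdline(cmdline):
    """Return the pci=hpiosize value from a kernel command line, or None if absent."""
    for token in cmdline.split():
        if token.startswith("pci="):
            # Options after pci= are comma-separated, e.g. pci=hpiosize=8192,noaer.
            for opt in token[len("pci="):].split(","):
                if opt.startswith("hpiosize="):
                    return opt[len("hpiosize="):]
    return None
```

Typical inputs would be the output of `dmesg` and the contents of `/proc/cmdline`. The value is returned as a string because the kernel also accepts size suffixes there.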

sipi58 commented 1 year ago

I can confirm the same problem under Windows 10. Neither Windows nor openSUSE sees the TB4's ethernet port. Restarting the P15 Gen 2 laptop does not help. The ethernet is restored only when I unplug the TB4 power connector and plug it in again.

The next time the laptop is plugged in or restarted, the TB4 ethernet port is missing again.

I use Lenovo's Commercial Vantage for updates under Windows and there is currently no update for the laptop or the TB4 either.

PCIe tunneling is disabled in the BIOS, but it still doesn't work.

I think that one of the recent Lenovo updates caused the problem, but I don't understand why it can't be solved and a fix issued... The point of the docking station is to charge and connect the laptop to the network, so it cannot be used...

Does anyone have a solution to this problem? Thanks for the help.

pfactum commented 1 year ago

I think those are two different symptoms.

When PCI-E passthrough is disabled, the NIC from the dock is seen as USB-connected using the cdc_ether driver. It sometimes happens that once the machine is booted, this NIC device is not seen as present at all, and the cable to the dock has to be replugged in order for NIC to appear.

When PCI-E passthrough is enabled, the NIC is seen as attached via PCI-E. In this case it uses the igc driver, and the issues with it falling off as described above occur.

I'm not sure if these two problems share the same root cause underneath, but the symptoms and conditions to trigger them are different.
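
To tell which of the two cases a given boot is in, the driver bound to each network interface can be read from sysfs. A minimal sketch, assuming the usual Linux layout where `/sys/class/net/<iface>/device/driver` is a symlink to the driver directory; the function names and the `sysfs_root` parameter are illustrative only.

```python
import os


def bound_driver(iface, sysfs_root="/sys"):
    """Return the kernel driver name bound to a network interface, or None."""
    link = os.path.join(sysfs_root, "class", "net", iface, "device", "driver")
    if not os.path.islink(link):
        return None  # virtual interfaces such as lo have no backing device
    # The symlink target ends in the driver name, e.g. .../drivers/igc
    return os.path.basename(os.readlink(link))


def list_drivers(sysfs_root="/sys"):
    """Map every interface found under sysfs to its bound driver name."""
    net_dir = os.path.join(sysfs_root, "class", "net")
    return {name: bound_driver(name, sysfs_root) for name in sorted(os.listdir(net_dir))}
```

Calling `list_drivers()` on the affected laptop should show `igc` for the dock NIC when PCI-E passthrough is enabled and `cdc_ether` when it is disabled, going by the description above.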

pfactum commented 1 year ago

pci=hpiosize=8192

Has anybody tried setting this value even bigger with the latest dock FW, and PCI-E passthrough enabled?

MatthiasLohr commented 1 year ago

Today I installed a firmware update for the dock (via Windows/Lenovo System Update) and for now the problem seems to be gone... can anyone confirm this?

pfactum commented 1 year ago

What FW version have you got?

MatthiasLohr commented 1 year ago

Good question. Is it 10.15?

├─ThinkPad Thunderbolt 4 Dock:
│ │   Device ID:          b28166ffadecd8d55b9c6d34e057ae848191debf
│ │   Current version:    10.15
│ │   Vendor:             Lenovo (USB:0x17EF)
│ │   Serial Number:      xxxx
│ │   GUIDs:              xxx
│ │   Device Flags:       • Updatable
│ │                       • System requires external power source
│ │                       • Device stages updates
│ │                       • Signed Payload
│ │ 
│ └─Dock Management Controller Information:
│       Device ID:        024e42ddfc4717530c7d01f01a826f93fd2a81c0
│       Current version:  10.15
│       Vendor:           Lenovo (USB:0x17EF)
│       Serial Number:    xxxxx
│       Update Error:     Use the MCU to update the DMC device
│       GUIDs:            xxx

Anyway... just while replying to this, it crashed again - so, no, not fixed. Sorry for the false claim.

andrejpodzimek commented 1 year ago

It’s quite surprising how this only affects the latest Gen 10 laptop in my case. Gen 7 and Gen 9 are unaffected.

Tried all those random hacks like pcie_port_pm=off pcie_aspm.policy=performance pci=hpiosize=8192, just in case it has something to do with PM or address space mappings. (Also tried RUNTIME_PM_DISABLE="49:00.0 79:00.0" in /etc/tlp.conf (the addresses pointing at same NIC, depending on which TB port the dock is in) — to no avail.)

I wish there was a way to switch (also) the X1 Gen 10 to “the USB way” (which <= Gen 9 was using), but OTOH, it would be even better if the 2.5 Gb/s PCIe access could work…

An example stack trace from my machine follows. The register numbers are different (almost) each time. The error is mostly -5; this -13 case occurs less frequently:

May 07 05:12:03 kernel: Intel(R) 2.5G Ethernet Linux Driver
May 07 05:12:03 kernel: Copyright(c) 2018 Intel Corporation.
May 07 05:12:03 kernel: igc 0000:79:00.0: enabling device (0000 -> 0002)
May 07 05:12:03 kernel: igc 0000:79:00.0: PTM enabled, 4ns granularity
May 07 05:12:03 kernel: igc 0000:79:00.0 (unnamed net_device) (uninitialized): PCIe link lost, device now detached
May 07 05:12:03 kernel: ------------[ cut here ]------------
May 07 05:12:03 kernel: igc: Failed to read reg 0x10!
May 07 05:12:03 kernel: WARNING: CPU: 0 PID: 3499 at drivers/net/ethernet/intel/igc/igc_main.c:6439 igc_rd32+0x8d/0xa0 [igc]
May 07 05:12:03 kernel: Modules linked in: igc(+) snd_seq_dummy snd_hrtimer snd_seq rfcomm ccm cmac algif_hash algif_skcipher af_alg uas uvcvideo videobuf2_vmalloc uvc snd_usb_audio videobuf2_memops snd_usbmidi_lib videobuf2_v4l2 snd_rawmidi videobuf2_common snd_seq_device hid_sen>
May 07 05:12:03 kernel:  snd_pcm_dmaengine snd_hda_intel intel_tcc_cooling iwlmvm x86_pkg_temp_thermal snd_intel_dspcfg intel_powerclamp snd_intel_sdw_acpi coretemp iTCO_wdt processor_thermal_device_pci ov2740 mei_pxp mei_hdcp kvm_intel mei_wdt processor_thermal_device pmt_telemet>
May 07 05:12:03 kernel:  acpi_thermal_rel acpi_tad intel_skl_int3472_discrete sparse_keymap mac_hid pkcs8_key_parser crypto_user loop fuse ip_tables x_tables dm_crypt cbc encrypted_keys trusted asn1_encoder tee xxhash_generic btrfs blake2b_generic xor raid6_pq libcrc32c crc32c_gen>
May 07 05:12:03 kernel: CPU: 0 PID: 3499 Comm: modprobe Tainted: P           OE      6.3.1-arch1-1 #1 2f4443c3fa3529b1ac13dc02f36f7de43ade3ecd
May 07 05:12:03 kernel: Hardware name: LENOVO 21CBCTO1WW/21CBCTO1WW, BIOS N3AET72W (1.37 ) 03/02/2023
May 07 05:12:03 kernel: RIP: 0010:igc_rd32+0x8d/0xa0 [igc]
May 07 05:12:03 kernel: Code: 48 c7 c6 f8 a7 0f c2 e8 41 87 00 d0 48 8b bb 28 ff ff ff e8 75 66 c0 cf 84 c0 74 bc 89 ee 48 c7 c7 20 a8 0f c2 e8 03 a8 5c cf <0f> 0b eb aa b8 ff ff ff ff c3 cc cc cc cc 0f 1f 44 00 00 90 90 90
May 07 05:12:03 kernel: RSP: 0018:ffffb57ec32a7b80 EFLAGS: 00010282
May 07 05:12:03 kernel: RAX: 0000000000000000 RBX: ffff8f7f0f202c60 RCX: 0000000000000027
May 07 05:12:03 kernel: RDX: ffff8f85bf621688 RSI: 0000000000000001 RDI: ffff8f85bf621680
May 07 05:12:03 kernel: RBP: 0000000000000010 R08: 0000000000000000 R09: ffffb57ec32a7a10
May 07 05:12:03 kernel: R10: 0000000000000003 R11: ffffffff934ca208 R12: ffff8f7f0f202000
May 07 05:12:03 kernel: R13: ffff8f7f0f2029c0 R14: ffff8f7f0f202000 R15: ffff8f7f0f202c60
May 07 05:12:03 kernel: FS:  00007fd2778ac740(0000) GS:ffff8f85bf600000(0000) knlGS:0000000000000000
May 07 05:12:03 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 07 05:12:03 kernel: CR2: 00007fd276dda000 CR3: 0000000146304004 CR4: 0000000000f70ef0
May 07 05:12:03 kernel: PKRU: 55555554
May 07 05:12:03 kernel: Call Trace:
May 07 05:12:03 kernel:  <TASK>
May 07 05:12:03 kernel:  igc_get_invariants_base+0x9a/0x270 [igc 7a87e4084baf010479bbf727709d93426faa2229]
May 07 05:12:03 kernel:  igc_probe+0x2e2/0x970 [igc 7a87e4084baf010479bbf727709d93426faa2229]
May 07 05:12:03 kernel:  local_pci_probe+0x42/0xa0
May 07 05:12:03 kernel:  pci_device_probe+0xc1/0x260
May 07 05:12:03 kernel:  ? sysfs_do_create_link_sd+0x6e/0xe0
May 07 05:12:03 kernel:  really_probe+0x19b/0x3e0
May 07 05:12:03 kernel:  ? __pfx___driver_attach+0x10/0x10
May 07 05:12:03 kernel:  __driver_probe_device+0x78/0x160
May 07 05:12:03 kernel:  driver_probe_device+0x1f/0x90
May 07 05:12:03 kernel:  __driver_attach+0xd2/0x1c0
May 07 05:12:03 kernel:  bus_for_each_dev+0x85/0xd0
May 07 05:12:03 kernel:  bus_add_driver+0x116/0x220
May 07 05:12:03 kernel:  driver_register+0x59/0x100
May 07 05:12:03 kernel:  ? __pfx_init_module+0x10/0x10 [igc 7a87e4084baf010479bbf727709d93426faa2229]
May 07 05:12:03 kernel:  do_one_initcall+0x5a/0x240
May 07 05:12:03 kernel:  do_init_module+0x4a/0x200
May 07 05:12:03 kernel:  __do_sys_init_module+0x17f/0x1b0
May 07 05:12:03 kernel:  do_syscall_64+0x5d/0x90
May 07 05:12:03 kernel:  ? exc_page_fault+0x7c/0x180
May 07 05:12:03 kernel:  entry_SYSCALL_64_after_hwframe+0x72/0xdc
May 07 05:12:03 kernel: RIP: 0033:0x7fd277321f9e
May 07 05:12:03 kernel: Code: 48 8b 0d bd ed 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 af 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 8a ed 0c 00 f7 d8 64 89 01 48
May 07 05:12:03 kernel: RSP: 002b:00007ffd132da098 EFLAGS: 00000246 ORIG_RAX: 00000000000000af
May 07 05:12:03 kernel: RAX: ffffffffffffffda RBX: 000055e599c19c10 RCX: 00007fd277321f9e
May 07 05:12:03 kernel: RDX: 000055e59825fcb2 RSI: 0000000000064aee RDI: 00007fd276d76010
May 07 05:12:03 kernel: RBP: 000055e59825fcb2 R08: 0000000000085000 R09: 0000000000000000
May 07 05:12:03 kernel: R10: 0000000000017701 R11: 0000000000000246 R12: 0000000000040000
May 07 05:12:03 kernel: R13: 000055e599c19da0 R14: 0000000000000000 R15: 000055e599c1a090
May 07 05:12:03 kernel:  </TASK>
May 07 05:12:03 kernel: ---[ end trace 0000000000000000 ]---
May 07 05:12:50 kernel: igc: probe of 0000:79:00.0 failed with error -13
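
The numeric codes in "probe ... failed with error" lines are negated errno values, so -5 corresponds to EIO and -13 to EACCES. The sketch below pulls them out of a journal excerpt and names them. It is an illustration only: the function name is hypothetical, and the errno name says nothing about why the device dropped off the bus.

```python
import errno
import os
import re

PROBE_RE = re.compile(r"(\w+): probe of (\S+) failed with error (-\d+)")


def explain_probe_failures(log_text):
    """Return (driver, device, errno name, description) for each failed probe line."""
    out = []
    for driver, device, code in PROBE_RE.findall(log_text):
        number = -int(code)  # kernel return codes are negated errno values
        name = errno.errorcode.get(number, "unknown")
        out.append((driver, device, name, os.strerror(number)))
    return out
```

Feeding it the last line of the trace above yields the driver `igc`, the device `0000:79:00.0`, and the errno name `EACCES`.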

mrhpearson commented 1 year ago

We believe this is vPro related...don't have much update to share (still investigating) but only the vPro enabled systems will access the igc networking device (apparently this is by dock design).

Can you confirm your C7 and C9 machines are non-vPro whereas your C10 is vPro?

pfactum commented 1 year ago

My T14s Gen 2 where the issue is seen is vPro.

mrhpearson commented 1 year ago

Try disabling AMT in the BIOS - I think that will kick the dock over to using the Realtek NIC and avoid the crash.

We still need to figure out what is going on with the IGC device...but from experience trying to untangle anything vPro related on Linux that can be quite painful.

taisph commented 1 year ago

Try disabling AMT in the BIOS - I think that will kick the dock over to using the Realtek NIC and avoid the crash.

We still need to figure out what is going on with the IGC device...but from experience trying to untangle anything vPro related on Linux that can be quite painful.

I can confirm disabling AMT switches to a Realtek driver, r81xx. Have been running with that for a month or two now.

The dock/laptop still loses the dock ethernet every now and then, and I usually have to pull the power from the dock to get it back. I think it usually happens after a suspend/resume cycle. But no crashes or CPU hangs. Dock firmware is stuck at 10.5 btw. No dock updates on Ubuntu 22.04.