NixOS / nixpkgs

Nix Packages collection & NixOS
MIT License

When reconnecting Thunderbolt dock, I get a kernel stacktrace in the logs #201661

Open miniBill opened 2 years ago

miniBill commented 2 years ago

Describe the bug

When I reconnect my Thunderbolt dock (CalDigit TS4), I get a stack trace in dmesg and the dock's USB hub doesn't work (the DisplayPort output and the HDMI adapter connected to the Thunderbolt port both work).

Steps To Reproduce

Steps to reproduce the behavior:

  1. Disconnect the dock
  2. Reconnect the dock

Expected behavior

The hub gets recognized and everything works.

Screenshots

[70712.343724] thunderbolt 0000:04:00.0: can't change power state from D3cold to D0 (config space inaccessible)
[70712.343783] ------------[ cut here ]------------
[70712.343784] thunderbolt 0000:04:00.0: interrupt for TX ring 0 is already enabled
[70712.343812] WARNING: CPU: 3 PID: 98115 at drivers/thunderbolt/nhi.c:111 ring_interrupt_active+0x1d5/0x240 [thunderbolt]
[70712.343819] Modules linked in: snd_seq_midi snd_seq_midi_event xt_MASQUERADE xt_mark nft_chain_nat nf_nat snd_seq_dummy snd_hrtimer ccm qrtr ns hid_logitech_hidpp uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common snd_usb_audio videodev hid_logitech_dj snd_usbmidi_lib joydev mousedev mc input_leds af_packet hid_generic btusb btrtl btbcm btintel usbhid bluetooth hid ecdh_generic ecc ip6_tables xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip6t_rpfilter ipt_rpfilter xt_pkttype xt_LOG nf_log_syslog xt_tcpudp nft_compat nft_counter msr nf_tables libcrc32c nfnetlink sch_fq_codel uinput nvidia_uvm(PO) ctr atkbd libps2 serio loop tap macvlan bridge stp llc snd_sof_pci_intel_cnl snd_sof_intel_hda_common snd_soc_hdac_hda soundwire_intel soundwire_generic_allocation soundwire_cadence intel_rapl_msr tun intel_rapl_common snd_sof_intel_hda snd_sof_pci snd_sof_xtensa_dsp snd_sof mdev vfio_iommu_type1 vfio soundwire_bus i915 iwlmvm snd_soc_skl snd_soc_sst_ipc
[70712.343844]  snd_soc_sst_dsp snd_hda_ext_core snd_soc_acpi_intel_match snd_soc_acpi mac80211 i2c_designware_platform i2c_designware_core snd_soc_core snd_hda_codec_realtek intel_tcc_cooling x86_pkg_temp_thermal intel_powerclamp mei_hdcp eeepc_wmi nvidia_drm(PO) asus_wmi snd_hda_codec_generic iTCO_wdt platform_profile nvidia_modeset(PO) battery intel_pmc_bxt sparse_keymap ledtrig_audio snd_compress coretemp ee1004 watchdog evdev crc32_pclmul libarc4 wmi_bmof intel_wmi_thunderbolt mxm_wmi ttm led_class ac97_bus ghash_clmulni_intel mac_hid snd_hda_codec_hdmi snd_pcm_dmaengine nls_iso8859_1 nvidia(PO) iwlwifi aesni_intel cec snd_hda_intel nls_cp437 snd_intel_dspcfg libaes snd_intel_sdw_acpi crypto_simd snd_hda_codec cmdlinepart cryptd vfat intel_spi_pci rapl intel_lpss_pci fat intel_cstate intel_lpss cfg80211 snd_hda_core intel_spi drm_kms_helper intel_gtt deflate intel_uncore igc spi_nor snd_hwdep mei_me i2c_algo_bit fb_sys_fops idma64 ptp syscopyarea virt_dma thunderbolt sysfillrect mtd
[70712.343867]  efi_pstore snd_pcm mei pps_core sysimgblt rfkill mfd_core thermal fan i2c_i801 i2c_smbus wmi video tiny_power_button intel_pmc_core acpi_pad acpi_tad button kvm_intel kvm drm irqbypass snd_rawmidi snd_seq snd_seq_device snd_timer snd soundcore agpgart fuse backlight i2c_core pstore configfs efivarfs ip_tables x_tables autofs4 ext4 crc32c_generic crc16 mbcache jbd2 raid1 sd_mod md_mod ahci libahci xhci_pci xhci_pci_renesas libata xhci_hcd usbcore nvme scsi_mod nvme_core crc32c_intel rtc_cmos t10_pi crc_t10dif crct10dif_generic crct10dif_pclmul crct10dif_common scsi_common usb_common dm_snapshot dm_bufio dm_mod
[70712.343888] CPU: 3 PID: 98115 Comm: kworker/3:2 Tainted: P        W  O      5.15.78 #1-NixOS
[70712.343890] Hardware name: ASUS System Product Name/ROG STRIX Z490-E GAMING, BIOS 0707 07/21/2020
[70712.343891] Workqueue: pm pm_runtime_work
[70712.343894] RIP: 0010:ring_interrupt_active+0x1d5/0x240 [thunderbolt]
[70712.343899] Code: 00 00 00 44 89 44 24 04 e8 48 8d f3 ef 44 8b 44 24 04 4d 89 f1 4c 89 e1 48 89 c6 4c 89 fa 48 c7 c7 28 36 ee c0 e8 67 11 1c f0 <0f> 0b e9 17 ff ff ff 0f b6 43 78 d3 e0 09 c7 e9 db fe ff ff 44 03
[70712.343899] RSP: 0018:ffffb612460fbc78 EFLAGS: 00010086
[70712.343900] RAX: 0000000000000000 RBX: ffff9b8c86002180 RCX: 0000000000000027
[70712.343901] RDX: ffff9b93fc8dc648 RSI: 0000000000000001 RDI: ffff9b93fc8dc640
[70712.343902] RBP: 00000000ffffffff R08: 0000000000000000 R09: ffffb612460fbaa8
[70712.343902] R10: ffffb612460fbaa0 R11: ffffffffb1f38da8 R12: ffffffffc0ee21db
[70712.343903] R13: 0000000000038200 R14: ffffffffc0ee21ca R15: ffff9b8c81ef6030
[70712.343904] FS:  0000000000000000(0000) GS:ffff9b93fc8c0000(0000) knlGS:0000000000000000
[70712.343905] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[70712.343905] CR2: 00002b0e93c3c128 CR3: 0000000328106004 CR4: 00000000007706e0
[70712.343906] PKRU: 55555554
[70712.343906] Call Trace:
[70712.343908]  <TASK>
[70712.343909]  tb_ring_start+0x163/0x310 [thunderbolt]
[70712.343915]  tb_ctl_start+0x22/0xa0 [thunderbolt]
[70712.343920]  tb_domain_runtime_resume+0x15/0x40 [thunderbolt]
[70712.343927]  pci_pm_runtime_resume+0xa7/0xd0
[70712.343929]  ? pci_pm_freeze_noirq+0x110/0x110
[70712.343930]  __rpm_callback+0x43/0x120
[70712.343932]  ? pci_pm_freeze_noirq+0x110/0x110
[70712.343933]  rpm_callback+0x5d/0x70
[70712.343934]  ? pci_pm_freeze_noirq+0x110/0x110
[70712.343935]  rpm_resume+0x506/0x7a0
[70712.343936]  rpm_suspend+0x6e1/0x6f0
[70712.343937]  ? __schedule+0x2e9/0x1350
[70712.343939]  ? pcie_pme_work_fn+0x27b/0x310
[70712.343941]  pm_runtime_work+0x92/0xa0
[70712.343942]  process_one_work+0x1ee/0x390
[70712.343944]  worker_thread+0x53/0x3e0
[70712.343945]  ? process_one_work+0x390/0x390
[70712.343946]  kthread+0x124/0x150
[70712.343947]  ? set_kthread_struct+0x50/0x50
[70712.343949]  ret_from_fork+0x1f/0x30
[70712.343952]  </TASK>
[70712.343952] ---[ end trace f13c01b63824f2b1 ]---
[70712.343999] ------------[ cut here ]------------
[70712.344000] thunderbolt 0000:04:00.0: interrupt for RX ring 0 is already enabled
[70712.344010] WARNING: CPU: 3 PID: 98115 at drivers/thunderbolt/nhi.c:111 ring_interrupt_active+0x1d5/0x240 [thunderbolt]
[70712.344015] Modules linked in: snd_seq_midi snd_seq_midi_event xt_MASQUERADE xt_mark nft_chain_nat nf_nat snd_seq_dummy snd_hrtimer ccm qrtr ns hid_logitech_hidpp uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common snd_usb_audio videodev hid_logitech_dj snd_usbmidi_lib joydev mousedev mc input_leds af_packet hid_generic btusb btrtl btbcm btintel usbhid bluetooth hid ecdh_generic ecc ip6_tables xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip6t_rpfilter ipt_rpfilter xt_pkttype xt_LOG nf_log_syslog xt_tcpudp nft_compat nft_counter msr nf_tables libcrc32c nfnetlink sch_fq_codel uinput nvidia_uvm(PO) ctr atkbd libps2 serio loop tap macvlan bridge stp llc snd_sof_pci_intel_cnl snd_sof_intel_hda_common snd_soc_hdac_hda soundwire_intel soundwire_generic_allocation soundwire_cadence intel_rapl_msr tun intel_rapl_common snd_sof_intel_hda snd_sof_pci snd_sof_xtensa_dsp snd_sof mdev vfio_iommu_type1 vfio soundwire_bus i915 iwlmvm snd_soc_skl snd_soc_sst_ipc
[70712.344035]  snd_soc_sst_dsp snd_hda_ext_core snd_soc_acpi_intel_match snd_soc_acpi mac80211 i2c_designware_platform i2c_designware_core snd_soc_core snd_hda_codec_realtek intel_tcc_cooling x86_pkg_temp_thermal intel_powerclamp mei_hdcp eeepc_wmi nvidia_drm(PO) asus_wmi snd_hda_codec_generic iTCO_wdt platform_profile nvidia_modeset(PO) battery intel_pmc_bxt sparse_keymap ledtrig_audio snd_compress coretemp ee1004 watchdog evdev crc32_pclmul libarc4 wmi_bmof intel_wmi_thunderbolt mxm_wmi ttm led_class ac97_bus ghash_clmulni_intel mac_hid snd_hda_codec_hdmi snd_pcm_dmaengine nls_iso8859_1 nvidia(PO) iwlwifi aesni_intel cec snd_hda_intel nls_cp437 snd_intel_dspcfg libaes snd_intel_sdw_acpi crypto_simd snd_hda_codec cmdlinepart cryptd vfat intel_spi_pci rapl intel_lpss_pci fat intel_cstate intel_lpss cfg80211 snd_hda_core intel_spi drm_kms_helper intel_gtt deflate intel_uncore igc spi_nor snd_hwdep mei_me i2c_algo_bit fb_sys_fops idma64 ptp syscopyarea virt_dma thunderbolt sysfillrect mtd
[70712.344053]  efi_pstore snd_pcm mei pps_core sysimgblt rfkill mfd_core thermal fan i2c_i801 i2c_smbus wmi video tiny_power_button intel_pmc_core acpi_pad acpi_tad button kvm_intel kvm drm irqbypass snd_rawmidi snd_seq snd_seq_device snd_timer snd soundcore agpgart fuse backlight i2c_core pstore configfs efivarfs ip_tables x_tables autofs4 ext4 crc32c_generic crc16 mbcache jbd2 raid1 sd_mod md_mod ahci libahci xhci_pci xhci_pci_renesas libata xhci_hcd usbcore nvme scsi_mod nvme_core crc32c_intel rtc_cmos t10_pi crc_t10dif crct10dif_generic crct10dif_pclmul crct10dif_common scsi_common usb_common dm_snapshot dm_bufio dm_mod
[70712.344067] CPU: 3 PID: 98115 Comm: kworker/3:2 Tainted: P        W  O      5.15.78 #1-NixOS
[70712.344068] Hardware name: ASUS System Product Name/ROG STRIX Z490-E GAMING, BIOS 0707 07/21/2020
[70712.344069] Workqueue: pm pm_runtime_work
[70712.344070] RIP: 0010:ring_interrupt_active+0x1d5/0x240 [thunderbolt]
[70712.344075] Code: 00 00 00 44 89 44 24 04 e8 48 8d f3 ef 44 8b 44 24 04 4d 89 f1 4c 89 e1 48 89 c6 4c 89 fa 48 c7 c7 28 36 ee c0 e8 67 11 1c f0 <0f> 0b e9 17 ff ff ff 0f b6 43 78 d3 e0 09 c7 e9 db fe ff ff 44 03
[70712.344075] RSP: 0018:ffffb612460fbc78 EFLAGS: 00010086
[70712.344076] RAX: 0000000000000000 RBX: ffff9b8c86002b40 RCX: 0000000000000027
[70712.344077] RDX: ffff9b93fc8dc648 RSI: 0000000000000001 RDI: ffff9b93fc8dc640
[70712.344077] RBP: 00000000ffffffff R08: 0000000000000000 R09: ffffb612460fbaa8
[70712.344078] R10: ffffb612460fbaa0 R11: ffffffffb1f38da8 R12: ffffffffc0ee21e3
[70712.344078] R13: 0000000000038200 R14: ffffffffc0ee21ca R15: ffff9b8c81ef6030
[70712.344079] FS:  0000000000000000(0000) GS:ffff9b93fc8c0000(0000) knlGS:0000000000000000
[70712.344080] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[70712.344080] CR2: 00002b0e93c3c128 CR3: 0000000328106004 CR4: 00000000007706e0
[70712.344081] PKRU: 55555554
[70712.344081] Call Trace:
[70712.344082]  <TASK>
[70712.344082]  tb_ring_start+0x163/0x310 [thunderbolt]
[70712.344087]  tb_ctl_start+0x2c/0xa0 [thunderbolt]
[70712.344092]  tb_domain_runtime_resume+0x15/0x40 [thunderbolt]
[70712.344098]  pci_pm_runtime_resume+0xa7/0xd0
[70712.344099]  ? pci_pm_freeze_noirq+0x110/0x110
[70712.344100]  __rpm_callback+0x43/0x120
[70712.344101]  ? pci_pm_freeze_noirq+0x110/0x110
[70712.344103]  rpm_callback+0x5d/0x70
[70712.344104]  ? pci_pm_freeze_noirq+0x110/0x110
[70712.344105]  rpm_resume+0x506/0x7a0
[70712.344106]  rpm_suspend+0x6e1/0x6f0
[70712.344107]  ? __schedule+0x2e9/0x1350
[70712.344108]  ? pcie_pme_work_fn+0x27b/0x310
[70712.344110]  pm_runtime_work+0x92/0xa0
[70712.344111]  process_one_work+0x1ee/0x390
[70712.344112]  worker_thread+0x53/0x3e0
[70712.344113]  ? process_one_work+0x390/0x390
[70712.344114]  kthread+0x124/0x150
[70712.344115]  ? set_kthread_struct+0x50/0x50
[70712.344117]  ret_from_fork+0x1f/0x30
[70712.344119]  </TASK>
[70712.344119] ---[ end trace f13c01b63824f2b2 ]---

Additional context

The dock works correctly on macOS.

Notify maintainers

@hmenke @ethics-gradient @jonringer @wizeman @fpletz @globin

Metadata

[user@system:~]$ nix-shell -p nix-info --run "nix-info -m"
 - system: `"x86_64-linux"`
 - host os: `Linux 5.15.78, NixOS, 22.05 (Quokka), 22.05.4243.814f8f3363c`
 - multi-user?: `yes`
 - sandbox: `yes`
 - version: `nix-env (Nix) 2.8.1`
 - channels(minibill): `"home-manager-22.05.tar.gz, unstable"`
 - channels(root): `"musnix, nixos-22.05"`
 - nixpkgs: `/nix/var/nix/profiles/per-user/root/channels/nixos`

hmenke commented 2 years ago

Looks like a kernel bug, most likely unrelated to NixOS: https://bugs.archlinux.org/task/73413 https://retrace.fedoraproject.org/faf/problems/512292/

Also, you have pinged all of the ZFS maintainers instead of the kernel maintainers.

miniBill commented 2 years ago

Whooops, sorry for pinging the wrong folks 😅 Should I ping the actual maintainers?

Looking into the Arch thread, it seems 5.16 could fix it, but it's not available in nixpkgs? 🤔 6.0 is not compatible with the proprietary NVIDIA module (I get a compilation error), so I'm now trying to revert to 5.10 to see what happens.

EDIT: 5.10 is apparently worse: my USB hub is not recognized even when directly attached.