Open hvarga opened 5 days ago
This seems like a similar issue to #1536. @lostgoat can you check this one out as well?
Did this previously work in SteamOS 3.5, or has it always been broken in this configuration for you? Ah, I see the relevant stuff in the kernel now, based on the previous issue.
I don't know, @matte-schwartz. This is a new monitor for me. I haven't had it before, so I can't tell whether this worked previously or not.
I own a Samsung G95NC (a 57" variation of your monitor) and tried to replicate your current setup. When using the PiP mode, I get several kernel crashes that differ from yours, as well as one that looks like yours.
The crashes also happen when using amd-staging-drm-next, which is the upstream AMD development kernel, so I will look for any open issues in drm/amd that match my own crashes and file a new report if necessary.
<6>[ 60.229618] PM: suspend exit
<6>[ 64.246407] [drm] DM_MST: stopping TM on aconnector: 00000000a7391675 [id: 102]
<1>[ 64.640848] BUG: unable to handle page fault for address: 0000000000006460
<1>[ 64.640860] #PF: supervisor read access in kernel mode
<1>[ 64.640866] #PF: error_code(0x0000) - not-present page
<6>[ 64.640871] PGD 0 P4D 0
<4>[ 64.640880] Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
<4>[ 64.640887] CPU: 0 PID: 150 Comm: kworker/0:1H Not tainted 6.10.0-1-amd-staging-drm-next-git-g52a0eae4140a #1 c566a4aad02216e5fbb6301f598af73561be116c
<4>[ 64.640897] Hardware name: Valve Galileo/Galileo, BIOS F7G0112 08/01/2024
<4>[ 64.640902] Workqueue: events_highpri dm_irq_work_func [amdgpu]
<4>[ 64.641449] RIP: 0010:dc_stream_get_status+0x9/0x30 [amdgpu]
<4>[ 64.641818] Code: 00 00 01 00 00 00 48 89 d8 5b e9 72 55 89 d1 66 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 66 0f 1f 00 0f 1f 44 00 00 <48> 8b 87 60 64 00 00 48 89 fe 48 8b 00 48 8b b8 80 05 00 00 e9 be
<4>[ 64.641822] RSP: 0018:ffffb33cc1957ac8 EFLAGS: 00210246
<4>[ 64.641825] RAX: 0000000000000000 RBX: ffff9987a4740000 RCX: 0000000000000000
<4>[ 64.641828] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
<4>[ 64.641830] RBP: 0000000000000000 R08: ffffb33cc1957c48 R09: ffff9987a4740000
<4>[ 64.641832] R10: ffff9987a4740308 R11: 0000000000000000 R12: 0000000000000000
<4>[ 64.641834] R13: 0000000000000000 R14: ffff99861a800000 R15: ffffb33cc1957c48
<4>[ 64.641837] FS: 0000000000000000(0000) GS:ffff99892ec00000(0000) knlGS:0000000000000000
<4>[ 64.641839] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[ 64.641842] CR2: 0000000000006460 CR3: 000000042c820000 CR4: 0000000000350ef0
<4>[ 64.641844] Call Trace:
<4>[ 64.641849] <TASK>
<4>[ 64.641854] ? __die_body.cold+0x19/0x27
<4>[ 64.641861] ? page_fault_oops+0x15a/0x2d0
<4>[ 64.641868] ? exc_page_fault+0x7e/0x180
<4>[ 64.641873] ? asm_exc_page_fault+0x26/0x30
<4>[ 64.641881] ? dc_stream_get_status+0x9/0x30 [amdgpu a564aed2e44e870c4c9c16b11df8af8947853a7d]
<4>[ 64.642245] ? srso_return_thunk+0x5/0x5f
<4>[ 64.642250] update_planes_and_stream_v1+0x8a/0x4d0 [amdgpu a564aed2e44e870c4c9c16b11df8af8947853a7d]
<4>[ 64.642622] dc_commit_updates_for_stream+0x54/0x110 [amdgpu a564aed2e44e870c4c9c16b11df8af8947853a7d]
<4>[ 64.642984] ? link_get_master_pipes_with_dpms_on+0x38/0x80 [amdgpu a564aed2e44e870c4c9c16b11df8af8947853a7d]
<4>[ 64.643388] link_set_all_streams_dpms_off_for_link+0xc5/0x110 [amdgpu a564aed2e44e870c4c9c16b11df8af8947853a7d]
<4>[ 64.643841] link_detect+0x3f9/0x520 [amdgpu a564aed2e44e870c4c9c16b11df8af8947853a7d]
<4>[ 64.644253] handle_hpd_irq_helper+0x116/0x190 [amdgpu a564aed2e44e870c4c9c16b11df8af8947853a7d]
<4>[ 64.644658] process_one_work+0x177/0x330
<4>[ 64.644666] worker_thread+0x266/0x3a0
<4>[ 64.644671] ? __pfx_worker_thread+0x10/0x10
<4>[ 64.644675] kthread+0xd2/0x100
<4>[ 64.644679] ? __pfx_kthread+0x10/0x10
<4>[ 64.644682] ret_from_fork+0x34/0x50
<4>[ 64.644687] ? __pfx_kthread+0x10/0x10
<4>[ 64.644690] ret_from_fork_asm+0x1a/0x30
<4>[ 64.644698] </TASK>
<4>[ 64.644700] Modules linked in: tls ccm michael_mic uinput snd_seq_dummy snd_hrtimer snd_seq rfcomm snd_seq_device nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_raw iptable_security nf_tables ip6table_filter ip6_tables iptable_filter cmac algif_hash algif_skcipher af_alg bnep ramoops reed_solomon qrtr_mhi joydev mousedev intel_rapl_msr amdgpu intel_rapl_common snd_soc_acp5x_mach snd_acp5x_pcm_dma snd_acp5x_i2s snd_sof_amd_rembrandt snd_sof_amd_renoir snd_sof_amd_acp qrtr snd_sof_pci snd_sof_xtensa_dsp ath11k_pci snd_sof ath11k edac_mce_amd snd_hda_codec_hdmi amdxcp drm_exec snd_sof_utils gpu_sched snd_pci_ps qmi_helpers snd_hda_intel kvm_amd drm_buddy snd_rpl_pci_acp6x hid_multitouch snd_acp_pci i2c_algo_bit hci_uart snd_acp_legacy_common mac80211 snd_intel_dspcfg btqca snd_pci_acp6x
<4>[ 64.644786] drm_suballoc_helper kvm snd_intel_sdw_acpi snd_pci_acp5x btrtl snd_hda_codec snd_rn_pci_acp3x drm_ttm_helper crct10dif_pclmul snd_soc_max98388 hid_apple snd_acp_config crc32_pclmul libarc4 snd_soc_nau8821 btintel snd_hda_core ttm snd_soc_acpi polyval_clmulni btbcm apple_mfi_fastcharge snd_pci_acp3x snd_hwdep cdc_acm ccp polyval_generic hid_steam snd_soc_core drm_display_helper gf128mul cfg80211 bluetooth ghash_clmulni_intel snd_compress sha512_ssse3 ac97_bus sha1_ssse3 cdc_mbim snd_pcm_dmaengine atkbd aesni_intel cdc_wdm snd_pcm sp5100_tco video rfkill crypto_simd libps2 cryptd rapl vivaldi_fmap wdat_wdt pcspkr snd_timer mhi i2c_piix4 wmi ltrf216a snd i2c_hid_acpi i2c_hid industrialio soundcore 8250_dw cdc_ncm cdc_ether usbnet mac_hid mii pkcs8_key_parser hid_playstation led_class_multicolor ff_memless i2c_dev crypto_user fuse loop dm_mod nfnetlink zram bpf_preload ip_tables x_tables overlay mmc_block ext4 crc16 mbcache jbd2 usbhid vfat fat xhci_plat_hcd btrfs blake2b_generic libcrc32c crc32c_generic xor
<4>[ 64.644889] dwc3 raid6_pq ulpi sdhci_pci udc_core cqhci roles serio_raw sdhci crc32c_intel nvme sha256_ssse3 mmc_core nvme_core dwc3_pci xhci_pci i8042 xhci_pci_renesas serio spi_amd
<4>[ 64.644915] CR2: 0000000000006460
<4>[ 64.644919] ---[ end trace 0000000000000000 ]---
There were similar reports, but none matched exactly, so I reported it here for now: https://gitlab.freedesktop.org/drm/amd/-/issues/3783
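For anyone reading the trace above: the register dump appears consistent with a NULL base pointer in dc_stream_get_status. RDI is 0, and the fault address (CR2) equals the 0x6460 displacement encoded in the instruction bytes marked with <48>. The following is a minimal Python sketch of that check, not a general disassembler. The helper names parse_oops and decode_mov_disp32 are hypothetical, the input is a few lines copied from the trace with the log prefixes dropped, and the decoder assumes the single instruction form seen here (REX.W mov r64, [base + disp32]).

```python
import re

# Lines copied from the trace above, with the "<4>[ time]" prefixes dropped.
OOPS = """\
BUG: unable to handle page fault for address: 0000000000006460
RIP: 0010:dc_stream_get_status+0x9/0x30 [amdgpu]
Code: 00 00 01 00 00 00 48 89 d8 5b e9 72 55 89 d1 66 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 66 0f 1f 00 0f 1f 44 00 00 <48> 8b 87 60 64 00 00 48 89 fe 48 8b 00 48 8b b8 80 05 00 00 e9 be
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
"""

# x86-64 register numbering for the ModRM r/m field (no REX.B extension).
RM_NAMES = ["RAX", "RCX", "RDX", "RBX", "RSP", "RBP", "RSI", "RDI"]


def parse_oops(text):
    """Extract fault address, RIP symbol, registers, and the faulting bytes."""
    fault = int(re.search(r"page fault for address: ([0-9a-f]+)", text).group(1), 16)
    rip = re.search(r"RIP: \w+:(\S+)", text).group(1)
    regs = {
        name: int(value, 16)
        for name, value in re.findall(r"\b(R[A-Z0-9]{2}): ([0-9a-f]{16})", text)
    }
    # The kernel marks the first byte of the faulting instruction with <..>.
    code = re.search(r"Code: (.*)", text).group(1).split()
    start = next(i for i, b in enumerate(code) if b.startswith("<"))
    insn = bytes(int(b.strip("<>"), 16) for b in code[start:start + 7])
    return fault, rip, regs, insn


def decode_mov_disp32(insn):
    """Decode only `REX.W mov r64, [base + disp32]` (bytes 48 8b /r disp32)."""
    rex, opcode, modrm = insn[0], insn[1], insn[2]
    mod, rm = modrm >> 6, modrm & 7
    # mod == 2 selects a 32-bit displacement; rm == 4 would mean a SIB byte.
    if rex != 0x48 or opcode != 0x8B or mod != 2 or rm == 4:
        raise ValueError("not the simple mov form this sketch handles")
    disp = int.from_bytes(insn[3:7], "little", signed=True)
    return RM_NAMES[rm], disp


fault, rip, regs, insn = parse_oops(OOPS)
base, disp = decode_mov_disp32(insn)
print(f"{rip}: load from {base}+{disp:#x}, {base}={regs[base]:#x}, fault at {fault:#x}")
print("NULL base pointer:", regs[base] == 0 and regs[base] + disp == fault)
```

If the last line reports True, the load went through a zero base register plus a fixed offset, which is what a NULL structure pointer dereference typically looks like. Which structure field lives at that offset is not something this sketch can tell; that needs the matching amdgpu build and its debug info.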
Your system information
Note that the issue has also been seen on main as well as on Preview channel.
Please describe your issue in as much detail as possible:
After resuming from sleep, the Steam Deck restarts. The issue is reproducible on my Steam Deck every time.
This issue only happens when the Steam Deck is connected to a Steam Dock that is connected to a single monitor in an MST configuration, meaning both HDMI and DisplayPort are connected to the same monitor, a Samsung C49RG90SSR, configured to use PBP (Picture By Picture). When the same Steam Deck is connected to a different monitor over a single cable (either HDMI or DisplayPort), the issue is not reproducible: I can wake the Steam Deck from sleep and start working without any restarts. I haven't tried connecting to the Samsung monitor using a single cable over HDMI or DisplayPort, though. I assume that this would not use an MST configuration, just like my second monitor, and hence would not have the issue.
After some analysis, I concluded that the restart after waking from sleep is caused by a Linux crash. Just before the crash, a warning is emitted:
After that, we can see that Linux tried to dereference a NULL pointer, which caused it to crash and restart:
This is the trace from the kdumpst-202411230913.zip generated from the preview channel.
Steps for reproducing this issue:
After the third step, the Steam Deck reboots into Game Mode, which is not expected. Instead, it should remain in Desktop Mode.
Not sure if this has any significance, but I haven't tried sleeping and resuming the Steam Deck from Game Mode, since I primarily use Desktop Mode.