Open SoloRobo opened 1 month ago
It appears that maximising any window causes Cosmic to hard freeze, to the point where I have to hard reset the physical machine.
I wonder if it is because I am using a 3:2 screen on a Framework 13 (2880x1920), as this is pretty non-standard and perhaps a calculation is assuming 16:9?
Fedora 41 Beta cosmic-comp 1.0.0~alpha.2^git20240923afdb656
This happened before I went to Fedora 41 Beta
Alternatively, it's simply a combination of options I have with the applets.
Is there any debug info I can generate or find?
Probably related:
When tty switching keybindings don't work, it's possible to use magic sysrq (Alt+SysRq+R) to take the keyboard out of raw mode, so the kernel will handle the tty switch binding. (On many distros, this requires the kernel.sysrq sysctl to be changed first.)
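As a rough illustration of why the sysctl may need changing: `kernel.sysrq` is either 0 (all off), 1 (all on), or a bitmask of allowed function groups, and the "unraw" function belongs to the keyboard-control group (bit value 4 in the kernel's sysrq documentation). The helper names below are made up for this sketch; it only interprets a value you have already read from `/proc/sys/kernel/sysrq`.

```rust
/// Bit in the `kernel.sysrq` bitmask that permits keyboard control
/// (SAK and "unraw"), as listed in the kernel's sysrq documentation.
const SYSRQ_ENABLE_KEYBOARD: u32 = 0x4;

/// Parses the contents of /proc/sys/kernel/sysrq, for example "16\n".
/// (Hypothetical helper name, for illustration only.)
pub fn parse_sysrq(contents: &str) -> Option<u32> {
    contents.trim().parse().ok()
}

/// Returns true if the given `kernel.sysrq` value lets Alt+SysRq+R
/// take the keyboard out of raw mode.
pub fn sysrq_allows_unraw(value: u32) -> bool {
    // 0 disables everything, 1 enables everything, and any other
    // value is treated as a bitmask of allowed function groups.
    value == 1 || (value & SYSRQ_ENABLE_KEYBOARD) != 0
}
```

With a value of 16 (sync only), for instance, `sysrq_allows_unraw` returns false, so the sysctl would have to be set to 1, or to a mask that includes 4, before the tty switch trick can work.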
I don't think resolution would be the problem. The GPU model is probably more relevant. Presumably it's an iGPU. Exactly what model of CPU does the system have?
Perhaps related to direct scanout.
Not sure if there's an easy way to debug, but you could run a debug build of cosmic-comp, ssh into the machine from a different system, and attach gdb to the process to see what line cosmic-comp is freezing on. (Doing this on a tty, after switching with magic sysrq, may also help.) Definitely not the easiest way to test.
Does it produce any dmesg errors?
Not OP, but same issue:
OS: Fedora Linux 41 (Workstation Edition) x86_64
Kernel: Linux 6.12.0-0.rc0.20240927gt075dbe9f.413.vanilla.fc41.x86_64
Resolution: 2560x1440 @ 100 Hz [External]
CPU: AMD Ryzen 5 5600
GPU: AMD Radeon RX 6600 [Discrete]
The problem doesn't occur with kernel 6.10.
Is there a way to disable direct scanout? I think direct scanout is causing the issues I'm having with #868
Good point; we should probably have an env var to test without direct scanout, like Anvil.
You can try this:
diff --git a/src/backend/kms/surface/mod.rs b/src/backend/kms/surface/mod.rs
index d0cfb8d..32aaf4a 100644
--- a/src/backend/kms/surface/mod.rs
+++ b/src/backend/kms/surface/mod.rs
@@ -624,7 +624,8 @@ impl SurfaceThreadState {
                     cursor_size,
                     Some(gbm),
                 ) {
-                    Ok(compositor) => {
+                    Ok(mut compositor) => {
+                        compositor.use_direct_scanout(false);
                         self.active.store(true, Ordering::SeqCst);
                         self.compositor = Some(compositor);
                         Ok(())
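For anyone who would rather not rebuild each time, the env var idea mentioned above could look roughly like this. This is only a sketch: the variable name `COSMIC_DISABLE_DIRECT_SCANOUT` and both function names are hypothetical, and nothing here is existing cosmic-comp code.

```rust
use std::env;

/// Hypothetical variable name, chosen for illustration only.
pub const DISABLE_SCANOUT_VAR: &str = "COSMIC_DISABLE_DIRECT_SCANOUT";

/// Decides whether direct scanout stays enabled, given the variable's value.
/// Unset, empty (after trimming), or "0" keeps it enabled; anything else
/// disables it.
pub fn direct_scanout_enabled_for(value: Option<&str>) -> bool {
    match value.map(str::trim) {
        None | Some("") | Some("0") => true,
        Some(_) => false,
    }
}

/// Reads the (hypothetical) variable from the process environment.
/// A value that is not valid Unicode is treated the same as unset.
pub fn direct_scanout_enabled() -> bool {
    let value = env::var(DISABLE_SCANOUT_VAR).ok();
    direct_scanout_enabled_for(value.as_deref())
}
```

In the patch above, the hard-coded `false` would then become `compositor.use_direct_scanout(direct_scanout_enabled());`, so the default behaviour stays the same unless the variable is set.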
I can also reproduce the problem. It seems to happen when maximizing a window (Super+M), swapping a window (Super+X) or when trying to stack 2 windows (Super+U, Super+S). All actions performed in tiling mode. In all cases a version of this stack trace is printed to dmesg and the UI freezes. SSH into the host still works.
[ 3000.716511] BUG: unable to handle page fault for address: 00000000212d216e
[ 3000.716519] #PF: supervisor read access in kernel mode
[ 3000.716524] #PF: error_code(0x0000) - not-present page
[ 3000.716527] PGD 0 P4D 0
[ 3000.716535] Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 3000.716543] CPU: 10 UID: 0 PID: 3064 Comm: kworker/u64:35 Tainted: G O 6.11.0 #1-NixOS
[ 3000.716550] Tainted: [O]=OOT_MODULE
[ 3000.716553] Hardware name: Framework Laptop 13 (AMD Ryzen 7040Series)/FRANMDCP07, BIOS 03.05 03/29/2024
[ 3000.716557] Workqueue: events_unbound commit_work
[ 3000.716581] RIP: 0010:copy_stream_update_to_stream.isra.0+0x2df/0x6f0 [amdgpu]
[ 3000.717078] Code: 1f 48 8b 10 49 89 97 f0 00 00 00 48 8b 50 08 49 89 97 f8 00 00 00 8b 40 10 41 89 87 00 01 00 00 49 8b 44 24 78 48 85 c0 74 0a <0f> b6 00 41 88 87 88 64 00 00 49 8b 44 24 60 48 85 c0 74 36 48 8b
[ 3000.717082] RSP: 0018:ffffa108465eb9d8 EFLAGS: 00010202
[ 3000.717086] RAX: 00000000212d216e RBX: 0000000000000004 RCX: 0000000000000000
[ 3000.717090] RDX: ffff8fb2da8a9e30 RSI: ffff8fb2e37f8000 RDI: 0000000000000000
[ 3000.717093] RBP: ffffa108465eba30 R08: 0000000000000000 R09: 0000000000000000
[ 3000.717095] R10: 0000000000000000 R11: ffff8fb2da8a99e0 R12: ffff8fb2da8a9e30
[ 3000.717098] R13: ffff8fb151000000 R14: ffff8fb2da8a9e30 R15: ffff8fb2e37f8000
[ 3000.717101] FS: 0000000000000000(0000) GS:ffff8fbfa1f00000(0000) knlGS:0000000000000000
[ 3000.717104] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3000.717107] CR2: 00000000212d216e CR3: 000000014473e000 CR4: 0000000000f50ef0
[ 3000.717110] PKRU: 55555554
[ 3000.717113] Call Trace:
[ 3000.717118] <TASK>
[ 3000.717123] ? __die+0x23/0x70
[ 3000.717131] ? page_fault_oops+0x173/0x5a0
[ 3000.717141] ? exc_page_fault+0x71/0x150
[ 3000.717149] ? asm_exc_page_fault+0x26/0x30
[ 3000.717159] ? copy_stream_update_to_stream.isra.0+0x2df/0x6f0 [amdgpu]
[ 3000.717539] ? psi_task_switch+0xd6/0x230
[ 3000.717546] update_planes_and_stream_state+0x23e/0x520 [amdgpu]
[ 3000.717903] ? srso_alias_return_thunk+0x5/0xfbef5
[ 3000.717910] ? srso_alias_return_thunk+0x5/0xfbef5
[ 3000.717914] ? srso_alias_return_thunk+0x5/0xfbef5
[ 3000.717919] ? commit_minimal_transition_state+0x113/0x350 [amdgpu]
[ 3000.718169] update_planes_and_stream_v2+0x1b4/0x5f0 [amdgpu]
[ 3000.718309] ? __entry_text_end+0x101e86/0x101e89
[ 3000.718315] ? dma_fence_array_release+0x7c/0xa0
[ 3000.718318] ? srso_alias_return_thunk+0x5/0xfbef5
[ 3000.718320] ? kfree+0x2b7/0x300
[ 3000.718325] ? srso_alias_return_thunk+0x5/0xfbef5
[ 3000.718326] ? kvfree_call_rcu+0x21f/0x360
[ 3000.718331] ? srso_alias_return_thunk+0x5/0xfbef5
[ 3000.718333] ? srso_alias_return_thunk+0x5/0xfbef5
[ 3000.718335] ? wait_for_completion_timeout+0x135/0x160
[ 3000.718339] ? commit_tail+0x91/0x130
[ 3000.718342] ? process_one_work+0x18f/0x3b0
[ 3000.718346] ? worker_thread+0x21f/0x330
[ 3000.718348] ? __pfx_worker_thread+0x10/0x10
[ 3000.718350] ? kthread+0xcd/0x100
[ 3000.718353] ? __pfx_kthread+0x10/0x10
[ 3000.718355] ? ret_from_fork+0x31/0x50
[ 3000.718359] ? __pfx_kthread+0x10/0x10
[ 3000.718361] ? ret_from_fork_asm+0x1a/0x30
[ 3000.718366] </TASK>
[ 3000.718367] Modules linked in: snd_seq_dummy snd_hrtimer snd_seq snd_seq_device rfcomm ccm af_packet cmac algif_hash algif_skcipher af_alg bnep nls_iso8859_1 nls_cp437 vfat fat iwlmvm snd_sof_amd_acp63 snd_sof_amd_vangogh snd_sof_amd_rembrandt snd_sof_amd_renoir snd_sof_amd_acp xt_conntrack snd_sof_pci nf_conntrack snd_sof_xtensa_dsp mousedev mac80211 nf_defrag_ipv6 nf_defrag_ipv4 snd_sof hid_sensor_als hid_sensor_trigger industrialio_triggered_buffer kfifo_buf hid_sensor_iio_common snd_sof_utils industrialio snd_pci_ps snd_amd_sdw_acpi xt_policy soundwire_amd ptp soundwire_generic_allocation soundwire_bus pps_core snd_hda_codec_realtek libarc4 edac_mce_amd ip6t_rpfilter joydev snd_hda_codec_generic snd_hda_scodec_component edac_core snd_soc_core ipt_rpfilter snd_hda_codec_hdmi intel_rapl_msr spd5118 amd_atl snd_compress intel_rapl_common snd_hda_intel ac97_bus hid_multitouch hid_sensor_hub snd_pcm_dmaengine xt_pkttype iwlwifi kvm_amd snd_rpl_pci_acp6x snd_intel_dspcfg snd_intel_sdw_acpi xt_LOG snd_acp_pci
[ 3000.718427] snd_hda_codec snd_acp_legacy_common btusb hid_generic nf_log_syslog snd_hda_core sp5100_tco ip6t_REJECT kvm cfg80211 snd_pci_acp6x btrtl snd_hwdep btintel watchdog amd_pmf snd_pci_acp5x nf_reject_ipv6 snd_pcm amdtee crct10dif_pclmul btbcm snd_rn_pci_acp3x crc32_pclmul ucsi_acpi snd_acp_config ipt_REJECT amd_sfh nf_reject_ipv4 polyval_clmulni i2c_piix4 btmtk snd_timer typec_ucsi snd_soc_acpi polyval_generic cros_ec_hwmon cros_ec_sysfs cros_ec_debugfs cros_ec_chardev snd ghash_clmulni_intel bluetooth rapl typec k10temp framework_laptop(O) tiny_power_button wmi_bmof tpm_crb platform_profile rfkill ccp soundcore snd_pci_acp3x i2c_smbus thermal roles ac i2c_hid_acpi xt_tcpudp i2c_hid tpm_tis cros_charge_control leds_cros_ec hid tpm_tis_core led_class_multicolor evdev cros_usbpd_logger button cros_usbpd_charger battery amd_pmc cros_kbd_led_backlight cros_usbpd_notify mac_hid gpio_cros_ec nft_compat serio_raw cros_ec_dev nf_tables sch_fq_codel loop tun tap macvlan bridge stp llc cros_ec_lpcs cros_ec fuse
[ 3000.718497] efi_pstore configfs nfnetlink efivarfs dmi_sysfs ip_tables x_tables autofs4 dm_crypt cbc encrypted_keys trusted asn1_encoder tee tpm rng_core libaescfb ecdh_generic ecc input_leds led_class atkbd xhci_pci libps2 xhci_pci_renesas vivaldi_fmap sha512_ssse3 thunderbolt nvme sha256_ssse3 sha1_ssse3 xhci_hcd aesni_intel nvme_core gf128mul crypto_simd cryptd i8042 nvme_auth rtc_cmos serio amdgpu video wmi backlight amdxcp i2c_algo_bit drm_ttm_helper ttm drm_exec gpu_sched drm_suballoc_helper drm_buddy drm_display_helper firmware_class cec crc16 dm_mod dax btrfs blake2b_generic libcrc32c crc32c_generic crc32c_intel xor raid6_pq
[ 3000.718548] CR2: 00000000212d216e
[ 3000.718550] ---[ end trace 0000000000000000 ]---
[ 3001.038324] pstore: backend (efi_pstore) writing error (-28)
[ 3001.038326] RIP: 0010:copy_stream_update_to_stream.isra.0+0x2df/0x6f0 [amdgpu]
[ 3001.038499] Code: 1f 48 8b 10 49 89 97 f0 00 00 00 48 8b 50 08 49 89 97 f8 00 00 00 8b 40 10 41 89 87 00 01 00 00 49 8b 44 24 78 48 85 c0 74 0a <0f> b6 00 41 88 87 88 64 00 00 49 8b 44 24 60 48 85 c0 74 36 48 8b
[ 3001.038500] RSP: 0018:ffffa108465eb9d8 EFLAGS: 00010202
[ 3001.038502] RAX: 00000000212d216e RBX: 0000000000000004 RCX: 0000000000000000
[ 3001.038504] RDX: ffff8fb2da8a9e30 RSI: ffff8fb2e37f8000 RDI: 0000000000000000
[ 3001.038505] RBP: ffffa108465eba30 R08: 0000000000000000 R09: 0000000000000000
[ 3001.038506] R10: 0000000000000000 R11: ffff8fb2da8a99e0 R12: ffff8fb2da8a9e30
[ 3001.038507] R13: ffff8fb151000000 R14: ffff8fb2da8a9e30 R15: ffff8fb2e37f8000
[ 3001.038508] FS: 0000000000000000(0000) GS:ffff8fbfa1f00000(0000) knlGS:0000000000000000
[ 3001.038509] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3001.038510] CR2: 00000000212d216e CR3: 000000014473e000 CR4: 0000000000f50ef0
[ 3001.038512] PKRU: 55555554
[ 3001.038513] note: kworker/u64:35[3064] exited with irqs disabled
Did not see this posted in one of the issues, hope it helps. Maybe a bug in amdgpu?
EDIT: The problem goes away when I apply the provided patch for disabling direct scanout.
Also seeing this issue on my Fedora 41 system. This is going to become more of a problem when Fedora 41 comes out.
It must be related to package versions that exist in 41 and not in 40, for example the kernel version.
I don't seem to see an issue on an Intel system with Linux 6.11.
Updating my Fedora SD card to Fedora 41 beta, I don't seem to see an issue on the Steam Deck (so AMD RDNA 2) with Linux 6.11.2 either.
I'll try it again, and report back
Yep, still hard freezes:
ryanbrue@fedora:~$ rpm -q cosmic-comp
cosmic-comp-1.0.0~alpha.2^git20241008.be38da4-1.fc41.x86_64
ryanbrue@fedora:~$ uname -a
Linux fedora 6.11.2-300.fc41.x86_64 #1 SMP PREEMPT_DYNAMIC Fri Oct 4 16:44:08 UTC 2024 x86_64 GNU/Linux
And here's a fastfetch, partially just for fun (not on the cosmic session because it's, well, hard freezing):
I got a log over ssh (since everything freezes on the laptop I trigger this bug on): comp-weirdness.log
CC @Drakulix
Not a solution, but you can sudo dnf install kernel-6.10* --releasever=40 till the bug is fixed.
Interesting, so that means it's probably a kernel issue being triggered by cosmic-comp. It's possible this is something that can only be fixed upstream then :thinking:
I just finished a git bisect for kernel v6.10 to v6.11. This seems to be the first commit where the freeze happens: https://github.com/torvalds/linux/commit/1b04dcca4fb10dd3834893a60de74edd99f2bfaf
git bisect start
# status: waiting for both good and bad commits
# good: [0c3836482481200ead7b416ca80c68a29cfdaabd] Linux 6.10
git bisect good 0c3836482481200ead7b416ca80c68a29cfdaabd
# status: waiting for bad commit, 1 good commit known
# bad: [98f7e32f20d28ec452afb208f9cffc08448a2652] Linux 6.11
git bisect bad 98f7e32f20d28ec452afb208f9cffc08448a2652
# bad: [b3ce7a30847a54a7f96a35e609303d8afecd460b] Merge tag 'drm-next-2024-07-18' of https://gitlab.freedesktop.org/drm/kernel
git bisect bad b3ce7a30847a54a7f96a35e609303d8afecd460b
# good: [51835949dda3783d4639cfa74ce13a3c9829de00] Merge tag 'net-next-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
git bisect good 51835949dda3783d4639cfa74ce13a3c9829de00
# good: [20baedb8033d0ba6ae382fc9974b481fdb32e7ef] drm/xe/vf: Skip attempt to start GuC PC if VF
git bisect good 20baedb8033d0ba6ae382fc9974b481fdb32e7ef
# good: [b1bc554e009e3aeed7e4cfd2e717c7a34a98c683] Merge tag 'media/v6.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
git bisect good b1bc554e009e3aeed7e4cfd2e717c7a34a98c683
# bad: [6256274c0182b584e7011077d071f905f2385f64] Merge tag 'mediatek-drm-next-6.11' of https://git.kernel.org/pub/scm/linux/kernel/git/chunkuang.hu/linux into drm-next
git bisect bad 6256274c0182b584e7011077d071f905f2385f64
# bad: [365aa9f573995b46ca14a24165d85e31160e47b9] Merge tag 'amd-drm-next-6.11-2024-06-22' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
git bisect bad 365aa9f573995b46ca14a24165d85e31160e47b9
# good: [e32e15dbf06d65d70c763a44cc8e32ab409b1d5f] drm/amd/display: Adjust debug msg for usb4/tbt
git bisect good e32e15dbf06d65d70c763a44cc8e32ab409b1d5f
# bad: [9061707976c68899cf2f3b9117c5bbcee8e6872c] drm/amd/display: Remove redundant condition with DEADCODE
git bisect bad 9061707976c68899cf2f3b9117c5bbcee8e6872c
# bad: [989947e90563eee58f37fbbad8a5bb94a3d8af8c] drm/amd/display: populate hardware_release hook for dcn401
git bisect bad 989947e90563eee58f37fbbad8a5bb94a3d8af8c
# bad: [3ddd9c83ff7ac0ead38188425b14d03dc2f2c133] drm/amd/display: remove dpp pipes on failure to update pipe params
git bisect bad 3ddd9c83ff7ac0ead38188425b14d03dc2f2c133
# good: [fcf6a49d79923a234844b8efe830a61f3f0584e4] drm/amd/display: Don't refer to dc_sink in is_dsc_need_re_compute
git bisect good fcf6a49d79923a234844b8efe830a61f3f0584e4
# bad: [1b04dcca4fb10dd3834893a60de74edd99f2bfaf] drm/amd/display: Introduce overlay cursor mode
git bisect bad 1b04dcca4fb10dd3834893a60de74edd99f2bfaf
# good: [fd279d8f45c96886786d7fb5452489efad97093b] drm/amd/display: define abm debug interface
git bisect good fd279d8f45c96886786d7fb5452489efad97093b
# first bad commit: [1b04dcca4fb10dd3834893a60de74edd99f2bfaf] drm/amd/display: Introduce overlay cursor mode
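For readers unfamiliar with what the log above is doing: each step tests a commit roughly halfway between the newest known-good and the oldest known-bad one, which is why about a dozen builds were enough to cover the whole 6.10 to 6.11 range. A minimal sketch of that idea, assuming a simple linear history (real git history is a graph, which git bisect handles itself) and a hypothetical `first_bad` helper:

```rust
/// Finds the first bad entry in `commits`, ordered oldest to newest.
/// Assumes the first entry is known good, the last is known bad, and
/// everything after the first bad entry is also bad.
/// Returns None if there are fewer than two entries.
pub fn first_bad<T>(commits: &[T], mut is_bad: impl FnMut(&T) -> bool) -> Option<usize> {
    if commits.len() < 2 {
        return None;
    }
    let mut good = 0; // index of the newest known-good commit
    let mut bad = commits.len() - 1; // index of the oldest known-bad commit
    while bad - good > 1 {
        // Test the midpoint and shrink the range to whichever half
        // still contains the good-to-bad transition.
        let mid = good + (bad - good) / 2;
        if is_bad(&commits[mid]) {
            bad = mid;
        } else {
            good = mid;
        }
    }
    Some(bad)
}
```

Here `is_bad` stands in for the manual step of building and booting a kernel, then checking whether maximizing a window freezes the session.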
Interesting. I don't think there should be a YUV plane in most of the cases where people are seeing this issue. So it's not entirely clear from the description why it should affect this. But it does make sense that this is a change related to planes.
Most likely this is an amdgpu driver bug. Though I guess we must be doing something different with planes than other compositors, if they're not hitting this issue.
Maybe this is related?: https://gitlab.freedesktop.org/drm/amd/-/issues/3678
The person also provides a patch that apparently fixes the problem:
From 6e4b5c4a70b81ff10546ad768e35b66fe693a46a Mon Sep 17 00:00:00 2001
From: Hamish Claxton <hamishclaxton@gmail.com>
Date: Sun, 6 Oct 2024 09:36:50 +1000
Subject: [PATCH] drm/amd/display: Fix getting vram info for older GPUs
Signed-off-by: Hamish Claxton <hamishclaxton@gmail.com>
---
drivers/gpu/drm/amd/display/dc/bios/bios_parser2.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/display/dc/bios/bios_parser2.c b/drivers/gpu/drm/amd/display/dc/bios/bios_parser2.c
index 0d8498ab9b23..242faa079158 100644
--- a/drivers/gpu/drm/amd/display/dc/bios/bios_parser2.c
+++ b/drivers/gpu/drm/amd/display/dc/bios/bios_parser2.c
@@ -3148,7 +3148,7 @@ static enum bp_result bios_parser_get_vram_info(
}
}
- if (result != BP_RESULT_OK && info && DATA_TABLES(vram_info)) {
+ if (info && DATA_TABLES(vram_info)) {
header = GET_IMAGE(struct atom_common_table_header,
DATA_TABLES(vram_info));
--
2.46.2
Potentially, but I'm not running old hardware. It's possible this fix doesn't just help older GPUs, but I don't feel like patching my kernel to try it out.
EDIT: While the patch does say old GPUs, idk if I'd call the RX 6600 old just yet.
I can also trigger the freeze by changing my output scale between 100% and 125%, so changing scale triggers the freeze too.
And if my output is at 100% already, I haven't yet been able to trigger the bug, only at >100%. Will update if I do though.
That's interesting. In my case, I cannot seem to trigger a freeze through changing the output scale (or any other display settings for that matter) in the cosmic-settings. Also, cosmic will freeze when maximizing or minimizing any application regardless of my display scaling.
On a different note: I noticed that when I try to drag applications when in tiled mode, I will also trigger a freeze. This somehow only happens on non-cosmic windows though (I've tested Brave, Thunderbird, Spotify and the onlyoffice suite and all of them cause the freeze. On cosmic-settings, cosmic-terminal, cosmic-files and cosmic-text this doesn't happen). As this only happens on some applications, this might well be a different but related issue. I just thought I'd mention it here for completeness.
That's good to know, it's possible that my output change freeze could be different. I started noticing that when I loaded a Fedora 41 bootc image, so it's possible that's a different freeze entirely, and related to that.
Hopefully the log + the issue linked above are enough insight to figure out if this is wholly upstream or whether cosmic-comp needs a change to match AMD's "expected" behavior
I can systematically reproduce the issue a few seconds after starting the session with kernel 6.11.3 on Fedora 40. I've also opened an issue regarding the AMD drivers, thinking it was an AMD driver regression. When starting GNOME Shell, I don't have any problems.
A part of the logs I attached in the report concerning Cosmic:
ott 15 21:20:29 fedora kernel: [drm:amdgpu_dm_atomic_check [amdgpu]] *ERROR* [CRTC:85:crtc-0] hw_done or flip_done timed out
ott 15 21:20:44 fedora cosmic-comp[1646]: thread 'main' panicked at 'assertion failed: encoded >= 0xf001': /builddir/build/BUILD/cosmic-comp-b8c429facbacbcd0cdda94f717c29b58d9f65414/vendor/rustix/src/backend/linux_raw/io/errno.rs:92
0: <unknown>
1: <unknown>
2: <unknown>
3: <unknown>
4: <unknown>
5: <unknown>
6: <unknown>
7: <unknown>
8: <unknown>
9: <unknown>
10: <unknown>
11: <unknown>
12: <unknown>
13: libinput_dispatch
14: <unknown>
15: <unknown>
16: <unknown>
17: <unknown>
18: <unknown>
19: __libc_start_call_main
20: __libc_start_main@@GLIBC_2.34
21: <unknown>
ott 15 21:20:44 fedora audit[1646]: ANOM_ABEND auid=1000 uid=1000 gid=1000 ses=2 subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 pid=1646 comm="cosmic-comp" exe="/usr/bin/cosmic-comp" sig=6 res=1
ott 15 21:20:44 fedora cosmic-comp[1646]: thread 'main' panicked at 'panic in a function that cannot unwind': library/core/src/panicking.rs:221
0: <unknown>
1: <unknown>
2: <unknown>
3: <unknown>
4: <unknown>
5: <unknown>
6: <unknown>
7: <unknown>
8: <unknown>
9: <unknown>
10: <unknown>
11: <unknown>
12: <unknown>
13: libinput_dispatch
14: <unknown>
15: <unknown>
16: <unknown>
17: <unknown>
18: <unknown>
19: __libc_start_call_main
20: __libc_start_main@@GLIBC_2.34
21: <unknown>
ott 15 21:20:44 fedora systemd[1]: Created slice system-systemd\x2dcoredump.slice - Slice /system/systemd-coredump