ublue-os / bazzite

Bazzite is a cloud native image built upon Fedora Atomic Desktops that brings the best of Linux gaming to all of your devices - including your favorite handheld.
https://bazzite.gg
Apache License 2.0

[bazzite-nvidia] Major performance degradation with 6.7.9-206.fsync.fc39.x86_64 #873

Closed by jadams 8 months ago

jadams commented 8 months ago

Describe the bug

Overall performance degraded after booting into the latest update with the 6.7.9-206.fsync.fc39.x86_64 kernel. Symptoms include soft locks and apps taking multiple seconds to load, even when running from NVMe. GPU-accelerated apps that previously ran at >100 fps were running at <20 fps.

KDiskMark speeds were >3000 MB/s read and >2000 MB/s write on 6.7.9-203.fsync.fc39.x86_64

KDiskMark speeds were <2000 MB/s read and <500 MB/s write on 6.7.9-206.fsync.fc39.x86_64
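
To put the two KDiskMark measurements in perspective, here is a small sketch that turns the reported bounds into a minimum percentage drop. The function name is made up for illustration, and the inputs are only the ">" and "<" bounds quoted above, so the real drop is larger than what this prints.

```python
def min_drop_percent(before_at_least: float, after_at_most: float) -> float:
    """Lower bound on the throughput drop, given before > X and after < Y (MB/s)."""
    return (1 - after_at_most / before_at_least) * 100


# Bounds taken from the KDiskMark figures in the report.
read_drop = min_drop_percent(3000, 2000)
write_drop = min_drop_percent(2000, 500)

print(f"read throughput dropped by more than {read_drop:.0f}%")
print(f"write throughput dropped by more than {write_drop:.0f}%")
```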

What did you expect to happen?

No performance degradation

Output of rpm-ostree status

State: idle
AutomaticUpdates: disabled
Deployments:
● ostree-image-signed:docker://ghcr.io/ublue-os/bazzite-nvidia:stable (index: 0)
                   Digest: sha256:8f73e32026fcb68b124cd41f7f840aee30d7106ddf55015321f63b32adb2ca95
                  Version: 39.20240116.0 (2024-03-10T08:18:55Z)
               BaseCommit: 90c1d1dcbf361f1ee3f7d915a40cb983bce62843b06b072c1ae0315d5829d9d0
                   Commit: e0d79bb52c4866eb0486a9bb59bc587c6f758c39fb7d91f8ab36302145b0c8b5
                           ├─ copr:copr.fedorainfracloud.org:codifryed:CoolerControl (2024-03-02T04:13:42Z)
                           ├─ copr:copr.fedorainfracloud.org:matte-schwartz:sunshine (2024-03-11T06:10:51Z)
                           ├─ copr:copr.fedorainfracloud.org:sentry:kernel-fsync (2024-03-10T23:04:57Z)
                           ├─ fedora (2023-11-01T00:12:39Z)
                           ├─ rpmfusion-free (2023-11-04T16:49:08Z)
                           ├─ rpmfusion-free-updates (2024-03-10T16:19:26Z)
                           ├─ rpmfusion-free-updates-testing (2024-03-10T16:20:05Z)
                           ├─ rpmfusion-nonfree (2023-11-04T17:26:32Z)
                           ├─ rpmfusion-nonfree-updates (2024-03-10T16:49:29Z)
                           ├─ rpmfusion-nonfree-updates-testing (2024-03-10T16:49:36Z)
                           ├─ updates (2024-03-11T01:31:42Z)
                           └─ updates-archive (2024-03-11T01:57:51Z)
                   Staged: no
                StateRoot: default
          LayeredPackages: coolercontrol sunshine
                Initramfs: '"-I /etc/crypttab /usr/lib/modprobe.d/nvidia.conf"' 
                   Pinned: yes

  ostree-image-signed:docker://ghcr.io/ublue-os/bazzite-nvidia:stable (index: 1)
                   Digest: sha256:77f12ce64d87bcde10bfa0d4854eda43b916a637a5a4d89aa0770845b5667b1a
                  Version: 39.20240116.0 (2024-03-11T07:06:33Z)
               BaseCommit: 4c1937935d7f03c9aa32dacbdc9c4868ffd6ee30370139e78d32adc56b9a4760
                   Commit: 893ff2a108cfd50f39532b34d5f399d27b834d429d45df814bb35c193087144d
                           ├─ copr:copr.fedorainfracloud.org:codifryed:CoolerControl (2024-03-02T04:13:42Z)
                           ├─ copr:copr.fedorainfracloud.org:matte-schwartz:sunshine (2024-03-11T06:10:51Z)
                           ├─ copr:copr.fedorainfracloud.org:sentry:kernel-fsync (2024-03-10T23:04:57Z)
                           ├─ fedora (2023-11-01T00:12:39Z)
                           ├─ rpmfusion-free (2023-11-04T16:49:08Z)
                           ├─ rpmfusion-free-updates (2024-03-10T16:19:26Z)
                           ├─ rpmfusion-free-updates-testing (2024-03-10T16:20:05Z)
                           ├─ rpmfusion-nonfree (2023-11-04T17:26:32Z)
                           ├─ rpmfusion-nonfree-updates (2024-03-10T16:49:29Z)
                           ├─ rpmfusion-nonfree-updates-testing (2024-03-10T16:49:36Z)
                           ├─ updates (2024-03-11T01:31:42Z)
                           └─ updates-archive (2024-03-11T01:57:51Z)
                StateRoot: default
          LayeredPackages: coolercontrol sunshine
                Initramfs: '"-I /etc/crypttab /usr/lib/modprobe.d/nvidia.conf"' 
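
For anyone comparing their own deployments against the output above, this is a rough sketch of pulling the index, booted marker, version, and pinned flag out of that text. It assumes the layout shown in this report (including the `(index: N)` suffix), and `parse_deployments` is a hypothetical helper, not part of rpm-ostree. `rpm-ostree status --json` is likely a sturdier source if you want to script this properly.

```python
import re

# Deployment header as printed above: optional "●" for the booted
# deployment, the image reference, then "(index: N)".
HEADER = re.compile(r"^(●\s+)?(\S+)\s+\(index: (\d+)\)$")


def parse_deployments(status_text):
    """Extract a few fields per deployment from `rpm-ostree status` text."""
    deployments = []
    for raw in status_text.splitlines():
        line = raw.strip()
        header = HEADER.match(line)
        if header:
            deployments.append({
                "booted": header.group(1) is not None,
                "image": header.group(2),
                "index": int(header.group(3)),
                "pinned": False,  # stays False unless a "Pinned: yes" line follows
            })
            continue
        if not deployments:
            continue  # still in the preamble ("State: idle", ...)
        key, sep, value = line.partition(": ")
        if sep and key == "Version":
            deployments[-1]["version"] = value
        elif sep and key == "Pinned":
            deployments[-1]["pinned"] = value.strip() == "yes"
    return deployments


# Trimmed copy of the status output from this report.
SAMPLE = """\
State: idle
Deployments:
● ostree-image-signed:docker://ghcr.io/ublue-os/bazzite-nvidia:stable (index: 0)
                  Version: 39.20240116.0 (2024-03-10T08:18:55Z)
                   Pinned: yes

  ostree-image-signed:docker://ghcr.io/ublue-os/bazzite-nvidia:stable (index: 1)
                  Version: 39.20240116.0 (2024-03-11T07:06:33Z)
"""

for deployment in parse_deployments(SAMPLE):
    print(deployment["index"], deployment["booted"], deployment["pinned"])
```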

Hardware

System:
  Host: fedora Kernel: 6.7.9-203.fsync.fc39.x86_64 arch: x86_64 bits: 64
    compiler: gcc v: 2.40-14.fc39
  Desktop: KDE Plasma v: 5.27.10 Distro: Fedora Linux 39.20240310.0
    (Kinoite)
Machine:
  Type: Desktop System: ASUS product: N/A v: N/A serial: <superuser required>
  Mobo: ASUSTeK model: ROG CROSSHAIR VIII DARK HERO v: Rev X.0x
    serial: <superuser required> UEFI: American Megatrends v: 4702
    date: 10/20/2023
Battery:
  Device-1: hidpp_battery_0 model: Logitech G305 Lightspeed Wireless Gaming
    Mouse charge: 100% (should be ignored) status: discharging
CPU:
  Info: 16-core model: AMD Ryzen 9 5950X bits: 64 type: MT MCP arch: Zen 3+
    rev: 0 cache: L1: 1024 KiB L2: 8 MiB L3: 64 MiB
  Speed (MHz): avg: 1121 high: 3600 min/max: 550/5084 cores: 1: 3593 2: 550
    3: 550 4: 550 5: 550 6: 550 7: 550 8: 3599 9: 3600 10: 550 11: 550 12: 550
    13: 550 14: 3596 15: 3600 16: 550 17: 550 18: 550 19: 550 20: 550 21: 550
    22: 550 23: 550 24: 550 25: 550 26: 550 27: 550 28: 550 29: 550 30: 550
    31: 3597 32: 550 bogomips: 217602
  Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm
Graphics:
  Device-1: NVIDIA GA102 [GeForce RTX 3090] driver: nvidia v: 550.54.14
    arch: Ampere bus-ID: 0c:00.0
  Display: wayland server: X.org v: 1.20.14 with: Xwayland v: 21.1.99
    compositor: kwin_wayland driver: X: loaded: modesetting,nouveau,nvidia
    unloaded: fbdev,vesa gpu: nvidia resolution: 2560x1440
  API: OpenGL v: 4.6.0 compat-v: 4.5 vendor: nvidia mesa v: 550.54.14
    glx-v: 1.4 direct-render: yes renderer: NVIDIA GeForce RTX 3090/PCIe/SSE2
Network:
  Device-1: Realtek RTL8125 2.5GbE vendor: ASUSTeK driver: r8169 v: kernel
    port: e000 bus-ID: 05:00.0
  IF: enp5s0 state: down mac: f0:2f:74:ad:bc:22
  Device-2: Aquantia AQtion AQC107S NBase-T/IEEE 802.3an Ethernet [Atlantic
    10G] vendor: Sonnet driver: atlantic v: kernel port: N/A bus-ID: 06:00.0
  IF: enp6s0 state: up speed: 10000 Mbps duplex: full mac: 00:30:93:14:10:9f
  Device-3: Intel I211 Gigabit Network vendor: ASUSTeK driver: igb v: kernel
    port: d000 bus-ID: 07:00.0
  IF: enp7s0 state: down mac: f0:2f:74:ad:bc:21
  Device-4: Intel Wi-Fi 6 AX200 driver: iwlwifi v: kernel bus-ID: 08:00.0
  IF: wlp8s0 state: down mac: be:13:1d:15:22:4e
  IF-ID-1: tailscale0 state: unknown speed: -1 duplex: full mac: N/A
Drives:
  Local Storage: total: 3.73 TiB used: 104.19 GiB (2.7%)
  ID-1: /dev/nvme0n1 vendor: A-Data model: SX8200PNP size: 1.86 TiB
    temp: 44.9 C
  ID-2: /dev/nvme1n1 vendor: A-Data model: SX8200PNP size: 1.86 TiB
    temp: 42.9 C
Partition:
  ID-1: /boot size: 973.4 MiB used: 280.5 MiB (28.8%) fs: ext4
    dev: /dev/nvme1n1p2
  ID-2: /boot/efi size: 598.8 MiB used: 12.4 MiB (2.1%) fs: vfat
    dev: /dev/nvme1n1p1
  ID-3: /var size: 1.86 TiB used: 103.91 GiB (5.5%) fs: btrfs
    dev: /dev/nvme1n1p3
Info:
  Memory: total: 64 GiB note: est. available: 62.7 GiB used: 5.46 GiB (8.7%)
  Processes: 605 Uptime: 14m Init: systemd target: graphical (5)
  Packages: 61 Compilers: clang: 17.0.6 gcc: 13.2.1 Shell: Bash v: 5.2.26
    inxi: 3.3.33

Extra information or context

Rolled back deployment, pinned current (203), and disabled auto update to stay on 203 until resolved.
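
A minimal sketch of the rollback-and-pin part of that workaround, assuming the previous (203) deployment is still on disk. `rpm-ostree rollback` and `ostree admin pin` are real commands; the wrapper and its names are hypothetical, and it defaults to a dry run that only prints what it would execute. How auto update was disabled is not stated in this thread, so that step is left out.

```python
import shlex
import subprocess

# Step 1 makes the previous deployment the default (index 0) for the next boot.
# Step 2 pins index 0 so later updates do not garbage-collect it.
# A reboot is still needed after the rollback to actually run the old kernel.
WORKAROUND_STEPS = [
    ["rpm-ostree", "rollback"],
    ["sudo", "ostree", "admin", "pin", "0"],
]


def apply_workaround(dry_run=True):
    """Return the commands as strings; execute them only when dry_run is False."""
    rendered = []
    for argv in WORKAROUND_STEPS:
        rendered.append(shlex.join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)
    return rendered


for command in apply_workaround():
    print(command)
```

Check `rpm-ostree status` afterwards to confirm which deployment is at index 0 before pinning.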

Kidswiss commented 8 months ago

Can confirm, also on an all-AMD host. It's basically unusably slow, unfortunately.

JoshuaMacklin commented 8 months ago

+1 on Bazzite GNOME. AMD CPU and GPU. Would love to see if others on Intel CPUs are affected as well.

KyleGospo commented 8 months ago

Should be fixed soon, thanks for the reports. In the meantime I recommend pinning the previous image.

KyleGospo commented 8 months ago

This is now fixed.

Sorry the correction took so long, I'm out sick atm and a number of maintainers are out for SCALE and similar events.

Fix is to version gate the fsync kernel, meaning we can conduct more testing of changes and very quickly revert should a situation like this arise again without asking users to pin a build.
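
To illustrate the idea of a version gate (this is not how Bazzite's build actually implements it, which this thread does not show), here is a hypothetical sketch: the image build takes an explicitly approved kernel version instead of whatever is newest. The function name and the assumption that builds are listed oldest to newest are mine.

```python
def select_kernel(available_builds, gated_version=None):
    """Pick the kernel to ship.

    available_builds is assumed to be ordered oldest to newest.
    With no gate, the newest build wins; with a gate, only that
    exact version is accepted.
    """
    if gated_version is None:
        return available_builds[-1]
    if gated_version not in available_builds:
        raise LookupError(f"gated kernel {gated_version} is not available")
    return gated_version


builds = ["6.7.9-203.fsync.fc39.x86_64", "6.7.9-206.fsync.fc39.x86_64"]

# Ungated: the newest build is picked up automatically.
print(select_kernel(builds))
# Gated: the known-good build is used even though a newer one exists.
print(select_kernel(builds, gated_version="6.7.9-203.fsync.fc39.x86_64"))
```

With a gate like this, reverting a bad kernel means changing one value in the build rather than asking every user to pin an image.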

Hopefully this is the last time an issue like this stays open as long as this one has, assuming it slips by testing in the first place.

lm209 commented 8 months ago

Same on my Steam Deck.

KyleGospo commented 8 months ago

> Same on my Steam Deck.

If you update it'll be fixed.

lm209 commented 8 months ago

Yes, I updated, but there is no difference.

KyleGospo commented 8 months ago

This would be a different issue then, is this an LCD deck or OLED?

lm209 commented 8 months ago

> This would be a different issue then, is this an LCD deck or OLED?

I had an update to kernel 6.7.4 two hours ago, and everything was fine there. Now I'm back on 6.7.9 and the performance got worse.

lm209 commented 8 months ago

Steam Deck LCD

KyleGospo commented 8 months ago

> Steam Deck LCD

Definitely unrelated then. AMD p-state is unavailable on the LCD deck due to a firmware limitation, and this issue was a bug in the amd-pstate driver in the kernel. Please open a new issue and include all the detail you can.
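
If you want to check whether a machine is using amd-pstate at all, a quick sketch is to read the active cpufreq scaling driver from sysfs. The helper names are hypothetical, and the `sysfs_root` parameter exists only so the function can be pointed at a fake tree; on a real system the default `/sys` applies. A driver name starting with `amd-pstate` would mean the driver is in use, anything else (for example `acpi-cpufreq`) would mean it is not.

```python
from pathlib import Path


def scaling_driver(sysfs_root="/sys", cpu=0):
    """Return the cpufreq scaling driver for a CPU, or None if unreadable."""
    path = (Path(sysfs_root) / "devices/system/cpu" / f"cpu{cpu}"
            / "cpufreq/scaling_driver")
    try:
        return path.read_text().strip()
    except OSError:
        return None  # no cpufreq support, or not a Linux sysfs tree


def uses_amd_pstate(driver):
    """True for amd-pstate and its variants such as amd-pstate-epp."""
    return driver is not None and driver.startswith("amd-pstate")


driver = scaling_driver()
print(f"scaling driver: {driver}, amd-pstate in use: {uses_amd_pstate(driver)}")
```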

lm209 commented 8 months ago

OK, thanks. I opened a new issue for this.