NixOS / nixpkgs

Nix Packages collection & NixOS

Intel graphics performance degraded after upgrade to kernel 4.14 #31999

Closed jflanglois closed 4 years ago

jflanglois commented 6 years ago

Issue description

After switching to kernel 4.14, the screen flickers and Xorg responds sluggishly. The only change is 4.13 -> 4.14; graphics run smoothly on 4.13.

As a data point, running glxgears produces consistent 60 FPS in 4.13, and around 40 FPS in 4.14 (screen refresh is set to 60Hz in both cases).

This may be an upstream issue but I figured I should start here for troubleshooting.

Steps to reproduce

Switch from kernel 4.13 to kernel 4.14

Technical details

Relevant configuration.nix

boot.kernelPackages = pkgs.linuxPackages_4_14;
hardware.opengl = {
    enable = true;
    driSupport = true;
    driSupport32Bit = true;
    extraPackages = [ pkgs.vaapiIntel ];
    s3tcSupport = true;
};
services.xserver = {
    enable = true;
    videoDriver = "modesetting"; # using intel has the same problem
};

eXt73 commented 6 years ago

I can confirm that. I see the same thing on both the generic Ubuntu kernel and my own builds (optimized kernels for Ubuntu and derivatives). I have tested up to and including Linux 4.14.3 and still get very strong slowdowns with Intel graphics, and with NVIDIA too when it is used through Bumblebee. Under 4.13.x everything works fine and very efficiently.

vcunat commented 6 years ago

This is important for the next release, because 4.14 is to become the default kernel.

celentanos commented 6 years ago

I have the same issue on openSUSE Tumbleweed with kernel 4.14. With kernel 4.13 everything was OK.

jflanglois commented 6 years ago

I've opened a bug report at the kernel.org bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=198241

orivej commented 6 years ago

Is someone willing to help bisect the issue on NixOS? You will need a 1 GB clone of git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git and the patience to compile the kernel and reboot about 15 times. I could provide instructions on how to do it.

jflanglois commented 6 years ago

I'll give it a shot.

jflanglois commented 6 years ago

@orivej I'm trying to build the kernel using a local clone and git-bisect, but I'm running into an issue where the nix build fails to copy needed kernel modules in order to boot. Maybe you can shed some light on what I'm doing wrong? (I don't want to be a bottleneck on this issue!)

diff --git a/pkgs/os-specific/linux/kernel/linux-bisect.nix b/pkgs/os-specific/linux/kernel/linux-bisect.nix
new file mode 100644
index 00000000000..b75344abeeb
--- /dev/null
+++ b/pkgs/os-specific/linux/kernel/linux-bisect.nix
@@ -0,0 +1,15 @@
+{ stdenv, hostPlatform, fetchurl, perl, buildLinux, ... } @ args:
+
+with stdenv.lib;
+
+import ./generic.nix (args // rec {
+  version = "4.13.0";
+
+  # modDirVersion needs to be x.y.z, will automatically add .0 if needed
+  modDirVersion = concatStrings (intersperse "." (take 3 (splitString "." "${version}.0")));
+
+  # branchVersion needs to be x.y
+  extraMeta.branch = concatStrings (intersperse "." (take 2 (splitString "." version)));
+
+  src = /home/julien/development/linux;
+} // (args.argsOverride or {}))
diff --git a/pkgs/top-level/all-packages.nix b/pkgs/top-level/all-packages.nix
index cea7a7478ad..b8028f11d91 100644
--- a/pkgs/top-level/all-packages.nix
+++ b/pkgs/top-level/all-packages.nix
@@ -12710,6 +12710,10 @@ with pkgs;
       ];
   };

+  linux_bisect = callPackage ../os-specific/linux/kernel/linux-bisect.nix {
+    # To make life simple with patch files during bisect
+    kernelPatches = [ ];
+  };
+
   linux_testing = callPackage ../os-specific/linux/kernel/linux-testing.nix {
     kernelPatches = [
       kernelPatches.bridge_stp_helper
@@ -12908,6 +12912,7 @@ with pkgs;
   linuxPackages_4_9 = recurseIntoAttrs (linuxPackagesFor pkgs.linux_4_9);
   linuxPackages_4_13 = recurseIntoAttrs (linuxPackagesFor pkgs.linux_4_13);
   linuxPackages_4_14 = recurseIntoAttrs (linuxPackagesFor pkgs.linux_4_14);
+  linuxPackages_bisect = recurseIntoAttrs (linuxPackagesFor pkgs.linux_bisect);
   # Don't forget to update linuxPackages_latest!

   # Intentionally lacks recurseIntoAttrs, as -rc kernels will quite likely break out-of-tree modules and cause failed Hydra builds.

And my configuration.nix has a line with boot.kernelPackages = pkgs.linuxPackages_bisect.

A sample of the build log:

[...]
shrinking /nix/store/ybaf89bssmmg9xzj9b84hd24yda8qpqy-linux-4.13.16-dev/lib/modules/4.13.16/build/arch/x86/purgatory/purgatory.o
wrong ELF type
shrinking /nix/store/ybaf89bssmmg9xzj9b84hd24yda8qpqy-linux-4.13.16-dev/lib/modules/4.13.16/build/arch/x86/purgatory/stack.o
wrong ELF type
shrinking /nix/store/ybaf89bssmmg9xzj9b84hd24yda8qpqy-linux-4.13.16-dev/lib/modules/4.13.16/build/arch/x86/purgatory/setup-x86_64.o
wrong ELF type
shrinking /nix/store/ybaf89bssmmg9xzj9b84hd24yda8qpqy-linux-4.13.16-dev/lib/modules/4.13.16/build/arch/x86/purgatory/sha256.o
wrong ELF type
shrinking /nix/store/ybaf89bssmmg9xzj9b84hd24yda8qpqy-linux-4.13.16-dev/lib/modules/4.13.16/build/arch/x86/purgatory/entry64.o
wrong ELF type
shrinking /nix/store/ybaf89bssmmg9xzj9b84hd24yda8qpqy-linux-4.13.16-dev/lib/modules/4.13.16/build/arch/x86/purgatory/string.o
wrong ELF type
shrinking /nix/store/ybaf89bssmmg9xzj9b84hd24yda8qpqy-linux-4.13.16-dev/lib/modules/4.13.16/build/arch/x86/purgatory/purgatory.ro
wrong ELF type
stripping (with flags -S) in /nix/store/ybaf89bssmmg9xzj9b84hd24yda8qpqy-linux-4.13.16-dev/lib 
patching script interpreter paths in /nix/store/ybaf89bssmmg9xzj9b84hd24yda8qpqy-linux-4.13.16-dev
/nix/store/ybaf89bssmmg9xzj9b84hd24yda8qpqy-linux-4.13.16-dev/lib/modules/4.13.16/source/scripts/Lindent: interpreter directive changed from "/bin/sh" to "/nix/store/65l6hr8snf4v823f974k97jc65i7bhvf-bash-4.4-p12/bin/sh"
/nix/store/ybaf89bssmmg9xzj9b84hd24yda8qpqy-linux-4.13.16-dev/lib/modules/4.13.16/source/scripts/adjust_autoksyms.sh: interpreter directive changed from "/bin/sh" to "/nix/store/65l6hr8snf4v823f974k97jc65i7bhvf-bash-4.4-p12/bin/sh"
[...]
/nix/store/ybaf89bssmmg9xzj9b84hd24yda8qpqy-linux-4.13.16-dev/lib/modules/4.13.16/source/scripts/ver_linux: interpreter directive changed from "/usr/bin/awk -f" to "/nix/store/w1cddj0qc3ximvpwrn28rig7wq99ajd7-gawk-4.2.0/bin/awk -f"
/nix/store/ybaf89bssmmg9xzj9b84hd24yda8qpqy-linux-4.13.16-dev/lib/modules/4.13.16/source/scripts/xz_wrap.sh: interpreter directive changed from "/bin/sh" to "/nix/store/65l6hr8snf4v823f974k97jc65i7bhvf-bash-4.4-p12/bin/sh"
checking for references to /tmp/nix-build-linux-4.13.16.drv-0 in /nix/store/ybaf89bssmmg9xzj9b84hd24yda8qpqy-linux-4.13.16-dev...
cannot find section .dynamic
wrong ELF type
wrong ELF type
wrong ELF type
wrong ELF type
wrong ELF type
wrong ELF type
wrong ELF type
wrong ELF type
wrong ELF type
wrong ELF type
wrong ELF type
wrong ELF type
wrong ELF type
wrong ELF type
wrong ELF type
wrong ELF type
building path(s) ‘/nix/store/n0na849z6z3zkzj9lp6k2dxmj6zb4s46-firmware’
building path(s) ‘/nix/store/zn8qdl4kqy8h70a8kq1255hkkjympvlj-kernel-modules’
created 714 symlinks in user environment
created 3 symlinks in user environment
building path(s) ‘/nix/store/x4yg3h6mb9wbxaih52jz9rf8pxxbh8bp-etc-nixos.conf’
kernel version is 4.13.16
building path(s) ‘/nix/store/4m0mxic9pscnms5196hgaky8c7lk36gs-kernel-modules-shrunk’
building path(s) ‘/nix/store/hs7japymy96mk9qmkg1vfp5gysc20b92-etc’
kernel version is 4.13.16
root module: xhci_pci
modprobe: FATAL: Module xhci_pci not found in directory /nix/store/zn8qdl4kqy8h70a8kq1255hkkjympvlj-kernel-modules/lib/modules/4.13.16
root module: nvme
modprobe: FATAL: Module nvme not found in directory /nix/store/zn8qdl4kqy8h70a8kq1255hkkjympvlj-kernel-modules/lib/modules/4.13.16
root module: usb_storage
modprobe: FATAL: Module usb_storage not found in directory /nix/store/zn8qdl4kqy8h70a8kq1255hkkjympvlj-kernel-modules/lib/modules/4.13.16
root module: sd_mod
modprobe: FATAL: Module sd_mod not found in directory /nix/store/zn8qdl4kqy8h70a8kq1255hkkjympvlj-kernel-modules/lib/modules/4.13.16
root module: md_mod
modprobe: FATAL: Module md_mod not found in directory /nix/store/zn8qdl4kqy8h70a8kq1255hkkjympvlj-kernel-modules/lib/modules/4.13.16
root module: raid0
modprobe: FATAL: Module raid0 not found in directory /nix/store/zn8qdl4kqy8h70a8kq1255hkkjympvlj-kernel-modules/lib/modules/4.13.16
[...]
root module: pcips2
modprobe: FATAL: Module pcips2 not found in directory /nix/store/zn8qdl4kqy8h70a8kq1255hkkjympvlj-kernel-modules/lib/modules/4.13.16
root module: atkbd
modprobe: FATAL: Module atkbd not found in directory /nix/store/zn8qdl4kqy8h70a8kq1255hkkjympvlj-kernel-modules/lib/modules/4.13.16
root module: i8042
modprobe: FATAL: Module i8042 not found in directory /nix/store/zn8qdl4kqy8h70a8kq1255hkkjympvlj-kernel-modules/lib/modules/4.13.16
root module: rtc_cmos
modprobe: FATAL: Module rtc_cmos not found in directory /nix/store/zn8qdl4kqy8h70a8kq1255hkkjympvlj-kernel-modules/lib/modules/4.13.16
root module: btrfs
modprobe: FATAL: Module btrfs not found in directory /nix/store/zn8qdl4kqy8h70a8kq1255hkkjympvlj-kernel-modules/lib/modules/4.13.16
root module: crc32c
modprobe: FATAL: Module crc32c not found in directory /nix/store/zn8qdl4kqy8h70a8kq1255hkkjympvlj-kernel-modules/lib/modules/4.13.16
root module: dm_mod
modprobe: FATAL: Module dm_mod not found in directory /nix/store/zn8qdl4kqy8h70a8kq1255hkkjympvlj-kernel-modules/lib/modules/4.13.16
closure:
depmod: WARNING: could not open /nix/store/4m0mxic9pscnms5196hgaky8c7lk36gs-kernel-modules-shrunk/lib/modules/4.13.16/modules.order: No such file or directory
depmod: WARNING: could not open /nix/store/4m0mxic9pscnms5196hgaky8c7lk36gs-kernel-modules-shrunk/lib/modules/4.13.16/modules.builtin: No such file or directory
building path(s) ‘/nix/store/zcsx152xxx5kzyxs98276v1ah36kwb7g-stage-1-init.sh’
building path(s) ‘/nix/store/xsk8xx43dg3x1njkdj049kf3m5475r7m-initrd’
37458 blocks
building path(s) ‘/nix/store/qd4awss8r2h2pbrh8bwcg8a63hc968dp-nixos-system-gbpits-18.03.git.1bc2885’

jflanglois commented 6 years ago

Oh it appears I should probably include at least some of those kernel patches. Trying again...

jflanglois commented 6 years ago

Okay, so I can get it to boot if I include both bridge_stp_helper and modinst_arg_list_too_long (it makes sense that the second would affect module loading). FYI for anyone trying to do this in the future.
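
For anyone retracing this, the linux_bisect entry from the diff above would presumably end up looking roughly like the sketch below. It is not the exact edit used here. It assumes both patches are exposed as attributes of the kernelPatches set: bridge_stp_helper is referenced that way in the linux_testing entry shown in the diff, and modinst_arg_list_too_long is assumed to follow the same naming.

```nix
  # Sketch of the linux_bisect entry in pkgs/top-level/all-packages.nix,
  # replacing the empty kernelPatches list from the diff above.
  linux_bisect = callPackage ../os-specific/linux/kernel/linux-bisect.nix {
    kernelPatches = [
      kernelPatches.bridge_stp_helper
      # Assumed attribute name, taken from the patch name mentioned in this comment.
      kernelPatches.modinst_arg_list_too_long
    ];
  };
```

Patches carried this way may stop applying cleanly at some bisect steps, so the list might need adjusting as the checked-out commit moves.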

jflanglois commented 6 years ago

dc911f5bd8aacfcf8aabd5c26c88e04c837a938e is the first bad commit
commit dc911f5bd8aacfcf8aabd5c26c88e04c837a938e
Author: Jim Bride <jim.bride@linux.intel.com>
Date:   Wed Aug 9 12:48:53 2017 -0700
    drm/i915/edp: Allow alternate fixed mode for eDP if available.
    Some fixed resolution panels actually support more than one mode,
    with the only thing different being the refresh rate.  Having this
    alternate mode available to us is desirable, because it allows us to
    test PSR on panels whose setup time at the preferred mode is too long.
    With this patch we allow the use of the alternate mode if it's
    available and it was specifically requested.
    v2 and v3: Rebase
    v4: * Fix up some leaky mode stuff (Chris)
        * Rebase
    v5: * Fix a NULL pointer derefrence (David Weinehall)
    v6: * Whitespace / spelling / checkpatch clean-up; no functional
          change. (David)
        * Rebase
    Cc: David Weinehall <david.weinehall@linux.intel.com>
    Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
    Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
    Cc: Jani Nikula <jani.nikula@intel.com>
    Cc: Chris Wilson <chris@chris-wilson.co.uk>
    Reviewed-by: David Weinehall <david.weinehall@linux.intel.com>
    Signed-off-by: Jim Bride <jim.bride@linux.intel.com>
    Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/1502308133-26892-1-git-send-email-jim.bride@linux.intel.com
:040000 040000 359878e8318228ad9525ca05cc03c74783c91c1b a6199aae57157bf243cbc288ae4cbbc9aa473de5 M      drivers

So I guess it makes sense that that would affect things for some people... My panel has 60Hz and 40Hz rates at 1920x1080. If I go into a mode that only has 60Hz available, then performance goes back to normal.

jflanglois commented 6 years ago

I didn't find anything in xrandr that could control this behavior. Anyone have any suggestions?

jflanglois commented 6 years ago

Trying the i915.enable_psr=0 kernel argument has no effect.
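
On NixOS, a kernel argument like this is usually passed through the boot.kernelParams option. The thread does not show the configuration that was actually used, so the following is only a minimal sketch of how the parameter could be set in configuration.nix:

```nix
{ ... }:
{
  # Pass i915.enable_psr=0 on the kernel command line to disable
  # Panel Self Refresh in the i915 driver.
  boot.kernelParams = [ "i915.enable_psr=0" ];
}
```

After a rebuild and reboot, /proc/cmdline should show whether the parameter was picked up.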

mirh commented 6 years ago

Someone on the Manjaro forums said to try [something along the lines of] cvt 1920 1080 60

jflanglois commented 6 years ago

@mirh can you post a link to that?

mirh commented 6 years ago

https://forum.manjaro.org/t/poor-opengl-performance-on-linux-4-14/35453/150

ghost commented 6 years ago

https://bugs.freedesktop.org/show_bug.cgi?id=103497

jflanglois commented 6 years ago

@mirh Thanks. Setting a custom mode as the preferred certainly works (I'm doing that currently), but it really seems like a workaround to me.
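
One way such a workaround could be expressed on NixOS is sketched below. It is an illustration, not the configuration used in this thread. It assumes the internal panel is the output named eDP-1 (a hypothetical name; xrandr lists the real one), and it uses the modeline that cvt 1920 1080 60 is expected to print, which should be checked against the output of cvt on the affected machine. It selects the mode at session start rather than marking it as preferred in the X configuration.

```nix
{ pkgs, ... }:
{
  # Register a 60 Hz CVT mode at session start and switch the panel to it.
  # "eDP-1" is an assumed output name; replace it with the one xrandr reports.
  services.xserver.displayManager.sessionCommands = ''
    ${pkgs.xorg.xrandr}/bin/xrandr --newmode "1920x1080_60.00" 173.00 1920 2048 2248 2576 1080 1083 1088 1120 -hsync +vsync
    ${pkgs.xorg.xrandr}/bin/xrandr --addmode eDP-1 "1920x1080_60.00"
    ${pkgs.xorg.xrandr}/bin/xrandr --output eDP-1 --mode "1920x1080_60.00"
  '';
}
```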

vcunat commented 6 years ago

I don't know, it seems stalled. So should we apply a revert of that commit for our 4.14, as suggested in https://bugs.freedesktop.org/show_bug.cgi?id=103497#c11 ? What do you think?

joepie91 commented 6 years ago

Ping - I'm unsure if this is the same issue I'm seeing, but I've noticed significant performance degradation since switching to 18.03, both on my desktop and my laptop (although more severe on the latter, probably because it has less powerful hardware).

The issues are most notable in Firefox, which regularly just hangs for upwards of 10-20 seconds - in particular when typing things into the address bar - but I've noticed performance degradation elsewhere as well (including in my laptop's trackpoint being more jittery than before).

When Firefox hangs, it is only pegging a single core, and there is no significant iowait, yet it's still impacting other applications such as Synergy which get stuck as well, despite having no direct relation to Firefox. The two have a dependency on X and the kernel in common, at least, but nothing else that I'm aware of.

Does this seem related to this issue?

EDIT: For clarification, my laptop has an Intel integrated GPU (no hybrid graphics), my desktop has an AMD card with the radeon driver.

vcunat commented 6 years ago

This issue isn't really a "performance" degradation but an incorrect refresh rate in some cases, so it doesn't seem to be the same issue (to me).

oxij commented 6 years ago

Hint: you can just perf your firefox from both system generations (17.09 and master) and compare the reports to see what changed.

I would also run perf with an old kernel (like 4.9) on master and compare with the newer ones.

Rotsor commented 6 years ago

My laptop was affected as well. It looks like they reverted the change in https://cgit.freedesktop.org/drm-tip/commit/?id=d93fa1b47b8fcd149b5091f18385304f402a8e15, and among official kernels it looks like only 4.18 got this change.

dezgeg commented 6 years ago

It does have a Cc: <stable@vger.kernel.org> # v4.14+ tag so it will eventually get to later kernels.

Or well, at least it should.

mirh commented 6 years ago

https://www.spinics.net/lists/stable/msg246751.html

dezgeg commented 6 years ago

It could be useful to report to the Bugzilla ticket that the 4.14 backport wasn't successful.

vcunat commented 6 years ago

This part seems relevant

Note: This is a hand-crafted revert due to conflicts. If it fails to backport, please just try reverting the original commit directly.

I presume it's about "backport a revert" vs. "revert a backport".

buckley310 commented 5 years ago

+1 I believe I am also affected by this same issue on release-18.09. When on the stock kernel, my laptop display runs at 48 Hz, even when xrandr reports 60. (Intel HD Graphics 620)

Currently avoiding the issue thanks to this patch I apply locally: https://bugs.freedesktop.org/attachment.cgi?id=135277
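
A local kernel patch like this can be applied through the boot.kernelPatches option. The sketch below assumes the attachment has been saved next to configuration.nix; the file name and the name attribute are hypothetical, and the exact setup used here is not shown in the thread.

```nix
{ ... }:
{
  boot.kernelPatches = [
    {
      # Hypothetical label for the patch.
      name = "i915-edp-refresh-rate";
      # Hypothetical local path: the attachment linked above, downloaded by hand.
      patch = ./i915-edp-refresh-rate.patch;
    }
  ];
}
```

Adding a patch changes the kernel derivation, so the kernel is built locally instead of being fetched from the binary cache.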

worldofpeace commented 4 years ago

This is now out of date.