GrapheneOS / linux-hardened

Minimal supplement to upstream Kernel Self Protection Project changes. Features already provided by SELinux + Yama and archs other than multiarch arm64 / x86_64 aren't in scope. Only tags have stable history. Shared IRC channel with KSPP: irc.freenode.net ##linux-hardened. Currently maintained at https://github.com/anthraxx/linux-hardened.
https://grapheneos.org/

Ubuntu 17.04 does not boot with CONFIG_SLAB_CANARY enabled #38

Closed dbaxa closed 7 years ago

dbaxa commented 7 years ago

Ubuntu 17.04 does not boot when applying either the 4.11.2.a or the 4.11.2.b patch and building a 4.11.2 kernel. Boot seems to stop just after "loading initramfs". Please let me know what I can do to provide additional details. Note: I am yet to try a vanilla 4.11.2 kernel.

thestinger commented 7 years ago

You'll need to narrow it down a lot more than this. I can't do the work for you, since I don't have access to a machine where it doesn't work. That means testing a vanilla kernel, narrowing down which release introduced the problem and narrowing down which commit introduces the problem. If you can provide logs, I can perhaps do something, but otherwise there's nothing to work with.

thestinger commented 7 years ago

You'll need to confirm that this only happens with linux-hardened before it can be considered a bug.

dbaxa commented 7 years ago

> You'll need to narrow it down a lot more than this. I can't do the work for you, since I don't have access to a machine where it doesn't work. That means testing a vanilla kernel, narrowing down which release introduced the problem and narrowing down which commit introduces the problem. If you can provide logs, I can perhaps do something, but otherwise there's nothing to work with.

@thestinger of course. I totally understand.

> You'll need to confirm that this only happens with linux-hardened before it can be considered a bug.

Will do.

dbaxa commented 7 years ago

@thestinger The vanilla 4.11.2 kernel works well. Here is a diff of the kernel configuration:

diff -Nur hardened-config .config
--- hardened-config 2017-05-24 09:30:57.330645881 +1000
+++ .config 2017-05-23 14:23:34.909738303 +1000
@@ -254,9 +254,6 @@
 CONFIG_SLUB=y
 # CONFIG_SLOB is not set
 # CONFIG_SLAB_FREELIST_RANDOM is not set
-CONFIG_SLAB_CANARY=y
-CONFIG_SLAB_SANITIZE=y
-CONFIG_SLAB_SANITIZE_VERIFY=y
 CONFIG_SLUB_CPU_PARTIAL=y
 CONFIG_SYSTEM_DATA_VERIFICATION=y
 CONFIG_PROFILING=y
@@ -4985,8 +4982,6 @@
 CONFIG_ENCRYPTED_KEYS=y
 # CONFIG_KEY_DH_OPERATIONS is not set
 CONFIG_SECURITY_DMESG_RESTRICT=y
-CONFIG_SECURITY_TIOCSTI_RESTRICT=y
-CONFIG_SECURITY_PERF_EVENTS_RESTRICT=y
 CONFIG_SECURITY=y
 CONFIG_SECURITYFS=y
 CONFIG_SECURITY_NETWORK=y
@@ -4997,9 +4992,6 @@
 CONFIG_HAVE_ARCH_HARDENED_USERCOPY=y
 CONFIG_HARDENED_USERCOPY=y
 # CONFIG_HARDENED_USERCOPY_PAGESPAN is not set
-CONFIG_FORTIFY_SOURCE=y
-CONFIG_PAGE_SANITIZE=y
-CONFIG_PAGE_SANITIZE_VERIFY=y
 # CONFIG_STATIC_USERMODEHELPER is not set
 # CONFIG_SECURITY_SELINUX is not set
 CONFIG_SECURITY_SMACK=y

I am going to try turning off various options and will let you know which option or set of options causes the issue.
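
A unified diff of two `.config` files works, but comparing them option by option can be easier to scan. The following is a minimal sketch, not part of the original thread: `parse_config` and `config_delta` are hypothetical helper names, and treating an absent option as `n` is a simplification that suits boolean options only.

```python
import re

# Matches "CONFIG_FOO=value" and "# CONFIG_FOO is not set" lines.
_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Return {option: value}; options marked 'is not set' map to 'n'."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        m = _SET.match(line)
        if m:
            options[m.group(1)] = m.group(2)
            continue
        m = _UNSET.match(line)
        if m:
            options[m.group(1)] = "n"
    return options


def config_delta(old_text, new_text):
    """Return {option: (old, new)} for every option whose value differs.

    An option missing from one file is reported as 'n' on that side
    (a simplification; see the note above).
    """
    old, new = parse_config(old_text), parse_config(new_text)
    delta = {}
    for name in sorted(set(old) | set(new)):
        a, b = old.get(name, "n"), new.get(name, "n")
        if a != b:
            delta[name] = (a, b)
    return delta


# Tiny excerpt modelled on the diff above; real files would be read from disk.
hardened = "CONFIG_SLUB=y\nCONFIG_SLAB_CANARY=y\nCONFIG_SLAB_SANITIZE=y\n"
vanilla = "CONFIG_SLUB=y\n"
for option, (before, after) in config_delta(hardened, vanilla).items():
    print(f"{option}: {before} -> {after}")
```

Each reported option is then a candidate to toggle individually when narrowing down the boot failure.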

thestinger commented 7 years ago

Can you try with CONFIG_FORTIFY_SOURCE disabled?

dbaxa commented 7 years ago

Sure. Also, it seems that the version of gcc in use is 6.3.0:

gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/6/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 6.3.0-12ubuntu2' --with-bugurl=file:///usr/share/doc/gcc-6/README.Bugs --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-6 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --enable-default-pie --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-6-amd64/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-6-amd64 --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-6-amd64 --with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --with-target-system-zlib --enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 6.3.0 20170406 (Ubuntu 6.3.0-12ubuntu2)

dbaxa commented 7 years ago

Disabling CONFIG_FORTIFY_SOURCE did not fix the issue.

dbaxa commented 7 years ago

Disabling CONFIG_PAGE_SANITIZE, CONFIG_PAGE_SANITIZE_VERIFY and CONFIG_SLAB_SANITIZE_VERIFY did not fix the issue either. Also, it seems that the linux-hardened patch currently requires CONFIG_SLAB_CANARY=y.

thestinger commented 7 years ago

Do you mean that it doesn't currently compile without it?

thestinger commented 7 years ago

It's probably not related to the boot issue though. Lots of the changes aren't tied to configuration options; only the ones with some reason to disable them, like performance, are.

thestinger commented 7 years ago

It can be built without CONFIG_SLAB_CANARY in the 4.11 branch now. I doubt your issue is related to CONFIG_SLAB_CANARY / CONFIG_SLAB_HARDENED though.

dbaxa commented 7 years ago

> Do you mean that it doesn't currently compile without it?

Yep.

dbaxa commented 7 years ago

Hmm. This system uses dkms to load bbswitch; maybe that is causing an issue.

thestinger commented 7 years ago

You might want to try disabling PANIC_ON_OOPS since one of the changes this tree adds is enabling that by default. It's possible you had a kernel oops before and didn't notice.

thestinger commented 7 years ago

I thought there was a way to disable panic_on_oops via the kernel command line, but it doesn't appear that it's possible after all.

dbaxa commented 7 years ago

The issue is that when I remove quiet, splash and vt.handoff, I am not seeing an oops.

thestinger commented 7 years ago

If you're building with the default PANIC_ON_OOPS, it might panic before it is able to show you anything. You should check dmesg on a vanilla kernel and try linux-hardened with PANIC_ON_OOPS disabled.

dbaxa commented 7 years ago

I already had PANIC_ON_OOPS disabled and didn't see anything. I'll check dmesg on a vanilla kernel.

thestinger commented 7 years ago

Is this on real hardware or a virtual machine? Can you give some details on that?

dbaxa commented 7 years ago

This is on a real machine. The machine is a Dell XPS 15 2014 model. Here is an old dmesg from the machine: https://bugzilla.kernel.org/attachment.cgi?id=190581. It has an integrated Intel graphics card, which I use, and a GeForce GT 750M.

thestinger commented 7 years ago

BTW I tagged 4.11.2.c which should let you disable CONFIG_SLAB_CANARY and CONFIG_SLAB_HARDENED to get closer to vanilla.

dbaxa commented 7 years ago

Great. I'll test out 4.11.2.c.

dbaxa commented 7 years ago

After disabling more hardening-related options, the system boots a 4.11.2.c kernel:

--- config/hardened-config  2017-05-23 13:35:07.576541812 +1000
+++ current/linux-4.11.2/.config    2017-05-26 08:07:41.045180527 +1000
@@ -254,9 +254,8 @@
 CONFIG_SLUB=y
 # CONFIG_SLOB is not set
 # CONFIG_SLAB_FREELIST_RANDOM is not set
-CONFIG_SLAB_CANARY=y
+# CONFIG_SLAB_CANARY is not set
 CONFIG_SLAB_SANITIZE=y
-CONFIG_SLAB_SANITIZE_VERIFY=y
 CONFIG_SLUB_CPU_PARTIAL=y
 CONFIG_SYSTEM_DATA_VERIFICATION=y
 CONFIG_PROFILING=y
@@ -306,7 +305,7 @@
 CONFIG_HAVE_GCC_PLUGINS=y
 CONFIG_GCC_PLUGINS=y
 # CONFIG_GCC_PLUGIN_CYC_COMPLEXITY is not set
-CONFIG_GCC_PLUGIN_LATENT_ENTROPY=y
+# CONFIG_GCC_PLUGIN_LATENT_ENTROPY is not set
 # CONFIG_GCC_PLUGIN_STRUCTLEAK is not set
 CONFIG_HAVE_CC_STACKPROTECTOR=y
 CONFIG_CC_STACKPROTECTOR=y
@@ -4948,10 +4947,10 @@
 CONFIG_EARLY_PRINTK=y
 CONFIG_EARLY_PRINTK_DBGP=y
 CONFIG_EARLY_PRINTK_EFI=y
-# CONFIG_X86_PTDUMP_CORE is not set
+CONFIG_X86_PTDUMP_CORE=y
 # CONFIG_X86_PTDUMP is not set
 # CONFIG_EFI_PGT_DUMP is not set
-# CONFIG_DEBUG_WX is not set
+CONFIG_DEBUG_WX=y
 CONFIG_DOUBLEFAULT=y
 # CONFIG_DEBUG_TLBFLUSH is not set
 # CONFIG_IOMMU_DEBUG is not set
@@ -4985,7 +4984,7 @@
 CONFIG_ENCRYPTED_KEYS=y
 # CONFIG_KEY_DH_OPERATIONS is not set
 CONFIG_SECURITY_DMESG_RESTRICT=y
-CONFIG_SECURITY_TIOCSTI_RESTRICT=y
+# CONFIG_SECURITY_TIOCSTI_RESTRICT is not set
 CONFIG_SECURITY_PERF_EVENTS_RESTRICT=y
 CONFIG_SECURITY=y
 CONFIG_SECURITYFS=y
@@ -4995,11 +4994,9 @@
 CONFIG_INTEL_TXT=y
 CONFIG_HAVE_HARDENED_USERCOPY_ALLOCATOR=y
 CONFIG_HAVE_ARCH_HARDENED_USERCOPY=y
-CONFIG_HARDENED_USERCOPY=y
-# CONFIG_HARDENED_USERCOPY_PAGESPAN is not set
-CONFIG_FORTIFY_SOURCE=y
-CONFIG_PAGE_SANITIZE=y
-CONFIG_PAGE_SANITIZE_VERIFY=y
+# CONFIG_HARDENED_USERCOPY is not set
+# CONFIG_FORTIFY_SOURCE is not set
+# CONFIG_PAGE_SANITIZE is not set
 # CONFIG_STATIC_USERMODEHELPER is not set
 # CONFIG_SECURITY_SELINUX is not set
 CONFIG_SECURITY_SMACK=y

thestinger commented 7 years ago

Can you narrow it down to a specific option? You're most of the way there already.

thestinger commented 7 years ago

Current release is 4.11.2.a, so it should be narrowed down to a configuration option there (assuming disabling everything works, which means one of the configuration options controls the problematic feature).
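
Narrowing down to one option can be done by halving the candidate set on each build. The following is a minimal sketch under stated assumptions: `find_culprit` and the `boots` callback are hypothetical names, `boots(enabled)` stands for "build and boot a kernel with exactly these candidates enabled", and the search only works if a single option is at fault regardless of what else is enabled.

```python
def find_culprit(candidates, boots):
    """Bisect `candidates` to find the single option that breaks boot.

    `boots(enabled)` is a hypothetical callback that builds and boots a
    kernel with only the options in `enabled` turned on and returns True
    if the system comes up. Assumes exactly one candidate is at fault.
    """
    suspects = list(candidates)
    if not suspects:
        raise ValueError("no candidates to test")
    while len(suspects) > 1:
        half = suspects[: len(suspects) // 2]
        if boots(half):
            # This half is clean, so the culprit is in the remainder.
            suspects = suspects[len(half):]
        else:
            suspects = half
    return suspects[0]


# Options taken from the diff above; the callback here only simulates a
# machine on which one particular option prevents booting.
options = [
    "CONFIG_SECURITY_TIOCSTI_RESTRICT",
    "CONFIG_GCC_PLUGIN_LATENT_ENTROPY",
    "CONFIG_SLAB_CANARY",
    "CONFIG_HARDENED_USERCOPY",
    "CONFIG_FORTIFY_SOURCE",
    "CONFIG_PAGE_SANITIZE",
]
simulated_boot = lambda enabled: "CONFIG_SLAB_CANARY" not in enabled
print(find_culprit(options, simulated_boot))
```

With six candidates this needs about three build-and-boot cycles instead of six. If the failure depends on a combination of options, the single-culprit assumption breaks and the result should be confirmed with one more build.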

dbaxa commented 7 years ago

Of course. Will do.

dbaxa commented 7 years ago

I am yet to find the option but the following options don't seem to trigger the issue:

CONFIG_SECURITY_TIOCSTI_RESTRICT
CONFIG_GCC_PLUGIN_LATENT_ENTROPY

thestinger commented 7 years ago

BTW CONFIG_GCC_PLUGIN_LATENT_ENTROPY and HARDENED_USERCOPY are really upstream options although some extensions are made to both in linux-hardened.

dbaxa commented 7 years ago

Enabling SLAB_CANARY causes the system to fail to boot.

thestinger commented 7 years ago

Interesting, I wonder if it's catching something or if there's a bug in an edge case code path. Can you try enabling slub_debug=FZ on your kernel command line in a build without SLAB_CANARY? That enables some similar debugging checks that are upstream, and you can check the kernel log to see if they're making any noise.
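
For readers unfamiliar with the `slub_debug=` letters, the following is a small sketch that decodes a value such as `FZ`. The letter meanings follow the upstream SLUB documentation; the table covers only a subset of the flags, `describe_slub_debug` is a hypothetical helper name, and `kmalloc-64` in the usage line is just an example cache name.

```python
# Subset of the flag letters documented for the upstream slub_debug= parameter.
SLUB_DEBUG_FLAGS = {
    "F": "sanity checks",
    "Z": "red zoning",
    "P": "poisoning (object and padding)",
    "U": "user tracking (free and alloc)",
    "T": "trace",
}


def describe_slub_debug(value):
    """Explain a slub_debug= value such as 'FZ' or 'FZ,kmalloc-64'."""
    flags, _, cache = value.partition(",")
    unknown = [f for f in flags if f not in SLUB_DEBUG_FLAGS]
    if unknown:
        raise ValueError(f"flags not covered by this sketch: {unknown}")
    return {
        "checks": [SLUB_DEBUG_FLAGS[f] for f in flags],
        "caches": cache or "all caches",
    }


print(describe_slub_debug("FZ"))
print(describe_slub_debug("FZ,kmalloc-64"))
```

So `slub_debug=FZ` asks for sanity checks plus red zoning on every cache, which is the closest upstream analogue to the canary check discussed here.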

dbaxa commented 7 years ago

Will do.

thestinger commented 7 years ago

I can make a patch causing it to warn instead of triggering a kernel oops as another approach. Unfortunately, I can't debug it myself without a way to reproduce the issue, and so far it doesn't seem other people have run into it.

dbaxa commented 7 years ago

> I can make a patch causing it to warn instead of trigger a kernel oops as another approach.

Sounds good to me. Especially since I don't see anything odd in my kernel logs after booting with slub_debug=FZ.

Thank you for your patience with my bug report.

nmatt0 commented 7 years ago

FYI: When building the kernel without SLAB_CANARY I get the following:

  CC      mm/slub.o
mm/slub.c: In function ‘kmem_cache_alloc_bulk’:
mm/slub.c:3222:9: warning: unused variable ‘k’ [-Wunused-variable]
  int i, k;
         ^

thestinger commented 7 years ago

@nmatt0 https://github.com/copperhead/linux-hardened/commit/786cb0888040dabe404a33039248ee6ff4407169 should fix that warning.

thestinger commented 7 years ago

@dbaxa If you have time to try something else, you can change BUG_ON in this line in mm/slub.c to WARN_ON:

    BUG_ON(*canary != get_canary_value(canary, value));

So, to this:

    WARN_ON(*canary != get_canary_value(canary, value));
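
The difference between the two macros can be illustrated with a small user-space simulation. This is only a sketch: the real check lives in `mm/slub.c`, and the body of `get_canary_value` below (XOR of the canary's address with a per-cache secret) is an assumption made for illustration, as are the `Slot` class and its `fatal` parameter.

```python
import warnings

WORD_MASK = (1 << 64) - 1


def get_canary_value(canary_addr, secret):
    """Hypothetical stand-in: mix the canary's address with a secret."""
    return (canary_addr ^ secret) & WORD_MASK


class Slot:
    """One simulated allocation: `size` payload bytes followed by a canary."""

    def __init__(self, addr, size, secret):
        self.addr = addr
        self.size = size
        self.secret = secret
        self.canary = get_canary_value(addr + size, secret)

    def check(self, fatal=True):
        """Return True if the canary is intact.

        fatal=True mimics BUG_ON (stop at the first mismatch);
        fatal=False mimics WARN_ON (report the mismatch and continue).
        """
        expected = get_canary_value(self.addr + self.size, self.secret)
        ok = self.canary == expected
        if not ok:
            if fatal:
                raise RuntimeError("canary mismatch")
            warnings.warn("canary mismatch")
        return ok


slot = Slot(addr=0x1000, size=32, secret=0xDEADBEEF)
print(slot.check())   # intact canary passes
slot.canary ^= 1      # simulate a one-bit overflow into the canary word
```

After the simulated corruption, `slot.check(fatal=False)` emits a warning and returns `False`, while `slot.check()` raises. In the kernel the equivalent trade-off is that WARN_ON leaves a trace in the log and keeps running, which is what makes it useful for collecting evidence on a machine that otherwise fails to boot.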

dbaxa commented 7 years ago

@thestinger thank you for the pointer. I'll do that.

dbaxa commented 7 years ago

I changed BUG_ON to WARN_ON as you suggested, but that didn't seem to help. However, disabling CONFIG_SLAB_SANITIZE (I also changed CONFIG_SLUB_DEBUG_ON to y) did result in the system booting and working fine, using the 4.11.6.d patch.

dbaxa commented 7 years ago

@thestinger I was able to reproduce the failure to boot on another, older Dell laptop which doesn't have an NVIDIA card. Also, this laptop doesn't have the NVIDIA driver installed, nor is the CONFIG_DRM_NOUVEAU option enabled.

thestinger commented 7 years ago

There might be multiple issues. I can't do much without a traceback though. If it doesn't work without BUG_ON changed to WARN_ON, then it sounds like there's another problem, but I don't really have anywhere to start on that.

dbaxa commented 7 years ago

@thestinger okay, I'll just keep changing BUG_ON to WARN_ON until something sticks :-)

thestinger commented 7 years ago

@dbaxa So that laptop works with CONFIG_SLAB_CANARY disabled and not with it enabled - no changes to other options? I really need you to try to get logs.

thestinger commented 7 years ago

The traceback they provided appears to be an unrelated issue after all.

dbaxa commented 7 years ago

@thestinger the laptop works with CONFIG_SLAB_CANARY enabled when I disable CONFIG_SLAB_SANITIZE and set CONFIG_SLUB_DEBUG_ON to y.

dbaxa commented 7 years ago

I am going to close this issue for now as my system can boot with a 4.11.6 patched kernel without issue with CONFIG_SLAB_SANITIZE and CONFIG_SLAB_CANARY enabled. For the record I re-enabled CONFIG_FORTIFY_SOURCE and have left CONFIG_SLUB_DEBUG_ON as y. (I am yet to re-enable CONFIG_PAGE_SANITIZE and CONFIG_PAGE_SANITIZE_VERIFY).

thestinger commented 7 years ago

It could just be because CONFIG_SLUB_DEBUG_ON ends up enabling the upstream debug-oriented poisoning and disabling the security-oriented slub sanitization added by linux-hardened.

thestinger commented 7 years ago

You probably don't want CONFIG_SLUB_DEBUG_ON for a hardened kernel, but I thought it might uncover an issue you were hitting.

dbaxa commented 7 years ago

> It could just be because CONFIG_SLUB_DEBUG_ON ends up enabling the upstream debug-oriented poisoning and disabling the security-oriented slub sanitization added by linux-hardened.

Okay. I'll disable CONFIG_SLUB_DEBUG_ON and see what happens.

thestinger commented 7 years ago

If it does break, I think you should open up a new issue, because it's a lot clearer what's going on now and the rest of the thread just confuses things.

thestinger commented 7 years ago

@dbaxa The problem is that you weren't enabling SLAB_HARDENED, which should have been a dependency of SLAB_CANARY. I've corrected the issue in the 4.12 branch and it will be in the next 4.12 release.
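
Until a tree with the corrected dependency is in use, a `.config` can be checked for this combination before building. The following is a minimal sketch: the `REQUIRES` table encodes only the one relationship stated above, and `enabled_options` and `missing_dependencies` are hypothetical helper names, not part of any kernel tooling.

```python
# Hypothetical dependency table, based only on the statement above that
# SLAB_CANARY should have depended on SLAB_HARDENED.
REQUIRES = {"CONFIG_SLAB_CANARY": ["CONFIG_SLAB_HARDENED"]}


def enabled_options(config_text):
    """Return the set of options set to 'y' in .config-style text."""
    return {
        line.split("=", 1)[0]
        for line in config_text.splitlines()
        if line.startswith("CONFIG_") and line.rstrip().endswith("=y")
    }


def missing_dependencies(config_text, requires=REQUIRES):
    """Return {option: [missing prerequisites]} for enabled options."""
    on = enabled_options(config_text)
    problems = {}
    for option, needed in requires.items():
        if option in on:
            missing = [dep for dep in needed if dep not in on]
            if missing:
                problems[option] = missing
    return problems


broken = "CONFIG_SLAB_CANARY=y\n# CONFIG_SLAB_HARDENED is not set\n"
fixed = "CONFIG_SLAB_CANARY=y\nCONFIG_SLAB_HARDENED=y\n"
print(missing_dependencies(broken))
print(missing_dependencies(fixed))
```

An empty result means every option listed in the table has its prerequisites enabled; anything else names the option to revisit before building.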