Recent kernel causes -fPIE ASan executables to abort on x86_64

jcowgill commented 7 years ago

This is just a heads-up about a Linux kernel commit that was recently merged and is pending on a number of stable queues: torvalds/linux@eab09532d40090698b05a07c1c87f39fdbc5fab5

It seems to move the default load address for -fPIE executables into the location ASan uses for its shadow memory map (on x86_64). This then causes ASan to abort on startup. Example error:

$ ./a.out
==5661==Shadow memory range interleaves with an existing memory mapping. ASan cannot proceed correctly. ABORTING.
==5661==ASan shadow was supposed to be located in the [0x00007fff7000-0x10007fff7fff] range.
==5661==Process memory map follows:
    0x000cb5280000-0x000cb5281000   /var/tmp/a.out
    0x000cb5480000-0x000cb5481000   /var/tmp/a.out
    0x000cb5481000-0x000cb5482000   /var/tmp/a.out
    0x7f6d4f9ca000-0x7f6d4fd1c000   
    0x7f6d4fd1c000-0x7f6d4fd32000   /lib/x86_64-linux-gnu/libgcc_s.so.1
    0x7f6d4fd32000-0x7f6d4ff31000   /lib/x86_64-linux-gnu/libgcc_s.so.1
    0x7f6d4ff31000-0x7f6d4ff32000   /lib/x86_64-linux-gnu/libgcc_s.so.1
    0x7f6d4ff32000-0x7f6d4ff33000   /lib/x86_64-linux-gnu/libgcc_s.so.1
[...]

With ASLR enabled, you can sometimes get lucky with the load address and the program runs, but most of the time ASan aborts with this error.
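
To make the conflict concrete, here is a minimal sketch (not taken from the ASan sources) that checks a mapping from the output above against the reported shadow range. The bounds are copied from the error message; the helper name `overlaps_shadow` is made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Shadow range as printed in the error message (inclusive bounds). */
#define SHADOW_BEG UINT64_C(0x00007fff7000)
#define SHADOW_END UINT64_C(0x10007fff7fff)

/* Hypothetical helper: true if the half-open mapping [start, end)
 * intersects the shadow range. */
static bool overlaps_shadow(uint64_t start, uint64_t end) {
    return start <= SHADOW_END && end > SHADOW_BEG;
}
```

The first a.out mapping (0x000cb5280000-0x000cb5281000) falls inside the range, so the check returns true for it, while the libgcc_s.so.1 mapping at 0x7f6d4fd1c000 lies above the range and does not conflict.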

Is it possible for ASan to be a bit more flexible about where it places the shadow map on startup to fix this?

kcc commented 7 years ago

It is possible, but at a code size and execution time cost that we are not willing to pay. Is there any chance of getting the kernel to cooperate?

kcc commented 7 years ago

This would not be the first time a kernel change has broken the sanitizers. The last significant one was by H.J. Lu when he changed the base from 0x7.... to 0x555.... It caused lots of trouble for us in msan and tsan.

What we really need here is to tell at link time where the shadow is. AFAICT, there is no such capability currently.

rnk commented 7 years ago

I always wondered if it would be possible to express the shadow mapping as an ELF program header. That would be the ultimate way to communicate shadow memory needs to the kernel.

jcowgill commented 7 years ago

I'm not sure - I'm just a user who happened to stumble across the bug. You might be able to get them to change where the executable gets mapped, but they could argue that PIE executables should be prepared to be loaded at any address.

What we really need here is to tell at link time where the shadow is.

I don't see how that is possible with PIE / ASLR. The entire point is that you don't know where the executable will be loaded, so you can't know what bits of memory will be free until runtime.

kcc commented 7 years ago

@dvyukov @xairy @ramosian-glider FYI

pcc commented 7 years ago

We could have a program header that means "please reserve the first N bytes of the address space for the application". Then the kernel can use that as a minimum for ELF_ET_DYN_BASE.
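
A rough sketch of how such a reservation could feed into the load address, purely as an illustration of the proposal. No such program header exists today; `pie_load_floor` is a hypothetical helper, and the 4 GiB default is an assumption about the 64-bit x86 value after the kernel change.

```c
#include <stdint.h>

/* Assumption: default PIE base on 64-bit x86 after the kernel change. */
#define ASSUMED_ELF_ET_DYN_BASE UINT64_C(0x100000000)

/* Hypothetical: lowest address the kernel would consider for a PIE image,
 * given the number of low bytes the executable asked to keep free. */
static uint64_t pie_load_floor(uint64_t reserved_low_bytes) {
    return reserved_low_bytes > ASSUMED_ELF_ET_DYN_BASE
               ? reserved_low_bytes
               : ASSUMED_ELF_ET_DYN_BASE;
}
```

An ASan binary would then request everything up to the end of its shadow (0x10007fff8000 for the range in the error message above), and ordinary PIE binaries would request nothing and keep the default.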

kcc commented 7 years ago

@dvyukov can you confirm that the fresh kernel breaks the sanitizers?

bennofs commented 7 years ago

I think I am hitting this bug:

$  ./loadaddr 
==16572==Shadow memory range interleaves with an existing memory mapping. ASan cannot proceed correctly. ABORTING.
==16572==ASan shadow was supposed to be located in the [0x00007fff7000-0x10007fff7fff] range.
==16572==Process memory map follows:
    0x04daa6e91000-0x04daa6fc6000   /tmp/loadaddr
    0x04daa71c6000-0x04daa71c7000   /tmp/loadaddr
    0x04daa71c7000-0x04daa71ca000   /tmp/loadaddr
    0x04daa71ca000-0x04daa7e2f000   
    0x7b742c072000-0x7b742c3c4000   
    0x7b742c3c4000-0x7b742c559000   /nix/store/l48biijfr1j6d5kdg911051x2phfjrz7-glibc-2.25/lib/libc-2.25.so
    0x7b742c559000-0x7b742c759000   /nix/store/l48biijfr1j6d5kdg911051x2phfjrz7-glibc-2.25/lib/libc-2.25.so
    0x7b742c759000-0x7b742c75d000   /nix/store/l48biijfr1j6d5kdg911051x2phfjrz7-glibc-2.25/lib/libc-2.25.so
    0x7b742c75d000-0x7b742c75f000   /nix/store/l48biijfr1j6d5kdg911051x2phfjrz7-glibc-2.25/lib/libc-2.25.so
    0x7b742c75f000-0x7b742c763000   
    0x7b742c763000-0x7b742c779000   /nix/store/l48biijfr1j6d5kdg911051x2phfjrz7-glibc-2.25/lib/libgcc_s.so.1
    0x7b742c779000-0x7b742c978000   /nix/store/l48biijfr1j6d5kdg911051x2phfjrz7-glibc-2.25/lib/libgcc_s.so.1
    0x7b742c978000-0x7b742c979000   /nix/store/l48biijfr1j6d5kdg911051x2phfjrz7-glibc-2.25/lib/libgcc_s.so.1
    0x7b742c979000-0x7b742c97c000   /nix/store/l48biijfr1j6d5kdg911051x2phfjrz7-glibc-2.25/lib/libdl-2.25.so
    0x7b742c97c000-0x7b742cb7b000   /nix/store/l48biijfr1j6d5kdg911051x2phfjrz7-glibc-2.25/lib/libdl-2.25.so
    0x7b742cb7b000-0x7b742cb7c000   /nix/store/l48biijfr1j6d5kdg911051x2phfjrz7-glibc-2.25/lib/libdl-2.25.so
    0x7b742cb7c000-0x7b742cb7d000   /nix/store/l48biijfr1j6d5kdg911051x2phfjrz7-glibc-2.25/lib/libdl-2.25.so
    0x7b742cb7d000-0x7b742cc8e000   /nix/store/l48biijfr1j6d5kdg911051x2phfjrz7-glibc-2.25/lib/libm-2.25.so
    0x7b742cc8e000-0x7b742ce8e000   /nix/store/l48biijfr1j6d5kdg911051x2phfjrz7-glibc-2.25/lib/libm-2.25.so
    0x7b742ce8e000-0x7b742ce8f000   /nix/store/l48biijfr1j6d5kdg911051x2phfjrz7-glibc-2.25/lib/libm-2.25.so
    0x7b742ce8f000-0x7b742ce90000   /nix/store/l48biijfr1j6d5kdg911051x2phfjrz7-glibc-2.25/lib/libm-2.25.so
    0x7b742ce90000-0x7b742ce97000   /nix/store/l48biijfr1j6d5kdg911051x2phfjrz7-glibc-2.25/lib/librt-2.25.so
    0x7b742ce97000-0x7b742d096000   /nix/store/l48biijfr1j6d5kdg911051x2phfjrz7-glibc-2.25/lib/librt-2.25.so
    0x7b742d096000-0x7b742d097000   /nix/store/l48biijfr1j6d5kdg911051x2phfjrz7-glibc-2.25/lib/librt-2.25.so
    0x7b742d097000-0x7b742d098000   /nix/store/l48biijfr1j6d5kdg911051x2phfjrz7-glibc-2.25/lib/librt-2.25.so
    0x7b742d098000-0x7b742d0b1000   /nix/store/l48biijfr1j6d5kdg911051x2phfjrz7-glibc-2.25/lib/libpthread-2.25.so
    0x7b742d0b1000-0x7b742d2b0000   /nix/store/l48biijfr1j6d5kdg911051x2phfjrz7-glibc-2.25/lib/libpthread-2.25.so
    0x7b742d2b0000-0x7b742d2b1000   /nix/store/l48biijfr1j6d5kdg911051x2phfjrz7-glibc-2.25/lib/libpthread-2.25.so
    0x7b742d2b1000-0x7b742d2b2000   /nix/store/l48biijfr1j6d5kdg911051x2phfjrz7-glibc-2.25/lib/libpthread-2.25.so
    0x7b742d2b2000-0x7b742d2b6000   
    0x7b742d2b6000-0x7b742d2d9000   /nix/store/l48biijfr1j6d5kdg911051x2phfjrz7-glibc-2.25/lib/ld-2.25.so
    0x7b742d4a8000-0x7b742d4bc000   
    0x7b742d4c0000-0x7b742d4d9000   
    0x7b742d4d9000-0x7b742d4da000   /nix/store/l48biijfr1j6d5kdg911051x2phfjrz7-glibc-2.25/lib/ld-2.25.so
    0x7b742d4da000-0x7b742d4db000   /nix/store/l48biijfr1j6d5kdg911051x2phfjrz7-glibc-2.25/lib/ld-2.25.so
    0x7b742d4db000-0x7b742d4dc000   
    0x7fff06e24000-0x7fff06e46000   [stack]
    0x7fff06ef0000-0x7fff06ef2000   [vvar]
    0x7fff06ef2000-0x7fff06ef4000   [vdso]
    0xffffffffff600000-0xffffffffff601000   [vsyscall]
==16572==End of process memory map.
c-cube:/tmp uname -a
Linux c-cube 4.9.39 #1-NixOS SMP Fri Jul 21 05:42:36 UTC 2017 x86_64 GNU/Linux

(Just compiled a trivial Hello world with -fsanitize=address)

bennofs commented 7 years ago

A possible workaround seems to be the following:

$  .../ld-2.25.so ./loadaddr

That way, loadaddr will be loaded by ld.so, which uses mmap so loadaddr ends up in the mmap region which is way higher than the PIE base.

(Yes, my ld.so is in a weird path, that's just NixOS things :)

FSMaxB commented 7 years ago

I independently bisected this in the kernel and opened a bug there: https://bugzilla.kernel.org/show_bug.cgi?id=196537 but didn't have a lot of knowledge about the underlying issues.

richfelker commented 7 years ago

Bringing this over from twitter (https://twitter.com/kayseesee/status/894594085608013825), my basic view is that this is a bug in the ASAN library code. Assuming you can use a particular virtual address range is not valid (it could already be in use for some reason, as you're now seeing), and even if it were valid, it's not safe for something that can be used in deployment; it exposes potentially sensitive information at an attacker-known address. ASAN simply needs to pay the cost of using a variable address chosen at runtime.

kcc commented 7 years ago

@richfelker ASAN has been using fixed addresses since 2011. I know the kernel does not guarantee anything like this, but it worked, and it provided performance and code size benefits over using a dynamic shadow base (which we also have now, as an option, off by default on Linux).

ASAN simply needs to pay the cost of using a variable address chosen at runtime.

That's one way to look at it. But a much better resolution would be to have a kernel<=>userspace interface that allows the use of a fixed address. And in the meantime, revert the change that broke ASAN.

safe for something that can be used in deployment

If you want to discuss this topic, please open a separate issue, let's not mix too many things in a single place.

richfelker commented 7 years ago

Like I said on the initial Twitter thread, I don't think I have much of value to say beyond "I think what you're doing is badly wrong" and "it happened to work before is not a good argument to do it (or for changing the kernel)". If we disagree then we disagree...

FSMaxB commented 7 years ago

@kcc: You mentioned a dynamic shadow base. Could you please elaborate on that.

Is that available in the current stable release of LLVM? And if yes, can you point me to some documentation please.

I think that information would be useful for downstream projects that find the runtime overhead of a dynamic shadow base is acceptable.

bennofs commented 7 years ago

And in the meantime, revert the change that broke ASAN.

@kcc I don't think this is good advice. Pretty sure that the change fixes some security issue, so you shouldn't revert that.

richfelker commented 7 years ago

I agree strongly with @bennofs. Address assignment/ASLR for production systems should not be tiptoeing around (and possibly impacting security) for the sake of a tool that's only suitable in debugging situations and not production. I'd like ASAN to be usable in production (which is why I mentioned that above) but at present it's not.

kcc commented 7 years ago

One more discussion thread is here: http://marc.info/?t=149973272100048&r=1&w=2

kcc commented 7 years ago

@kcc: You mentioned a dynamic shadow base. Could you please elaborate on that.

In clang there is -mllvm -asan-force-dynamic-shadow=1, which is the default on Windows. I don't think this has been implemented in GCC. This is currently an implementation detail (on windows), not documented.

should not be tiptoeing around

All these arguments are perfectly valid, but who is going to pay for the increased CPU usage and code size? Or, if we end up supporting both configurations on linux (dynamic and static) who is going to pay for the extra maintenance overhead?

We really need to come up with a solution where the application requests a fixed address range at startup and the kernel can't refuse.

FSMaxB commented 7 years ago

@kcc: Forcing the dynamic shadow doesn't work on my system! (Archlinux x86_64 with clang 4.0.1)

kcc commented 7 years ago

@FSMaxB please open a separate bug with details. But please note: this flag is not officially supported.

richfelker commented 7 years ago

Requesting a fixed address range at startup is non-PIE. Normal non-PIE ELF already has a way to do that: PT_LOAD segments (e.g. with PROT_NONE or just BSS you can MAP_FIXED over later). The whole point of an executable being PIE is that it doesn't demand specific addresses.

Being that current kernels don't, and future kernels probably won't, support the invalid usage of assuming a particular fixed address range is free, the fixed address mode should just be removed and dynamic always used. This will simplify the amount of code that needs to be maintained anyway (since Windows already needs dynamic). Performance is not likely to be significantly worse, but ASAN already performs badly and is intended and understood as a costly (but less so than some other approaches) tool for debugging (and possibly in the future, for hardening).

kcc commented 7 years ago

Asan's shadow being at a fixed offset does not really contradict PIE -- the rest of the addresses could be anywhere they want to (except for the shadow region).

BTW, I am trying to get the fresh perf numbers on spec for static vs dynamic shadow.

richfelker commented 7 years ago

The view I'm putting forward, which you're free to disagree with but I think is worthwhile, is that the definition of PIE is "no fixed mappings", not "some non-fixed mappings". In this definition, PIE ELF programs can even be loaded in rather esoteric environments like a shared address space (multiple programs in the same process) or a nommu system (where all processes share an address space). There are very good reasons to consider any fixed mappings a design bug; in places where they've been used recently, they've repeatedly come back to bite the designers and users. The Linux/glibc x86_64 "vsyscall" mess, ARM kuserhelper page, etc. come to mind.

richfelker commented 7 years ago

BTW my view of these matters is somewhat broader than "Linux" because I'm thinking of/interested in the usage case of non-Linux implementations loading and executing programs using the Linux user-kernel ABI. This sort of generality is part of why I disagree with the view that the kernel is obligated to lay out memory the same way past versions did.

yugr commented 7 years ago

@kcc

BTW, I am trying to get the fresh perf numbers on spec for static vs dynamic shadow.

May make sense to measure sanitized DSOs (where __asan_shadow_memory_dynamic_address is GOT-relocated), rather than sanitized executables.

yugr commented 7 years ago

@richfelker

I'd like ASAN to be usable in production (which is why I mentioned that above) but at present it's not.

Relevant discussion in oss-security

kcc commented 7 years ago

I've done an overnight run of SPEC2006 on my machine. The results are surprisingly close. But the run-to-run variation is too high, I'll need to find a less noisy machine.

           benchmark,       static,      dynamic, dynamic/static
       400.perlbench,      1605.00,      1647.00,         1.03   << dynamic is 3% slower
           401.bzip2,       779.00,       797.00,         1.02
             403.gcc,       660.00,       686.00,         1.04
             429.mcf,       593.00,       503.00,         0.85   << very noisy test
           445.gobmk,       960.00,       956.00,         1.00
           456.hmmer,       809.00,       812.00,         1.00
           458.sjeng,      1214.00,      1227.00,         1.01
      462.libquantum,       435.00,       442.00,         1.02
         464.h264ref,      1193.00,      1207.00,         1.01
         471.omnetpp,       881.00,       904.00,         1.03
           473.astar,       704.00,       672.00,         0.95  << dynamic is 5% faster!
       483.xalancbmk,      1252.00,      1216.00,         0.97
            433.milc,       860.00,       837.00,         0.97
            444.namd,       583.00,       590.00,         1.01
          447.dealII,      1659.00,      1627.00,         0.98
          450.soplex,       454.00,       476.00,         1.05
          453.povray,       648.00,       630.00,         0.97
             470.lbm,       478.00,       460.00,         0.96
         482.sphinx3,       811.00,       798.00,         0.98

I was also surprised to see that the code size with dynamic shadow is actually better (~0.3%). Well, looking at the objdump it makes sense:

Dynamic:

 9a8a66:       80 3c 01 00             cmpb   $0x0,(%rcx,%rax,1)

Static:

  41fd36:       80 b8 00 80 ff 7f 00    cmpb   $0x0,0x7fff8000(%rax)
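
In C terms, the two instructions above correspond roughly to the following address computations. This is only an illustrative sketch, not the instrumentation ASan actually emits: `dynamic_shadow_base` is a hypothetical stand-in for the runtime's shadow base global, and the function names are made up.

```c
#include <stdint.h>

/* Static mode: the offset is a compile-time constant, so the compiler can
 * fold it into the instruction as a 32-bit displacement. */
#define STATIC_SHADOW_OFFSET UINT64_C(0x7fff8000)

/* Dynamic mode: the base is only known at run time and lives in a global
 * (hypothetical stand-in for the runtime's variable). */
static uint64_t dynamic_shadow_base;

static uint64_t shadow_addr_static(uint64_t addr) {
    return (addr >> 3) + STATIC_SHADOW_OFFSET;
}

static uint64_t shadow_addr_dynamic(uint64_t addr) {
    return (addr >> 3) + dynamic_shadow_base;
}
```

With the base held in a register, the dynamic compare needs no displacement bytes, which is consistent with the 4-byte vs 7-byte encodings in the objdump output; the cost is that the base has to be loaded from memory first.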

Next steps:

find a proper noise-free machine for benchmarking
check what happens with PIC/PIE, where loading the shadow base is more expensive
check what's going on on ARM (I'll certainly need help with that)

kcc commented 7 years ago

The difference between regular executables and PIE:

Regular:

  4e7f74:   4c 8b 35 9d 2c 44 00    mov    0x442c9d(%rip),%r14        # 92ac18 <__asan_shadow_memory_dynamic_address>

PIE (or -shared-libasan):

   e9504:   48 8d 05 0d 27 44 00    lea    0x44270d(%rip),%rax        # 52bc18 <__asan_shadow_memory_dynamic_address>
   e950b:   4c 8b 30                mov    (%rax),%r14

pcc commented 7 years ago

It looks like the linker is applying relocation relaxation in the PIE/-shared-libasan case, so we end up with a single indirection in the final executable. If you look at the object files you should see two mov instructions.

Are you sure you are linking against the libasan DSO when you build with -shared-libasan? I'd expect to see two movs in the executable unless libasan is being linked statically.

kcc commented 7 years ago

% clang++ -fsanitize=address -O1  a.cc -mllvm -asan-force-dynamic-shadow=1  && objdump -d a.out | grep "<main>:" -A 6 
  4e7f74:   4c 8b 35 9d 2c 44 00    mov    0x442c9d(%rip),%r14        # 92ac18 <__asan_shadow_memory_dynamic_address>
% clang++ -fsanitize=address -O1  a.cc -mllvm -asan-force-dynamic-shadow=1  -shared-libasan  && objdump -d a.out | grep "<main>:" -A 6 
  4007a4:   4c 8b 35 b5 08 20 00    mov    0x2008b5(%rip),%r14        # 601060 <__TMC_END__>
% clang++ -fsanitize=address -O1  a.cc -mllvm -asan-force-dynamic-shadow=1  -fPIE -pie  && objdump -d a.out | grep "<main>:" -A 6 
   e9504:   48 8d 05 0d 27 44 00    lea    0x44270d(%rip),%rax        # 52bc18 <__asan_shadow_memory_dynamic_address>
   e950b:   4c 8b 30                mov    (%rax),%r14
% clang++ -fsanitize=address -O1  a.cc -mllvm -asan-force-dynamic-shadow=1  -fPIE -pie -shared-libasan  && objdump -d a.out | grep "<main>:" -A 6 
 984:   48 8b 05 6d 06 20 00    mov    0x20066d(%rip),%rax        # 200ff8 <_DYNAMIC+0x258>
 98b:   4c 8b 30                mov    (%rax),%r14
% ldd a.out | grep asan 
    libclang_rt.asan-x86_64.so => not found

So, -fPIE -pie -shared-libasan gives us two loads.

kcc commented 7 years ago

% clang++ -fsanitize=address -O1  a.cc -mllvm -asan-force-dynamic-shadow=1  -fPIC -shared  && objdump -d a.out | grep "<main>:" -A 6 
 874:   48 8b 05 7d 07 20 00    mov    0x20077d(%rip),%rax        # 200ff8 <_DYNAMIC+0x218>
 87b:   4c 8b 30                mov    (%rax),%r14

rnk commented 7 years ago

If we care about ELF + dynamic shadow base, we should duplicate the shadow base global into every DSO. We could add a hidden visibility comdat global with the shadow base to every object file and let the linker merge them. A high priority initializer would set it. This is similar to what we do on Windows.
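
Expressed with GCC/Clang attributes in C, the idea might look roughly like the sketch below. Everything here is a placeholder: `local_shadow_base` and `runtime_shadow_base` are hypothetical names, the constant returned is only there so the example runs, `weak` is used as a stand-in for a comdat, and 101 is the highest constructor priority available to user code.

```c
#include <stdint.h>

/* Per-DSO copy of the shadow base. Hidden visibility keeps references local
 * to the DSO; weak lets the linker merge the copies from each object file. */
__attribute__((weak, visibility("hidden"))) uintptr_t local_shadow_base;

/* Placeholder for however the runtime would publish the chosen base. */
static uintptr_t runtime_shadow_base(void) {
    return (uintptr_t)0x7fff8000; /* arbitrary value for illustration */
}

/* High-priority initializer: runs before default-priority constructors
 * in the same DSO. */
__attribute__((constructor(101))) static void init_local_shadow_base(void) {
    local_shadow_base = runtime_shadow_base();
}
```

By the time `main` runs in the same image, `local_shadow_base` has been set by the initializer.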

richfelker commented 7 years ago

That seems workable, but before pulling in heavy machinery like that there should be some justification, i.e. a measurement that shows it makes a significant difference. The whole reason we have this problem to begin with is because somebody decided to do a premature optimization with a fixed shadow base address that apparently made virtually no performance difference...

kcc commented 7 years ago

somebody decided

That was me in 2011, and I've made measurements at that time and they were in favor of my decision. Looks like not any more (not 100% confident though, independent evaluation is welcome)

dvyukov commented 7 years ago

Do we/does it make sense/possible to mark the global with some special attributes so that compiler knows that it never changes in generated code under any circumstances, so that it can freely cache it in a register across functions/calls/loops?

rnk commented 7 years ago

@dvyukov Right now dynamic shadow base is only loaded once per function call. The load (or two loads for DSOs) happens in the prologue, and that value is typically allocated to a register that is live across the whole function. Unfortunately, I think LLVM's rematerialization is primitive. It mostly rematerializes constants.

dtzWill commented 7 years ago

Do we/does it make sense/possible to mark the global with some special attributes so that compiler knows that it never changes in generated code under any circumstances, so that it can freely cache it in a register across functions/calls/loops?

At least in LLVM you can: global declarations can be marked constant for pretty much this purpose. Excerpt from the LLVM LangRef:

LLVM explicitly allows declarations of global variables to be marked constant, even if the final definition of the global is not. This capability can be used to enable slightly better optimization of the program, but requires the language definition to guarantee that optimizations based on the ‘constantness’ are valid for the translation units that do not include the definition.

dtzWill commented 7 years ago

That was me in 2011, and I've made measurements at that time and they were in favor of my decision. Looks like not any more (not 100% confident though, independent evaluation is welcome)

I've never benchmarked ASAN but I've benchmarked thoroughly various shadow-memory systems (taint tracking, etc.) and I can confirm that a constant shadow location is a small but significant optimization. I can search to see if I have any charts handy.

The biggest win IIRC was that a constant address let you be clever about selecting your shadow memory range such that mapping program pointers to their shadow location could be done in fewer instructions (how many depended on the "density" of the mapping, 1:1 or are you bit-packing?).

I was also inlining the runtime, not sure what ASAN does in this regard.

(there are multiple papers about the efficient engineering of these things, FWIW)

kcc commented 7 years ago

ASan's mapping is 8=>1 (no bit packing though, details here: https://github.com/google/sanitizers/wiki/AddressSanitizerAlgorithm#mapping)

When I last checked a few years ago, the big difference was between using 0, 0x7fff8000 and something like (1ULL << 43). '0' is the fastest and provides the smallest code but does not work with non-PIE binaries on linux (we use 0 base on Android)

(1ULL << 43) or some such was used for a while, but then Jakub Jelinek suggested 0x7fff8000 as a compromise between 0 and (1ULL << 43). 0x7fff8000 on x86_64 gave us most of the code size and most of the performance of 0 with a much greater compatibility.
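
A small sketch of the mapping with the three offsets mentioned above. The helper names are made up, and `fits_disp32` reflects my understanding of why 0x7fff8000 is cheap to encode (x86_64 memory operands take a sign-extended 32-bit displacement), not something stated in the ASan sources.

```c
#include <stdbool.h>
#include <stdint.h>

/* The three candidate shadow offsets discussed above. */
#define OFFSET_ZERO  UINT64_C(0)
#define OFFSET_SMALL UINT64_C(0x7fff8000)
#define OFFSET_LARGE (UINT64_C(1) << 43)

/* 8=>1 mapping: one shadow byte describes 8 application bytes. */
static uint64_t mem_to_shadow(uint64_t addr, uint64_t offset) {
    return (addr >> 3) + offset;
}

/* True if the offset can be encoded as a sign-extended 32-bit displacement
 * directly in the memory operand of an x86_64 instruction. */
static bool fits_disp32(uint64_t offset) {
    return offset <= (uint64_t)INT32_MAX;
}
```

With OFFSET_SMALL, the highest 47-bit user address 0x7fffffffffff maps to 0x10007fff7fff, which matches the upper bound of the shadow range printed in the error messages earlier in this thread. OFFSET_LARGE does not fit in a 32-bit displacement, so it would need to be materialized in a register.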

chefmax commented 7 years ago

Forcing the dynamic shadow doesn't work on my system! (Archlinux x86_64 with clang 4.0.1)

AFAIK dynamic shadow isn't supported in ASan runtime for Linux (FindAvailableMemoryRange contains UNREACHABLE) so that's expected. A possible implementation would be just to mmap a large chunk for shadow, probably with some hint, in this routine.
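
A minimal sketch of that idea, assuming Linux mmap semantics. This is not the actual FindAvailableMemoryRange code; `reserve_shadow` is a hypothetical helper, and the choice of hint is left to the caller.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS / MAP_NORESERVE on glibc */
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: ask the kernel for `size` bytes of address space,
 * preferably near `hint`. Without MAP_FIXED the address is only a hint, so
 * the caller must use whatever address is returned as the shadow base.
 * Returns NULL on failure. */
static void *reserve_shadow(void *hint, size_t size) {
    void *p = mmap(hint, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

Because the kernel is free to ignore the hint, existing mappings are never clobbered; the runtime would record the returned address as its dynamic shadow base.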

check what's going on on ARM (I'll certainly need help with that)

FYI I'm trying to get numbers on my ARM Linux board, but I won't have results until the middle of next week (SPEC2006 is very time-consuming on my weak ARM board).

In clang there is -mllvm -asan-force-dynamic-shadow=1, which is the default on Windows. I don't think this has been implemented in GCC.

Yes, this is not implemented in GCC, but I don't think it's hard to do (I have a patch that passes GCC ASan bootstrap, but it needs some polishing).

chefmax commented 7 years ago

FYI I'm trying to get numbers on my ARM Linux board, but I won't have results until the middle of next week (SPEC2006 is very time-consuming on my weak ARM board).

So, I've got some numbers on my ARM Linux board. I used the SPEC2006 train size (the board almost died under ref), but even with train, the noise between test runs was quite low (~1%) for most tests (except perl and hmmer, where the noise was ~3%):

Static CFLAGS=-O2 -fPIC -pie -shared-libasan

Dynamic CFLAGS=-O2 -fPIC -mllvm -asan-force-dynamic-shadow=1 -pie -shared-libasan

Processor:

processor       : 0
model name      : ARMv7 Processor rev 4 (v7l)
Features        : swp half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm 
CPU implementer : 0x41
CPU architecture: 7
CPU variant     : 0x0
CPU part        : 0xc0f
CPU revision    : 4

Test	static	dynamic	dynamic slowdown (less is better)
400.perlbench	401	413	2.9%
401.bzip2	232	238	2.5%
429.mcf	99.5	101	1.5%
445.gobmk	918	921	0.3%
456.hmmer	292	300	2.7%
458.sjeng	1610	1622	0.7%
471.omnetpp	850	852	0.2%
473.astar	415	425	2.4%
483.xalancbmk	774	777	0.4%
433.milc	78.4	79.9	1.9%
444.namd	52.7	54.4	3.2%
447.dealII	175	192	9.7%
450.soplex	38.2	38.7	1.3%
453.povray	78.2	81.6	4.3%
470.lbm	197	197	0.0%

eugenis commented 7 years ago

Btw, ASan on 32-bit Android maps shadow at 0000 0000 .. 2000 0000, because all executables are PIE, and it is slightly faster that way (and requires less code). This is now broken.

eugenis commented 7 years ago

If we care about ELF + dynamic shadow base, we should duplicate the shadow base global into every DSO. We could add a hidden visibility comdat global with the shadow base to every object file and let the linker merge them. A high priority initializer would set it. This is similar to what we do on Windows.

This will not always work. If library A depends on library B, then a constructor of B may call A before A's constructors have run.

morehouse commented 6 years ago

The kernel commit was ultimately reverted. Do we want to keep this issue open?

eugenis commented 6 years ago

I don't think it was reverted.

eugenis commented 6 years ago

Oh, I think it was reverted in Ubuntu kernel, but not in upstream.

CLanguagePurist commented 2 years ago

I am writing this for everyone who is trying to find a solution to the problem of running the sanitizer on Linux and arrives at this thread from googling. As you might infer from the problem described in this thread, you have to disable ASLR on Linux via the "nokaslr" option to be able to run the sanitizer, but that puts you at a potential security risk, so what I would recommend is the following:

1. Create a virtual machine
2. Install a Linux distro of your choice
3. Ensure that the distro does not use a hardened Linux kernel variant
4. Configure grub.cfg (or whatever boot configuration you use) to include the "nokaslr" option at the end of the kernel line in the VM
5. Compile your binary with "-fsanitize=address" in both the compile arguments and the linker arguments
6. Reboot the VM, copy your binary over to it, and run it to view the sanitizer output

rnk commented 2 years ago

If we need to fix this, I think the best solution would be to use the dynamic shadow offset feature (-mllvm -asan-force-dynamic-shadow=1) already used on other OSs. My understanding is that the majority of non-Linux platforms (Windows, Android, Mac, iOS) use a dynamic shadow memory base address.

@kcc had concerns in 2017 about the performance of this change. He ran some benchmarks in this comment, and the results were in the noise. If someone can produce new results on a less noisy machine, I don't think there are any other objections. We can make the change and fix this issue for good.

Maybe this interacts with the new ASan codegen that @kda added, I'm not sure.

CLanguagePurist commented 2 years ago

I don't know if they will ever come around to fixing this issue, since it has been around for 5 years; I submitted a workaround to use until then.

thurstond commented 1 year ago

Oh, I think it was reverted in Ubuntu kernel, but not in upstream.

It looks like it was reverted upstream in August 2017: https://github.com/torvalds/linux/commit/c715b72c1ba406f133217b509044c38d8e714a37

There was then a minor fix in November 2017 for 5-level-paging (https://github.com/torvalds/linux/commit/be739f4b5ddece74ef25e2304b17a7fd24575e9b), but it has no impact on this issue; that's the last time ELF_ET_DYN_BASE was modified for x64.

This means there is only a very narrow time window between when the breaking change was made (July 2017) and when it was reverted (August 2017); any kernel outside of that 5-week period should be compatible with ASan.

google/sanitizers issue #837