cebix / macemu

Basilisk II and SheepShaver Macintosh emulators
1.38k stars 288 forks source link

Fix crash on aarch64 linux. #225

Closed SegHaxx closed 3 years ago

SegHaxx commented 3 years ago

Hi Basilisk II fails to start a VM on Debian aarch64, in my case a Raspi4.

With some printf debugging I discovered it is because MAP_BASE defaults to 0x0 which is unavailable for security reasons on recent kernels thus the memory allocator falls back on a standard allocation way up in 64bit address space, which fails the address sanity test and returns a misleading "not enough memory" error to the user.

$ build/nojit/BasiliskII-nojit 
Basilisk II V1.0 by Christian Bauer et al.
mmap next=(nil) got=0x7f9df00000
ERROR: Not enough free memory.

Removing the sanity check results in a VM crash so it would seem low addresses are in fact required for now.

The easy fix is to bump MAP_BASE up past the security/debug guard. I've added some comments explaining the issues involved and the choices made.

This is all a bit ugly and flaky and the real fix would be to make Basilisk II not require low addresses, but that's another patch for another day.

I have not touched the linux i386|FreeBSD|HAVE_LINKER_SCRIPT case as i don't know the issues there and i assume that case is working properly.

Resolves #188

rickyzhang82 commented 3 years ago

No offense. Do you fully understand all 3 addressing mode?

But you really need to understand the memory addressing mode in BII before making this drastic change in this PR.

What is the addressing mode in your configure?

SegHaxx commented 3 years ago

I'm using the default which results in "direct". Setting --enable-addressing=real fails and falls back on "memory banks" which also works for me.

checking for the addressing mode to use... 
configure: WARNING: Sorry, no suitable addressing mode in real
Addressing mode ........................ : memory banks

Just read TECH. If I understand correctly I broke REAL_ADDRESSING=1? So just check for that?

This is just making me further question the fact that linux i386 and FreeBSD set MAP_BASE = 0x10000000 while TECH claims those platforms work with real addressing. :)

As an experiment, I tried setting vm.mmap_min_addr="0" and it still fails to allocate at 0x0 on Debian aarch64 and CentOS 8 x86_64 so yeah, the linux world is very determined to make that impossible going forward and any security minded OS is likely to do the same. See https://access.redhat.com/articles/20484 and regarding FreeBSD they're doing it too: https://utcc.utoronto.ca/~cks/space/blog/unix/MunmapPageZero

Pushed an update that should preserve real addressing on whatever platforms can even do that anymore. (autoconf appears to test for this directly on unix, and macOS has its own code path I didn't touch)

rickyzhang82 commented 3 years ago

I strongly opposed this PR.

Please read the wiki I wrote on addressing.

I'd recommend you to use memory bank and kill this PR.

SegHaxx commented 3 years ago

OK let me explain, the problem here is ALL addressing modes are broken on aarch64 (and every 64bit arch on Linux other than x86_64). This patch fixes all addressing modes. ALL addressing modes are broken without this patch. Basilisk II is completely unusable on aarch64 (and PPC64 and RISC-V) without this patch.

x86_64 works because 1) it and ONLY it implements MAP_32BIT, you can confirm this at https://elixir.bootlin.com/linux/latest/ident/MAP_32BIT so when the allocation fails at the requested address it at least falls back to an address low enough to not crash

...and 2) linux x86/x86_64 (and FreeBSD) uses MAP_BASE=0x10000000 so REAL ADDRESSING IS BROKEN THERE ALREADY AND IS UNAFFECTED BY MY PATCH. Read the code. And you can confirm it with my printf debug.

--- a/BasiliskII/src/CrossPlatform/vm_alloc.cpp
+++ b/BasiliskII/src/CrossPlatform/vm_alloc.cpp
@@ -270,6 +270,7 @@ void * vm_acquire(size_t size, int options)

        if ((addr = mmap((caddr_t)next_address, size, VM_PAGE_DEFAULT, the_map_flags, fd, 0)) == (void *)MAP_FAILED)
                return VM_MAP_FAILED;
+       printf("mmap next=%p got=%p\n",next_address,addr);

        // Sanity checks for 64-bit platforms
        if (sizeof(void *) == 8 && (options & VM_MAP_32BIT) && !((char *)addr <= (char *)0xffffffff))
$ uname -a
Linux bigtimevm 4.18.0-193.28.1.el8_2.x86_64 #1 SMP Thu Oct 22 00:20:22 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
$ ./BasiliskII
Basilisk II V1.0 by Christian Bauer et al.
mmap next=0x10000000 got=0x10000000
mmap next=0x12100000 got=0x12100000
Reading ROM file...
selected Ethernet device type slirp
Using SDL/pulseaudio audio output
Using SDL_Renderer driver: opengl
mmap next=0x12110000 got=0x12110000
mmap next=0x1223e000 got=0x1223e000
WARNING: RmvTime(000dec36): Descriptor not found

mmap-ing at 0x0 is being disabled in Linux using numerous methods. Not just the sysctl setting, SELinux is being used to disable it as well on CentOS for example. All other unix OS are quickly following suit.

"real" addressing is simply not going to be possible on modern unix for the average user going forwards. But with my current revision it should still work if you can convince the OS to let it happen.

I have confirmed my patch works correctly on x86_64 and aarch64 Linux, with both "direct" and "memory banks" and with JIT enabled on x86_64. Please merge.

rickyzhang82 commented 3 years ago
rickyzhang82 commented 3 years ago

I'm pretty sure you messed up MAP_BASE for no good reason.

It initializes next_address with NULL.

https://github.com/cebix/macemu/blob/d684527b27ecca9b76a7895abd25b1eba5317a1c/BasiliskII/src/CrossPlatform/vm_alloc.cpp#L97

https://github.com/cebix/macemu/blob/d684527b27ecca9b76a7895abd25b1eba5317a1c/BasiliskII/src/CrossPlatform/vm_alloc.cpp#L255

So that when it mmap, the kernel chooses the (page-aligned) address.

SYNOPSIS         top

       #include <sys/mman.h>

       void *mmap(void *addr, size_t length, int prot, int flags,
                  int fd, off_t offset);
       int munmap(void *addr, size_t length);

       See NOTES for information on feature test macro requirements.
DESCRIPTION         top

       mmap() creates a new mapping in the virtual address space of the
       calling process.  The starting address for the new mapping is
       specified in addr.  The length argument specifies the length of the
       mapping (which must be greater than 0).

       If addr is NULL, then the kernel chooses the (page-aligned) address
       at which to create the mapping; this is the most portable method of
       creating a new mapping.  If addr is not NULL, then the kernel takes
       it as a hint about where to place the mapping; on Linux, the kernel
       will pick a nearby page boundary (but always above or equal to the
       value specified by /proc/sys/vm/mmap_min_addr) and attempt to create
       the mapping there.  If another mapping already exists there, the
       kernel picks a new address that may or may not depend on the hint.
       The address of the new mapping is returned as the result of the call.

See mmap doc

PS:

SegHaxx commented 3 years ago

Yes, I have tried virtual address mode a.k.a memory bank. It is broken without my patch on aarch64. It works with my patch on aarch64. The problem here is the behavior of Linux's mmap. It really doesn't want to give you the behavior you want, for good reasons.

YES, this is all an ugly hack. That's the problem. Go ahead, show me how this breaks for some theoretical PPC and ARM7 users and i'll fix it. Until then, it is actually broken on Linux aarch64, a real thing that exists and is currently selling in the millions. https://www.zdnet.com/article/raspberry-pi-now-weve-sold-30-million/

I've read the mmap doc, I've read the implementations of mmap in the source code of Linux and FreeBSD. I have a pretty good handle on mmap behavior on Linux and FreeBSD kernels across all architectures. I've already explained it to you with relevant links in great detail.

JIT on x86_64 Linux works just fine for me. Please merge.

rickyzhang82 commented 3 years ago

Your first message indicates that you have no ideas of addressing in BII. Also, how many RaPi sells is irrelevant with the correctness of the PR. Please focus on the topic.

In ASLR, how can you be so sure that fixing the starting address to 0x00010000 in mmap from your PR is better than "the kernel chooses the (page-aligned) address at which to create the mapping" where BII sets the address to NULL? How do you know nothing maps to the range 0x00010000?

I'm not convince that your PR benefits for other architectures. It is more likely causing others to break.

The debug message that you printed out the mmap address is NOT related to the PR at all.

I'm not that smart to read mmap source code of Linux or FreeBSD. For a complex subject matter such as memory management, I could never claim that I "have a pretty good handle on mmap behavior on Linux and FreeBSD kernels across all architectures".

If you want to continue the discussion in a productive manner -- First thing first, show me the memory mmap debug message in ARM64 under memory bank addressing or direct addressing without your hack.

SegHaxx commented 3 years ago

You don't have merge permission? You can't read kernel source code? Why am I wasting my time arguing with you then?

Please merge.

SegHaxx commented 3 years ago

Wasn't expecting such weird pushback, making a new PR with this change isolated to a branch so i can move on to other fixes.

ianfixes commented 3 years ago

Discussion continuing in https://github.com/emaculation/macemu/pull/141

@rickyzhang82 are there any automated tests that highlight the problems introduced by this patch on the architectures you mentioned (PPC, ARM7, etc)? Or would you have to have that physical hardware to run the tests?