s117 / anycore-riscv

The AnyCore toolset targeting the RISC-V ISA

403.gcc_ref failed with "Cannot allocate 131072 bytes" #9

Closed s117 closed 4 years ago

s117 commented 4 years ago

SPEC2006 403.gcc_ref fails with the error message "Cannot allocate 131072 bytes".

I gave the simulator 16GB of physical memory, so this should be a memory allocator issue rather than a real out-of-memory condition.

s117 commented 4 years ago

The core trace shows that, before this error happens, glibc served a malloc request with mmap().

For an mmap() request without the MAP_FIXED flag, PK serves it from the first unmapped contiguous VM region after the current "brk".

After that, if the userspace program tries to expand its heap with brk() and the new "brk" would fall inside the region previously mapped by that mmap() call, PK will not serve the request [1] [2]. That is exactly why this error happens.

Here are three pictures to help demonstrate the concept:

photo_2020-08-27_02-31-56

photo_2020-08-27_02-31-51

photo_2020-08-27_02-31-13
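
The conflict described above can be sketched with a toy model in C. This is only an illustration of the behavior reported in this comment, not PK's actual code: the `toy_vm` struct, `toy_mmap`, `toy_brk`, and the 4 KiB page size are all assumptions made for the example, and the model tracks a single mapped region only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE 4096u  /* assumed page size for this toy model */

/* Hypothetical toy address space: a brk-managed heap plus at most one
   mmap'd region. Not PK's real data structure. */
typedef struct {
    uintptr_t brk;        /* current program break */
    uintptr_t map_start;  /* start of the mmap'd region, 0 if none */
    uintptr_t map_end;    /* one past the end of the mmap'd region */
} toy_vm;

static uintptr_t page_round_up(uintptr_t a) {
    return (a + PAGE - 1) & ~(uintptr_t)(PAGE - 1);
}

/* mmap() without MAP_FIXED: take the first free pages at or after the
   current break, as the comment describes. */
uintptr_t toy_mmap(toy_vm *vm, size_t len) {
    assert(len > 0);
    vm->map_start = page_round_up(vm->brk);
    vm->map_end = vm->map_start + page_round_up(len);
    return vm->map_start;
}

/* brk(): refuse to move the break into the mmap'd region. */
bool toy_brk(toy_vm *vm, uintptr_t new_brk) {
    if (vm->map_start != 0 && new_brk > vm->map_start)
        return false;  /* heap growth is blocked by the mapping */
    vm->brk = new_brk;
    return true;
}
```

In this model, once `toy_mmap` has placed a region directly after the break, every later `toy_brk` call that tries to grow the heap fails, even though plenty of memory is still free elsewhere.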

s117 commented 4 years ago

One solution (https://github.com/s117/riscv-glibc/commit/c484da610d8d97ea8b8e14a9e35838bd68b7b5b8) to this problem is to disable the use of mmap() in glibc's memory allocator, as documented at https://www.gnu.org/software/libc/manual/html_node/Malloc-Tunable-Parameters.html

There are good reasons for serving large memory allocations with mmap(). AFAIK, disabling it on a real OS has at least these drawbacks:

  1. Less efficient resource sharing (unused memory cannot be returned to the OS immediately).

  2. Increased memory fragmentation (free VM space can end up scattered across the brk()-maintained heap).

But here, since PK is a single-program execution environment, we don't really care about the first drawback. And we can alleviate the second by increasing the total available physical memory with spike -m<larger mem>. (Yes, I know, virtual address space is less of a problem on a 64-bit machine, but don't forget that PK's VM logic can only map a virtual page directly to the physical page with the same number, i.e. PPN=VPN, so with PK you don't have the entire virtual address space available.)
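
For reference, the same effect can be requested at run time through glibc's documented `mallopt()` interface. The sketch below is one way to do it from application code and is not necessarily what the linked commit does; `disable_malloc_mmap` and `large_alloc_works` are hypothetical helper names introduced for this example.

```c
#include <malloc.h>  /* glibc-specific: mallopt(), M_MMAP_MAX */
#include <stddef.h>
#include <stdlib.h>

/* Ask glibc not to serve malloc() requests with mmap().
   mallopt() returns 1 on success and 0 on error. */
int disable_malloc_mmap(void) {
    return mallopt(M_MMAP_MAX, 0);
}

/* Hypothetical helper: allocate and free one block. A size above the
   default M_MMAP_THRESHOLD (128 KiB) would normally be a candidate
   for mmap(). Returns 1 if the allocation succeeded, 0 otherwise. */
int large_alloc_works(size_t nbytes) {
    void *p = malloc(nbytes);
    if (p == NULL)
        return 0;
    free(p);
    return 1;
}
```

Calling `disable_malloc_mmap()` early in `main()`, before the first large allocation, is the intended usage. This only helps for programs whose source can be changed, which is presumably why patching glibc itself is the more practical route for a benchmark suite.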

s117 commented 4 years ago

Interestingly, disabling the use of mmap() in glibc's memory allocator also fixed the errors in 456.hmmer_ref (Misaligned store @ 000000000001b8f0) and 450.soplex (User load segfault @ 0x0000000000736010 PC=000000000005e25c).

s117 commented 4 years ago

A note on the parameter M_MMAP_MAX

https://github.com/s117/riscv-glibc/blob/06983fe52cfe8e4779035c27e8cc5d2caab31531/malloc/malloc.c

Definition: L597, L977, L1720

Assignment: L1777, L5066, L5179

Use: L2298

s117 commented 4 years ago

Commit d0401da incorporated this patch.