Open laoshaw opened 3 years ago
Unfortunately I got caught up with a few things today but I'll take a look tomorrow if I can grab our MIPS builder (shouldn't be a problem, CI is pretty quiet this close to release).
Also, this sounds suspiciously familiar. The GOARCH=mips GOMIPS=softfloat is reminding me of #39174, which was closed before I ended up actually looking into it. That makes me think there actually is a bug here and somehow the runtime thinks 32-bit mips has a much larger address space than it actually does.
I just quickly did some of the math for 32-bit mips by hand in case there's some crazy overflow happening here or something, since technically we treat 32-bit mips as having a 31-bit address space.
The math seems to work out though, at least for the page allocator. It comes out to around 4680 bytes (then rounded up to a physical page in size) for the summary structure, plus another 32 KiB for the page bitmap. It's still possible the page allocator is the problem; there could still be a bug, or it's possible I made a mistake. The best thing would be to just run this on the right hardware with some extra print statements to understand what's going on.
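That arithmetic can be sketched as follows. This is a back-of-the-envelope model, not the runtime's actual code: the 8 KiB pages grouped into 4 MiB chunks, 8-byte packed summaries, and the 8-way fanout between summary levels are assumptions mirroring the page allocator's general structure.

```go
package main

import "fmt"

func main() {
	const (
		heapAddrBits     = 31      // 32-bit mips is treated as a 31-bit address space
		pallocChunkBytes = 4 << 20 // pages are grouped into 4 MiB chunks
		summaryEntrySize = 8       // each summary packs into a uint64 (assumption)
		levelFanout      = 8       // assumed branching factor between summary levels
	)
	chunks := (1 << heapAddrBits) / pallocChunkBytes // 512 chunks cover the whole space

	// Sum the entries in each level of the radix tree, from leaves up to the root:
	// 512 + 64 + 8 + 1 entries.
	summaryBytes := 0
	for entries := chunks; entries >= 1; entries /= levelFanout {
		summaryBytes += entries * summaryEntrySize
	}
	// One 64-byte allocation bitmap per chunk (512 pages -> 512 bits).
	bitmapBytes := chunks * 64

	fmt.Println(summaryBytes, bitmapBytes) // 4680 32768 (i.e. 32 KiB)
}
```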
If you have some testing code with the prints embedded, I'm happy to run it and send you the output; I have the hardware nearby. Thanks for looking into this quickly.
Alrighty, so I dug into this and the large reservation appears to be WAI. The reservation is not from the changes in Go 1.14; this code is quite old. (And silly me, I actually knew we did this; I had to update the relevant code some time ago, but the 600 MiB total really threw me off.)
On 32-bit platforms there are two big up-front reservations. First, we attempt to make a large reservation up-front for heap bookkeeping data structures. Then we make a large reservation for the heap itself because we're concerned about fragmenting the address space (see https://cs.opensource.google/go/go/+/master:src/runtime/malloc.go;l=551;bpv=1).
Note that this reservation is strictly PROT_NONE, and pieces are mapped in as read-write gradually, only as needed. Linux overcommit should be ignoring the parts of the reservation that are never touched.
So that solves the puzzle of where the VSS is coming from. But the other question is why you're running into overcommit issues. I suspect that this may be due to some heap bookkeeping data structures which are not yet used (and thus don't appear in RSS) but are committed as read-write. For instance, there's a bitmap that accounts for about 3% overhead of the peak heap usage, and this is all memory that stays committed.
But that doesn't square with some of the values I'm seeing. On a real linux/mips
machine I printed out the sizes passed to every mmap call we make:
sysAlloc 262144
sysReserve 8192
sysMap 8192
sysReserve 135366656
sysReserve 541065216
sysMap 4194304
sysMap 266240
sysAlloc 65536
sysAlloc 65536
sysAlloc 65536
sysAlloc 262144
sysReserve lines are strictly PROT_NONE mappings, sysMap lines are applications of PROT_READ|PROT_WRITE to existing PROT_NONE mappings, and sysAlloc is a new mapping that's PROT_READ|PROT_WRITE.
The total amount of committed memory is thus 5189632 bytes, or just about 5 MiB. The physical page size on this machine is 8 KiB. If the page size is larger on your machine, these numbers will likely be slightly higher (round each of them up to that size).
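Summing just the committed calls from the trace above reproduces that figure (sysAlloc and sysMap count toward overcommit; the sysReserve lines are PROT_NONE and do not):

```go
package main

import "fmt"

func main() {
	// Read-write mappings from the mmap trace: four sysAlloc calls and
	// three sysMap calls (the two sysReserve lines are excluded).
	committed := []int{262144, 8192, 4194304, 266240, 65536, 65536, 65536, 262144}
	total := 0
	for _, n := range committed {
		total += n
	}
	fmt.Println(total) // 5189632 bytes, just about 5 MiB
}
```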
You mentioned in the golang-nuts thread running about 40 of these simultaneously, correct? By my calculations, the most you could run if you weren't using any other memory on the system would be 25 before overcommit fired. It's also not super surprising to me that overcommit fires first in your case because a lot of this memory (including that 4 MiB read-write mapping, which is the first heap arena) hasn't been touched. The application is otherwise sitting idle, so it's not necessarily using all the space allocated for the various data structures, and it may not have used the whole heap, either.
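A quick back-of-the-envelope check of the 25-process figure, assuming the board's 128 MiB of RAM and the ~5 MiB of committed memory measured above:

```go
package main

import "fmt"

func main() {
	const (
		ramBytes   = 128 << 20 // 128 MiB of RAM on the board (assumption from the report)
		perProcess = 5189632   // committed bytes per idle process, from the mmap trace
	)
	// Integer division: how many processes fit before commit charges exceed RAM.
	fmt.Println(ramBytes / perProcess) // 25
}
```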
So, that's my best analysis of the situation.
Regarding the large VSS usage, that's WAI and a problem for ulimit -v, but the code has worked like that for a very long time. We could talk about reducing the up-front reservations, but it won't help your immediate problem with overcommit.
Your immediate problem can likely only be remedied by having a smaller minimum heap size (the current minimum is 4 MiB) and mapping in the heap more incrementally. A smaller minimum heap size is blocked hard on #42430, which I'm actively investing time into resolving, but it will take time to complete.
Because addressing this better depends somewhat on #44167 (which should hopefully enable a smaller heap minimum), and I had to push that back, I need to push this back as well. My apologies.
Putting this into the backlog.
Though, the minimum heap size for 1.18 has been decreased to 512 KiB with #44167. The heap is still mapped in 4 MiB increments however, so I'm not sure this helps much with the overcommit issue.
Not only mips32; this also happened on my x86_64 device:
CentOS Linux release 7.4.1708 (Core)
3.10.0-693.21.1.el7.x86_64
Test code below:
package main

import (
	"io"
	"log"
	"net/http"
	_ "net/http/pprof"
)

func main() {
	// Minimal HTTP server used to reproduce the large VSS.
	helloHandler := func(w http.ResponseWriter, req *http.Request) {
		io.WriteString(w, "Hello, world!\n")
	}
	http.HandleFunc("/hello", helloHandler)
	log.Fatal(http.ListenAndServe(":9000", nil))
}
// and the memory occupied below:
VmPeak: 1057596 kB
VmSize: 994688 kB
VmLck: 0 kB
VmPin: 0 kB
VmHWM: 6180 kB
VmRSS: 5732 kB
VmData: 983296 kB
VmStk: 132 kB
VmExe: 2648 kB
VmLib: 2036 kB
VmPTE: 140 kB
VmSwap: 0 kB
Are there any methods to limit the VSS occupied? @mknyszek
ping...
Any solution?
@Gypsying @IAkumaI A large up-front virtual memory mapping is working as intended, as I mentioned in https://github.com/golang/go/issues/43699#issuecomment-761234927. What exactly is the problem you're running into? Are you using ulimit -v?
Hi All,
Is there any progress on this issue? I'm now hitting it on a 32-bit ARMv7 platform. Thanks
Please share more about your situation. Are you using ulimit -v? See https://github.com/golang/go/issues/43699#issuecomment-761234927. Basically, a large up-front mapping is working as intended to avoid fragmentation. Currently that address space is not mapped as read/write, only reserved. If your application makes other mappings, I could see that potentially being a problem, but 512 MiB (the size of the mapping made on 32-bit platforms) is only 12.5% of the address space (25% on mips, which has a 31-bit address space; it just occurs to me that this is where the original issue was coming from... perhaps on mips the size of the initial mapping should be reduced).
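The percentages work out as follows, assuming the full 512 MiB reservation made on 32-bit platforms:

```go
package main

import "fmt"

func main() {
	const reservation = 512 << 20 // up-front heap reservation on 32-bit platforms
	// Share of a full 32-bit address space vs mips's effective 31-bit space.
	fmt.Println(float64(reservation) / (1 << 32) * 100) // 12.5
	fmt.Println(float64(reservation) / (1 << 31) * 100) // 25
}
```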
Q: Why is a large up-front mapping an issue for you? Are you using ulimit -v?
A: In my application, I am trying to load several .so libraries which are programmed in Go. No, I am not using ulimit -v.
Q: What version of Go are you using?
A: 1.19.2
Q: What operating system are you running Go on?
A: Yocto Embedded Linux
Q: What is the precise failure you're encountering?
A: "runtime: out of memory: cannot allocate 4194304-byte block (7536640 in use)". I found each .so library consumes a lot of virtual memory and eventually reaches the limit of 32-bit addressing capability (4 GB).
Thanks for the info! Hm... are you able to share the full stack trace in that error? My best guess based on the implementation is that the runtime will actually try to map less heap space if it can't get what it wants, but I figure what could be happening is a non-heap mapping is failing. (Then again, the "4194304-byte block" part of this seems to indicate it is indeed trying to map the heap).
If it really is the address space reservation, one thing we could do is tell the runtime to map less if its built as a shared object. Which build mode are you using? Are you essentially building a .so with a C interface?
Hi mknyszek,
You may be right; the culprit could be the address space reservation. I tried to analyze /proc/[pid]/maps and got the virtual memory consumption below, vms_edge_host_1.txt. Furthermore, I tried to visualize the consumption with a python script (please refer to the attachment vms.7z), maps_1.csv.
From the visualization of virtual memory consumption, we can see there are two big reservations as below,
The build mode is c-shared. I am building the .so with a C interface. How could we tell the runtime to map less if it's built as a shared object? Thanks
What version of Go are you using (go version)?
Does this issue reproduce with the latest release?
Not sure
What operating system and processor architecture are you using (go env)?
go env Output
What did you do?
I built a helloworld net/http server using:
GOOS=linux GOARCH=mips GOMIPS=softfloat go build -ldflags="-s -w"
and ran it on a mips32 board; top shows its VSS is 700MB and RSS is 4MB. Running multiple of them in parallel very quickly causes 'out of memory' errors when I set overcommit_memory to 2. I have 128MB of RAM. The code I use is the net/http hello world shown above.
What did you expect to see?
I would expect a much smaller VSS on a 128MB 32-bit system; for comparison, my C/C++ http process only needs about 4MB of VSS and 1MB of RSS.
What did you see instead?
700MB VSS and easily 'out of memory' errors.