Open pkramme opened 4 years ago
@aclements
@pkramme Hi, would you mind posting your Apache setup for this too? I could not reproduce this on a fresh Apache install. Maybe it is due to my configuration.
@pkramme is this shared hosting environment cPanel by any chance?
We also started getting reports of this same panic with our Go application, which exposes itself as a .live.cgi FastCGI net/http/cgi server integrating with cPanel's LiveAPI, as soon as we upgraded to 1.14.
Going to downgrade to 1.13.9 for now.
@aleksator No, it is not, it is a custom build setup. @ivzhh I'm not able to share the config, as it is proprietary.
Theoretically, if we execute any code with 512MB memory limitation, the problem should become visible. I will try to produce something not based on fcgi as a reproducer, so that no apache2 setup is necessary.
Tagging a proper person here: @alexzorin
I think what @pkramme suggested about the 512MB memory limit is correct - specifically RLIMIT_AS.
"Back in the day" (EL5-ish era), shared web hosting admins did not have access to the RSS cgroups controller (because of EL5's ancient kernel), and so controlling VSZ limits was the only choice available to them. In the long term, this has resulted in a lot of misguided admins keeping these VSZ limits around for no good reason.
Anyway, the Apache-based reproducer is straightforward. (For some reason, a simple Go hello world wrapped in a bash ulimit -v didn't repro for me, not sure why).
Build a net/http/cgi binary using Go 1.14.1 and stick it in Apache httpd 2.4's cgi-bin/:
package main

import (
	"net/http/cgi"
)

func main() {
	if err := cgi.Serve(nil); err != nil {
		panic(err)
	}
}
go build -o /var/www/html/cgi-bin/reproducer.cgi reproducer.go
Configure Apache with a 512MB RLimitMEM and restart Apache (note, don't try this in Docker or LXC-like environments, setrlimit will just fail and the repro won't work):
RLimitMEM 536870912
apachectl -k restart
Access http://localhost/cgi-bin/reproducer.cgi. It will produce an HTTP 500, and in Apache's error_log, you will see the panic stack from the original report.
I would prefer not to ask our customers to remove the rlimit (or else we'll be stuck shipping with Go 1.13 for all eternity).
Would it be practical for the Go runtime to try to work within whatever it sees by getrlimit?
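For anyone wanting to check what limit a process actually runs under, here is a minimal Linux-only sketch that reads RLIMIT_AS with syscall.Getrlimit. The helper names (addressSpaceLimit, describeLimit, unlimited) are made up for illustration, and this is not something the Go runtime does itself.

```go
package main

import (
	"fmt"
	"syscall"
)

// unlimited is how Linux reports "no limit" (RLIM_INFINITY) in a uint64 field.
// Hypothetical name, defined here to avoid relying on the sign of the
// syscall package's own constant.
const unlimited = ^uint64(0)

// describeLimit renders a soft limit in MiB, or "unlimited".
func describeLimit(cur uint64) string {
	if cur == unlimited {
		return "unlimited"
	}
	return fmt.Sprintf("%d MiB", cur/(1<<20))
}

// addressSpaceLimit returns the soft RLIMIT_AS of the current process.
func addressSpaceLimit() (uint64, error) {
	var lim syscall.Rlimit
	if err := syscall.Getrlimit(syscall.RLIMIT_AS, &lim); err != nil {
		return 0, err
	}
	return lim.Cur, nil
}

func main() {
	cur, err := addressSpaceLimit()
	if err != nil {
		fmt.Println("getrlimit failed:", err)
		return
	}
	fmt.Println("RLIMIT_AS soft limit:", describeLimit(cur))
}
```

Note that this only reports anything if the binary is able to start under the limit in the first place.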
Adding that I'm also seeing this issue in a different memory-limited environment with a 512mb limit (a Grid Engine setup). Raising the memory limit to 950mb fixes the issue, but it's unclear to me why it should ever be an issue anyway - the program does not use that much memory during running.
apologies for the "me too" post, but this has also prevented migrating a little Go-based "script" of one of my colleagues at CERN from Go-1.13.x to the latest Go-1.14.x.
Well, after reinvestigating this issue I stumbled over the proposal for the new page allocator which was introduced in go1.14: https://github.com/golang/proposal/blob/master/design/35112-scaling-the-page-allocator.md
There are only two known adverse effects of this large mapping on Linux:
- ulimit -v, which restricts even PROT_NONE mappings.
- Programs like top, when they report virtual memory footprint, include PROT_NONE mappings.
In the grand scheme of things, these are relatively minor consequences. The former is not used often, and in cases where it is, it's used as an inaccurate proxy for limiting a process's physical memory use. The latter is mostly cosmetic, though perhaps some monitoring system uses it as a proxy for memory use, and will likely result in some harmless questions.
So, this explains it. @aclements Is there a workaround for cases like this?
cc @mknyszek
As @pkramme points out, we were aware of this issue when the changes to the page allocator were proposed. As @alexzorin points out, ulimit -v is an outdated mechanism for limiting memory use.
I would prefer not to ask our customers to remove the rlimit (or else we'll be stuck shipping with Go 1.13 for all eternity).
Would it be practical for the Go runtime to try to work within whatever it sees by getrlimit?
The short answer is no. The virtual memory mappings made to support structures in the page allocator significantly simplified the improvements made in the 1.14 release. Earlier on in the release cycle the amount of memory mapped was much larger, which caused problems on certain platforms where the default ulimit -v value for default users was fairly low, so out-of-the-box Go programs would not work on an out-of-the-box system without having additional privileges (see #35568). This is generally not true on Linux, where ulimit -v is unlimited by default (at least for the versions I'm aware of). We took steps to reduce the size of these mappings at the cost of additional complexity and a small performance regression. We experimented a little with additional mitigations but concluded they weren't practical.
@sbinet @mashedkeyboard @alexzorin @pkramme:
In order to understand your situations better, could you elaborate on the reasons why you and/or your customers cannot set RLIMIT_AS/ulimit -v to unlimited, or an otherwise sufficiently high number for your Go programs?
As a side note (and to be totally clear, I'm not recommending this as an official workaround), compiling your code with GOARCH=386 should allow your code to run on amd64 platforms with a low ulimit -v, since the memory mapping we make is proportional to the size of the address space and the address space is much smaller on 386. I recognize that this has its issues, and is not generally a feasible alternative. The most notable issues that come to mind are that your code might run slower (due to 32-bit registers and a lack of certain intrinsics) or some libraries your code depends on might not support 32-bit platforms (I'm not sure how common it is for libraries to support amd64 but not 386, but it is possible).
could you elaborate on the reasons why your and/or your customers cannot set RLIMIT_AS/ulimit -v to unlimited, or an otherwise sufficiently high number for your Go programs?
This is our plan. It's going to be a challenge for XX,000 hosts between X,000 customers, so we are first planning to add telemetry to our 1.13 builds to see how many systems run the CGI under restricted virtual memory.
Sure!
The Go software I am writing is supposed to run on a shared hosting server. The technical foundation is a LAMP (linux apache2 mysql php/python/...) stack. Inside a shared hosting LAMP stack, the Apache2 webserver is spawning CGI/FastCGI software inside a restricted environment, which is heavily controlled in access and resources by the provider, in order to prevent one user taking up all the resources. On my shared hosting account one FastCGI process is limited to 512MB "memory".
The important part is that in managed hosting, you simply cannot make that change, because you do not control the environment. The only possibility for me is to upgrade to another, more expensive hosting plan, so that I can use 1GB memory or more so that this allocation works.
We don't have much leverage over how the CGI environment is configured, and CERN-IT is a bit conservative w/ changing the configuration of services they provide for their physicists (who are sometimes a bit "cavalier" with how they set up their things.)
nonetheless, I've sent a ticket on raising the RLIMIT_AS.
I've also passed on to my colleague the 32b workaround.
we'll see.
(anyways, it's not a high profile CGI service, we won't miss supersymmetry nor mini-blackholes, or lose the beam if we're stuck w/ Go-1.13.x b/c of that. :P)
@mknyszek Is there any progress on this on your end?
we are first planning to add telemetry to our 1.13 builds to see how many systems run the CGI under restricted virtual memory.
To put a conclusion on this from my end, we gathered some RLIMIT_AS stats and the number of affected users is around 0.5%. The majority have a limit of 4096MB set on Apache, which is the vendor default on this platform.
As long as Go continues to work within that limit, we're happy to live with it and will ask those other users to adapt. Thanks.
any progress on it?
Can't believe this remains unresolved. Are we supposed to downgrade golang to 1.13?
any progress on it?
I don't know that anybody knows of a feasible way to fix this. The Go memory system expects that address space is available. Address space costs nothing. It makes sense to constrain a program's use of actual memory. It does not make sense to constrain a program's use of address space.
I don't know that anybody knows of a feasible way to fix this. The Go memory system expects that address space is available. Address space costs nothing. It makes sense to constrain a program's use of actual memory. It does not make sense to constrain a program's use of address space.
I am not really an expert in memory use / allocation. I am not even sure what you mean by address space. If it is only testing and not using. ;) The fact is I wasted quite a bit of time before noticing it was related to this memory limit imposed somewhere by the parent application. This also makes me wonder about container environments applying resource limits: are these affected by this as well?
I am not really an expert in memory use / allocation. I am not even sure what you mean by address space. If it is only testing and not using. ;)
See https://go.dev/doc/gc-guide#A_note_about_virtual_memory.
This makes me also wonder about container environments applying resource limits, are these affected as well by this?
They are not. Container limits pertain to actual physical memory usage (RSS in top) not virtual memory footprint (VSS in top).
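To see the two numbers side by side for a running Go process, a sketch along these lines reads VmSize (virtual size) and VmRSS (resident set) from /proc/self/status on Linux. The memField helper is a made-up name for illustration, and os.ReadFile assumes Go 1.16 or newer.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// memField extracts a field such as "VmSize" or "VmRSS" (reported in kB)
// from text in the format of /proc/<pid>/status.
func memField(status, name string) (kb uint64, ok bool) {
	sc := bufio.NewScanner(strings.NewReader(status))
	for sc.Scan() {
		// Lines look like "VmRSS:\t    1234 kB".
		fields := strings.Fields(sc.Text())
		if len(fields) < 2 || fields[0] != name+":" {
			continue
		}
		v, err := strconv.ParseUint(fields[1], 10, 64)
		if err != nil {
			return 0, false
		}
		return v, true
	}
	return 0, false
}

func main() {
	data, err := os.ReadFile("/proc/self/status")
	if err != nil {
		fmt.Println("cannot read /proc/self/status:", err)
		return
	}
	for _, name := range []string{"VmSize", "VmRSS"} {
		if kb, ok := memField(string(data), name); ok {
			fmt.Printf("%s: %d kB\n", name, kb)
		}
	}
}
```

On Go 1.14 and later, VmSize would be expected to be much larger than VmRSS for a small program, which is the gap that RLIMIT_AS trips over and that RSS-based container limits ignore.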
What version of Go are you using (go version)?
Does this issue reproduce with the latest release?
Yes
What operating system and processor architecture are you using (go env)?
go env Output
What did you do?
I am trying to get a FastCGI server running behind an Apache2 webserver on a shared hosting system using the net/http/fcgi library. The webserver is limiting my software to 512MB memory.
This is the code: https://play.golang.org/p/Z-Gc6icOpw5
What did you expect to see?
I expect to see "This was generated by Go running as a FastCGI app" on the website generated by the FastCGI server.
What did you see instead?
I have modified the sysReserve() function in the runtime to include println() to print out the error code from mmap() and the requested memory size. This is a diff of src/runtime/mem_linux.go and my version:
I kept the println() output in the output below in the hope that it might be useful.
The application crashes with this trace:
The application works fine with golang 1.13.9.
I have no idea how to debug this further.