Closed: raxod502 closed this issue 3 years ago
Hey, I found out that I can install hhvm-dbg
from dl.hhvm.com to get debugging symbols. Updated the stack trace above.
After installing the latest version of HHVM (4.92), it gives a segmentation fault (core dumped). Please fix this urgently. The code base is in the Hack language. OS: Ubuntu 18.04
warning: Error disabling address space randomization: Operation not permitted
... I don't suppose HHVM tries to disable address space randomization or do some other such low-level thing which CircleCI disallows at the kernel level? That would explain why it's working locally, but not on CI.
(Yes, I know that it's GDB here which is trying to disable ASLR, not HHVM. I was just inspired to suggest a possible cause based on this message.)
@raxod502 Any suggestions for finding a temporary solution to this issue? Posting a gdb run dump below for the same.
root@ip-172-31-26-133:/home/ubuntu# hhvm --modules
Segmentation fault (core dumped)
root@ip-172-31-26-133:/home/ubuntu# gdb hhvm
GNU gdb (Ubuntu 8.1.1-0ubuntu1) 8.1.1
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from hhvm...(no debugging symbols found)...done.
(gdb) run
Starting program: /usr/bin/hhvm
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Program received signal SIGSEGV, Segmentation fault.
0x00005555562fa93d in ?? ()
(gdb) r -m server -p 8080
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /usr/bin/hhvm -m server -p 8080
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Program received signal SIGSEGV, Segmentation fault.
0x00005555562fa93d in ?? ()
Trying to figure out whether this issue is identical to the one I am having when running Hack code on HHVM 4.87 or above in Jenkins or Travis: also a segfault when invoking any script with hhvm. Is 4.86 unaffected by this issue?
@raxod502 Any suggestions for finding a temporary solution to this issue?
Unfortunately, I don't know of any yet. One idea for collecting more debugging information would be to `strace hhvm` and see what system calls are invoked around the time of the segfault. I haven't the faintest idea what the problem could be related to, however.
@raxod502 AWS machine:
- Kernel: 5.4.0-1029-aws ~18.04.1-Ubuntu SMP Tue Oct 20 11:09:25 UTC 2020 x86_64 GNU/Linux
- System: HVM domU
- Processor: Intel(R) Xeon(R) CPU E5-2676 v3 @ 2.40GHz
- Memory: 1GiB System Memory (1GiB DIMM RAM)
`strace` of HipHop VM 4.92.0 (rel): segmentation fault log snippet
mmap(NULL, 8589934592, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0xffffffffffffffff} ---
+++ killed by SIGSEGV (core dumped) +++
Segmentation fault (core dumped)
But ideally it should proceed like the following trace from a working machine:
mmap(NULL, 10737418240, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = 0x7f62fa300000
mmap(NULL, 8388608, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = 0x7f62f9b00000
munmap(0x7f62f9b00000, 8388608) = 0
mmap(NULL, 10481664, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = 0x7f62f9901000
munmap(0x7f62f9901000, 1044480) = 0
munmap(0x7f62fa200000, 1048576) = 0
uname({sysname="Linux", nodename="ip-172-31-44-90", ...}) = 0
readlink("/proc/self/exe", "/opt/hhvm/4.84.0/bin/hhvm", 4096) = 25
access("/opt/hhvm/4.84.0/bin/hh_single_compile", X_OK) = 0
readlink("/proc/self/exe", "/opt/hhvm/4.84.0/bin/hhvm", 4096) = 25
openat(AT_FDCWD, "/opt/hhvm/4.84.0/bin/hhvm", O_RDONLY) = 5
fstat(5, {st_mode=S_IFREG|0755, st_size=69657392, ...}) = 0
mmap(NULL, 69657392, PROT_READ, MAP_SHARED, 5, 0) = 0x7f62f5791000
Huh, that looks like a failure in initial memory allocation before even jumping into HHVM code. Perhaps the binary is compiled with ASLR disabled in the ELF metadata (`readelf -a`) and thus cannot be run in an ASLR-enforced environment? Or something like that. It says ENOMEM, but who knows; that could also be because it's trying to allocate memory in a region that's disallowed because the OS forces ASLR and puts the process heap in a different place than the executable expects. Something something RLIMIT_DATA, according to `man 2 mmap`.
On macOS at least, limiting the memory with the `-m` option to `docker run` doesn't reproduce the issue, so it's more likely dockerd settings/version than RAM.
Fails with a segfault on a t2.micro in us-west-2 on both Amazon Linux 2018.03 (AMI amzn-ami-hvm-2018.03.0.20190611-x86_64-ebs) and Ubuntu 20.10 (AMI ami-0b227db5ccaf77e94).
Both are using Docker 19.03.13 build 4484c46
The issue does not reproduce on a t2.2xlarge running the same Ubuntu 20.10 AMI, even with `--memory 128m --memory-swap 0`.
With a lower limit than that, it tends to OOM rather than SIGSEGV. This is fun.
# docker run --rm hhvm/hhvm:4.87-latest hhvm /dev/null; echo $?
0
# docker run --rm hhvm/hhvm:4.88-latest hhvm /dev/null; echo $?
139
# docker run --rm hhvm/hhvm:2020.12.12 hhvm /dev/null; echo $?
0
# docker run --rm hhvm/hhvm:2020.12.13 hhvm /dev/null; echo $?
139
% git log --oneline nightly-2020.12.12..nightly-2020.12.13
4347433a64 (tag: nightly-2020.12.13) Disable more watchman tests under retranslate-all
66559eb716 Elaborate contexts on a method just like fun/lambda
8fe2012012 Fix is/as/reified generics test folder names
3206afb939 Kill mt_rand hhbbc optimization
f90e2edc89 Use mmap to reserve the arrprov slab
f90e2edc89 (D25515629) both sounds relevant and matches the strace.
It also segfaults outside of docker on the same machine
Edit: ignore this, fails even on known good versions
On a machine with lots of RAM:
# (ulimit -v $((1024 * 1024 * 8)); hhvm /dev/null; echo $?)
139
# (ulimit -v $((1024 * 1024 * 9)); hhvm /dev/null; echo $?)
0
The `always_assert()` doesn't trigger because `mmap()` is returning `MAP_FAILED`, which isn't `nullptr`.
FB T83478260
It looks like we should be able to get a hotfixable patch fairly quickly.
In the meantime, if you have control over the environment, use machines with more than 8GB of RAM (it's fine to limit it to a smaller amount, though).
Also reproable with hhvm/user-documentation:HHVM-4.93-2021-01-22-0cb6f6a on EC2, but not on my mac.
Currently problems appear to happen on:
- CircleCI
- TravisCI
- AWS ElasticBeanstalk-managed t2.micro instances (@vikash-itspe, what type is yours?)
Does not happen on:
- MacOS with plenty of RAM
- ???
My next steps:
- test on a non-EB EC2 t2.micro instance
- test on an EC2 instance with an obscenely large amount of RAM but otherwise identical
So I tested on AWS t2.micro and t2.small instances (1 GiB and 2 GiB of RAM respectively); both produce a segmentation fault with a vanilla installation of HHVM 4.92 on Ubuntu 18.04 server x86_64.
> It looks like we should be able to get a hotfixable patch fairly quickly. In the meantime, if you have control over the environment, use machines with more than 8GB of RAM (it's fine to limit it to a smaller amount, though).
Sure, I will wait for the fixed repository releases for Ubuntu machines.
@fredemmott Thank you so much for your quick response and exploration.
For reference, here is the trace to the `mmap()` rather than to the segfault:
Breakpoint 1, HPHP::arrprov::(anonymous namespace)::getRawTagStorageArray ()
at ./hphp/runtime/base/array-provenance.cpp:85
85 ./hphp/runtime/base/array-provenance.cpp: No such file or directory.
(gdb) bt
#0 HPHP::arrprov::(anonymous namespace)::getRawTagStorageArray () at ./hphp/runtime/base/array-provenance.cpp:85
#1 0x00005555562f756c in HPHP::arrprov::(anonymous namespace)::getTagID (tag={...})
at ./hphp/runtime/base/array-provenance.cpp:108
#2 0x00005555562f7649 in HPHP::arrprov::Tag::Tag (this=0x7fffffffd5b0,
kind=HPHP::arrprov::Tag::Kind::RuntimeLocation, name=<optimized out>, line=<optimized out>)
at /usr/include/c++/9/bits/move.h:74
#3 0x00005555564b0655 in HPHP::arrprov::Tag::RuntimeLocation (filename=<optimized out>)
at ./hphp/runtime/base/array-provenance.h:90
#4 HPHP::RuntimeOption::<lambda()>::operator() (__closure=<optimized out>)
at ./hphp/runtime/base/runtime-option.cpp:1407
#5 HPHP::RuntimeOption::Load (ini=..., config=..., iniClis=..., hdfClis=..., messages=<optimized out>,
messages@entry=0x7fffffffd970, cmd=...) at ./hphp/runtime/base/runtime-option.cpp:1407
#6 0x0000555556462367 in HPHP::execute_program_impl (argc=<optimized out>, argv=<optimized out>)
at /usr/include/c++/9/bits/basic_string.h:936
#7 0x0000555556464bf5 in HPHP::execute_program (argc=1, argv=0x7fffffffe6f8)
at ./hphp/runtime/base/program-functions.cpp:1288
#8 0x0000555556017cb5 in main (argc=1, argv=0x7fffffffe6f8) at ./hphp/hhvm/main.cpp:101
(gdb)
We're likely to land a fix for 4.94 which isn't suitable for backporting.
For prior versions, I'll probably apply this on Monday; it is fine unless you've explicitly enabled some options related to tracking where PHP arrays (as opposed to vecs/dicts) are produced.
diff --git a/hphp/runtime/base/array-provenance.cpp b/hphp/runtime/base/array-provenance.cpp
index 46d3803dfd..ace2e395e2 100644
--- a/hphp/runtime/base/array-provenance.cpp
+++ b/hphp/runtime/base/array-provenance.cpp
@@ -57,7 +57,17 @@ using TagStorage = std::pair<LowPtr<const StringData>, int32_t>;
static constexpr TagID kKindBits = 3;
static constexpr TagID kKindMask = 0x7;
-static constexpr size_t kMaxTagID = (1 << (8 * sizeof(TagID) - kKindBits)) - 1;
+static constexpr size_t kMaxTagID = ((1 << (8 * sizeof(TagID) - kKindBits)) - 1)
+ // Arbitrary reduction to reduce the required memory from 8GB to 8MB to work
+ // around https://github.com/facebook/hhvm/issues/8796
+ //
+ // This limits us to 524288 arrays - but if array provenance is disabled, we
+ // only need tags for arrays created when reading configs, which will be much
+ // less than that.
+ //
+ // This is a tradeoff that means that even if you have enough RAM,
+ // ArrayProvenance can not be safely enabled for large projects.
+ / 1024;
struct TagHashCompare {
bool equal(TagStorage a, TagStorage b) const {
This should be fixed in 4.94.0; I'm building .1 versions of the other supported/affected releases, but need more testing to be sure this is resolved.
Fixed in the .1 releases, 4.88.1 through 4.93.1.
Describe the bug
In some environments, HHVM segfaults on startup.
Standalone code, or other way to reproduce the problem
Unfortunately, this problem does not occur for me when running (in Docker) locally; it only occurs (in the same Docker image) on CircleCI, which may point to something kernel-related.
Expected behavior
Actual behavior
Environment
$ hh_client --version
hackc-c806e3b8fcef4fbf7135a864e4dd14b628934684-4.92.0
Linux runtime 5.8.0-7630-generic #32~1607010078~20.04~383a644-Ubuntu SMP Thu Dec 3 19:14:47 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
Linux runtime 5.4.0-1021-gcp #21-Ubuntu SMP Fri Jul 10 06:53:47 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux