facebook / hhvm

A virtual machine for executing programs written in Hack.
https://hhvm.com

HHVM segmentation fault in PDORequestData::requestShutdown() #5016

Closed edwh closed 9 years ago

edwh commented 9 years ago

This is the first HHVM issue I've reported; apologies if I don't follow the right etiquette.

I'm seeing a crash on one specific newly installed machine, with code that works elsewhere. I'm a bit baffled as to why.

Here's the best stack trace I can get written to file.

Host: grace.careicon.com
ProcessID: 11773
ThreadID: 7fd752bff700
ThreadPID: 11781
Name: unknown program
Type: Segmentation fault
Runtime: hhvm
Version: tags/HHVM-3.6.0-0-g6ef13f20da20993dc8bab9eb103f73568618d3e8
DebuggerCount: 0

ThreadType: Web Request
Server_SERVER_NAME: www.careicon.com
Server: dev.careicon.com
URL: /api/session_get.php

#0  HPHP::PDORequestData::requestShutdown() at /usr/bin/hhvm:0
#1  HPHP::ExecutionContext::onRequestShutdown() at /usr/bin/hhvm:0
#2  HPHP::hphp_context_shutdown() at /usr/bin/hhvm:0
#3  HPHP::HttpRequestHandler::executePHPRequest(HPHP::Transport*, HPHP::RequestURI&, HPHP::SourceRootInfo&, bool) at /usr/bin/hhvm:0
#4  HPHP::HttpRequestHandler::handleRequest(HPHP::Transport*) at /usr/bin/hhvm:0
#5  HPHP::ServerWorker<std::shared_ptr<HPHP::FastCGIJob>, HPHP::FastCGITransportTraits>::doJobImpl(std::shared_ptr<HPHP::FastCGIJob>, bool) at /usr/bin/hhvm:0
#6  HPHP::ServerWorker<std::shared_ptr<HPHP::FastCGIJob>, HPHP::FastCGITransportTraits>::doJob(std::shared_ptr<HPHP::FastCGIJob>) at /usr/bin/hhvm:0
#7  HPHP::JobQueueWorker<std::shared_ptr<HPHP::FastCGIJob>, HPHP::Server*, true, false, HPHP::JobQueueDropVMStack>::start() at /usr/bin/hhvm:0
#8  HPHP::AsyncFuncImpl::ThreadFunc(void*) at /usr/bin/hhvm:0
#9  HPHP::start_routine_wrapper(void*) at /usr/bin/hhvm:0
#10 start_thread at /build/buildd/eglibc-2.19/nptl/pthread_create.c:312
#11 clone at /build/buildd/eglibc-2.19/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:113

If I attach to the dbg HHVM with gdb, I get this at the point of the crash:

#0  0x0000000000000000 in ?? ()
#1  0x00000000021f05e3 in HPHP::PDORequestData::requestShutdown (this=0x7f9aa45fa980) at /tmp/tmp.1gvuVgdxVk/hphp/runtime/ext/pdo/ext_pdo.cpp:942
#2  0x0000000001781e56 in HPHP::ExecutionContext::onRequestShutdown (this=this@entry=0x7f9aa1006030) at /tmp/tmp.1gvuVgdxVk/hphp/runtime/base/execution-context.cpp:551
#3  0x00000000015f51bb in HPHP::hphp_context_shutdown () at /tmp/tmp.1gvuVgdxVk/hphp/runtime/base/program-functions.cpp:1955
#4  0x0000000001838e76 in HPHP::HttpRequestHandler::executePHPRequest (this=this@entry=0x7f9aa48d66a0, transport=transport@entry=0x7f9aa3c92718, reqURI=..., sourceRootInfo=...,
    cacheableDynamicContent=<optimised out>) at /tmp/tmp.1gvuVgdxVk/hphp/runtime/server/http-request-handler.cpp:538
#5  0x000000000183a770 in HPHP::HttpRequestHandler::handleRequest (this=0x7f9aa48d66a0, transport=0x7f9aa3c92718)
    at /tmp/tmp.1gvuVgdxVk/hphp/runtime/server/http-request-handler.cpp:382
#6  0x000000000184506c in HPHP::ServerWorker<std::shared_ptr<HPHP::FastCGIJob>, HPHP::FastCGITransportTraits>::doJobImpl (this=0x7f9aa3effd40, job=..., abort=abort@entry=false)
    at /tmp/tmp.1gvuVgdxVk/hphp/runtime/server/server-worker.h:107
#7  0x00000000018453b9 in HPHP::ServerWorker<std::shared_ptr<HPHP::FastCGIJob>, HPHP::FastCGITransportTraits>::doJob (this=<optimised out>, job=...)
    at /tmp/tmp.1gvuVgdxVk/hphp/runtime/server/server-worker.h:57
#8  0x0000000001843aa4 in HPHP::JobQueueWorker<std::shared_ptr<HPHP::FastCGIJob>, HPHP::Server*, true, false, HPHP::JobQueueDropVMStack>::start (this=0x7f9aa3effd40)
    at /tmp/tmp.1gvuVgdxVk/hphp/util/job-queue.h:463
#9  0x0000000001841196 in HPHP::AsyncFunc<HPHP::ServerWorker<std::shared_ptr<HPHP::FastCGIJob>, HPHP::FastCGITransportTraits> >::run_ (obj=<optimised out>)
    at /tmp/tmp.1gvuVgdxVk/hphp/util/async-func.h:213
#10 0x000000000256db8b in HPHP::AsyncFuncImpl::threadFuncImpl (this=this@entry=0x7f9aa3c0a400) at /tmp/tmp.1gvuVgdxVk/hphp/util/async-func.cpp:131
#11 0x000000000256ddd7 in HPHP::AsyncFuncImpl::ThreadFunc (obj=0x7f9aa3c0a400) at /tmp/tmp.1gvuVgdxVk/hphp/util/async-func.cpp:51
#12 0x0000000001731432 in HPHP::start_routine_wrapper (arg=0x7f9aa3cfb1e0) at /tmp/tmp.1gvuVgdxVk/hphp/runtime/base/thread-hooks.cpp:93
#13 0x00007f9ab4f6d182 in start_thread (arg=0x7f9aa1bff700) at pthread_create.c:312
#14 0x00007f9ab447a47d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

I've tried the nightly build but I still see the same issue.

Suggestions very welcome.

mirzap commented 9 years ago

I can confirm this, I have the same issue:

ProcessID: 9398
ThreadID: 7f858e3ff700
ThreadPID: 9405
Name: unknown program
Type: Segmentation fault
Runtime: hhvm
Version: tags/HHVM-3.6.0-0-g6ef13f20da20993dc8bab9eb103f73568618d3e8
DebuggerCount: 0

Server: www.web.dev
ThreadType: Web Request
Server_SERVER_NAME: www.web.dev
URL: /

# 0  HPHP::PDORequestData::requestShutdown() at /usr/bin/hhvm:0
# 1  HPHP::ExecutionContext::onRequestShutdown() at /usr/bin/hhvm:0
# 2  HPHP::hphp_context_shutdown() at /usr/bin/hhvm:0
# 3  HPHP::HttpRequestHandler::executePHPRequest(HPHP::Transport*, HPHP::RequestURI&, HPHP::SourceRootInfo&, bool) at /usr/bin/hhvm:0
# 4  HPHP::HttpRequestHandler::handleRequest(HPHP::Transport*) at /usr/bin/hhvm:0
# 5  HPHP::ServerWorker<std::shared_ptr<HPHP::FastCGIJob>, HPHP::FastCGITransportTraits>::doJobImpl(std::shared_ptr<HPHP::FastCGIJob>, bool) at /usr/bin/hhvm:0
# 6  HPHP::ServerWorker<std::shared_ptr<HPHP::FastCGIJob>, HPHP::FastCGITransportTraits>::doJob(std::shared_ptr<HPHP::FastCGIJob>) at /usr/bin/hhvm:0
# 7  HPHP::JobQueueWorker<std::shared_ptr<HPHP::FastCGIJob>, HPHP::Server*, true, false, HPHP::JobQueueDropVMStack>::start() at /usr/bin/hhvm:0
# 8  HPHP::AsyncFuncImpl::ThreadFunc(void*) at /usr/bin/hhvm:0
# 9  HPHP::start_routine_wrapper(void*) at /usr/bin/hhvm:0
# 10 start_thread at /build/buildd/eglibc-2.19/nptl/pthread_create.c:312
# 11 clone at /build/buildd/eglibc-2.19/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Any help solving this is appreciated.

kilahm commented 9 years ago

I can also confirm this. My core dump looks the same as those above.

Host: 3fadfce8fc0a
ProcessID: 1
ThreadID: 7f21f9fff700
ThreadPID: 14
Name: unknown program
Type: Segmentation fault
Runtime: hhvm
Version: tags/HHVM-3.6.0-0-g6ef13f20da20993dc8bab9eb103f73568618d3e8
DebuggerCount: 0

ThreadType: Web Request
Server_SERVER_NAME: api.dyne.local
Server: api.dyne.local
URL: /v2/auth

# 0  ?? at hhvm:0
# 1  HPHP::PDORequestData::requestShutdown() at hhvm:0
# 2  HPHP::ExecutionContext::onRequestShutdown() at hhvm:0
# 3  HPHP::hphp_context_shutdown() at hhvm:0
# 4  HPHP::HttpRequestHandler::executePHPRequest(HPHP::Transport*, HPHP::RequestURI&, HPHP::SourceRootInfo&, bool) at hhvm:0
# 5  HPHP::HttpRequestHandler::handleRequest(HPHP::Transport*) at hhvm:0
# 6  HPHP::ServerWorker<std::shared_ptr<HPHP::FastCGIJob>, HPHP::FastCGITransportTraits>::doJobImpl(std::shared_ptr<HPHP::FastCGIJob>, bool) at hhvm:0
# 7  HPHP::ServerWorker<std::shared_ptr<HPHP::FastCGIJob>, HPHP::FastCGITransportTraits>::doJob(std::shared_ptr<HPHP::FastCGIJob>) at hhvm:0
# 8  HPHP::JobQueueWorker<std::shared_ptr<HPHP::FastCGIJob>, HPHP::Server*, true, false, HPHP::JobQueueDropVMStack>::start() at hhvm:0
# 9  HPHP::AsyncFuncImpl::ThreadFunc(void*) at hhvm:0
# 10 HPHP::start_routine_wrapper(void*) at hhvm:0
# 11 start_thread at /lib/x86_64-linux-gnu/libpthread.so.0:0
# 12 clone at /lib/x86_64-linux-gnu/libc.so.6:0

PHP Stacktrace:

paulbiss commented 9 years ago

Possibly related to fc57e016493d8e4177930c6482e47d87c1af31ed which made PDO resources smart allocated and created a lightweight persistent connection object.

edwh commented 9 years ago

That's useful - if I remove the persistent attribute, then it avoids the crash.

mirzap commented 9 years ago

Nice. I also disabled persistent connection, and it avoids the crash.

kilahm commented 9 years ago

That also fixed my crash.

paulm17 commented 9 years ago

Ignore. Yeah this resolved it for me too.

bajb commented 9 years ago

We had limited success as well. Ended up having to downgrade again.

On 25 Mar 2015 19:17, "Paul Moss" notifications@github.com wrote:

Can you guys confirm what you are doing?

From this:

new \PDO('mysql:host=127.0.0.1;dbname=baz', 'foo', 'bar', array(\PDO::ATTR_PERSISTENT => true));

To this:

new \PDO('mysql:host=127.0.0.1;dbname=baz', 'foo', 'bar');

I am still getting segfaults with a non-persistent connection. Although it doesn't segfault as quickly.

gfleury commented 9 years ago

I've made some modifications to keep persistent connections working, just by removing the old PDOResource persistence implementation. There is a PR for this: #5091

jippi commented 9 years ago

:+1: still seeing the issue on latest nightly

gfleury commented 9 years ago

It has not been merged yet.

therealssj commented 9 years ago

Same here: I added the persistent attribute and it started crashing; I removed it and the crash is gone. But why does it happen?

mxw commented 9 years ago

See https://github.com/facebook/hhvm/issues/5202

igorclark commented 9 years ago

Hi, I'm getting exactly the same, it's since I upgraded to 3.7.0 from the HHVM debian repo:

deb http://dl.hhvm.com/debian wheezy main

$ hhvm --version
HipHop VM 3.7.0 (rel)
Compiler: tags/HHVM-3.7.0-0-gc8baf9cd3cb603e030969bfe24634d5e85549915
Repo schema: 1988ce75e6a571ce9c79c60fd7dc3ca8a5a5d403

It's just in dev at the moment, as I'm experimenting to see if we can move over to HHVM. I think it was version 3.6.0 before the upgrade, taken from the repo about 6 weeks ago, and as far as I recall, that was working nicely with persistent connections, running via FastCGI over a unix socket. Now, without any other changes to HHVM setup or config, I get this:

$ cat /tmp/stacktrace.572.log 
Host: nginx
ProcessID: 573
ThreadID: 7f8107fff700
ThreadPID: 579
Name: unknown program
Type: Segmentation fault
Runtime: hhvm
Version: tags/HHVM-3.7.0-0-gc8baf9cd3cb603e030969bfe24634d5e85549915
DebuggerCount: 0

ThreadType: Web Request
Server_SERVER_NAME: 192.168.56.101
Server: 192.168.56.101
URL: /

# 0  HPHP::PDORequestData::requestShutdown() at /usr/bin/hhvm:0
# 1  HPHP::ExecutionContext::onRequestShutdown() at /usr/bin/hhvm:0
# 2  HPHP::hphp_context_shutdown() at /usr/bin/hhvm:0
# 3  HPHP::HttpRequestHandler::executePHPRequest(HPHP::Transport*, HPHP::RequestURI&, HPHP::SourceRootInfo&, bool) at /usr/bin/hhvm:0
# 4  HPHP::HttpRequestHandler::handleRequest(HPHP::Transport*) at /usr/bin/hhvm:0
# 5  HPHP::ServerWorker<std::shared_ptr<HPHP::FastCGIJob>, HPHP::FastCGITransportTraits>::doJobImpl(std::shared_ptr<HPHP::FastCGIJob>, bool) at /usr/bin/hhvm:0
# 6  HPHP::ServerWorker<std::shared_ptr<HPHP::FastCGIJob>, HPHP::FastCGITransportTraits>::doJob(std::shared_ptr<HPHP::FastCGIJob>) at /usr/bin/hhvm:0
# 7  HPHP::JobQueueWorker<std::shared_ptr<HPHP::FastCGIJob>, HPHP::Server*, true, false, HPHP::JobQueueDropVMStack>::start() at /usr/bin/hhvm:0
# 8  HPHP::AsyncFuncImpl::ThreadFunc(void*) at /usr/bin/hhvm:0
# 9  HPHP::start_routine_wrapper(void*) at /usr/bin/hhvm:0
# 10 start_thread at /lib/x86_64-linux-gnu/libpthread.so.0:0
# 11 __clone at /lib/x86_64-linux-gnu/libc.so.6:0

PHP Stacktrace:

(No stacktrace is shown.)

This code reliably crashes:

$pdo_options = [PDO::ATTR_PERSISTENT => true];
$conn_id = new PDO($pdo_connection_string, $pdo_username, $pdo_password, $pdo_options);

These two pieces of code reliably don't crash:

$pdo_options = [PDO::ATTR_PERSISTENT => false];
$conn_id = new PDO($pdo_connection_string, $pdo_username, $pdo_password, $pdo_options);

$pdo_options = [];
$conn_id = new PDO($pdo_connection_string, $pdo_username, $pdo_password, $pdo_options);

Without the persistent connections, absolutely everything else on the site works.
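
For anyone applying the same workaround, here is a minimal sketch of putting persistence behind a single switch, so it can be turned back on once a fixed build is installed. The helper name `buildPdoOptions` and the `DB_PERSISTENT` environment variable are hypothetical and are not part of HHVM or PDO; only `PDO::ATTR_PERSISTENT` comes from the snippets above.

```php
<?php
// Hypothetical helper (not part of HHVM or PDO): build the PDO options
// array from one on/off switch.
function buildPdoOptions($persistent)
{
    // Omitting the key entirely is one of the variants reported above
    // as not crashing.
    return $persistent ? [PDO::ATTR_PERSISTENT => true] : [];
}

// DB_PERSISTENT is an assumed environment variable name; anything other
// than the string "1" (including an unset variable) leaves persistence off.
$persistent = getenv('DB_PERSISTENT') === '1';
$pdo_options = buildPdoOptions($persistent);

// The connection itself is created exactly as in the snippets above:
// $conn_id = new PDO($pdo_connection_string, $pdo_username, $pdo_password, $pdo_options);
```

Defaulting to non-persistent keeps affected builds on the configuration reported as stable, and re-enabling persistence later does not need a code change.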

I'd really like to use persistent connections - it's kind of a blocker for moving to HHVM. I've followed through to #5202 and #3414 but they don't seem to shed any more light. It looks from the above comments like there was a fix merged, but it definitely isn't working in 3.7.0 from dl.hhvm.com on debian_version 7.8. It's running via FastCGI over a unix socket.

Wonder if someone who knows more about this could take a look, or point me in the right direction if I'm missing something?

Thanks!

paulbiss commented 9 years ago

I've cherry-picked this fix to the 3.6 and 3.7 branches.

igorclark commented 9 years ago

Thanks Paul, fantastic. Appreciate it.

We've been using the dl.hhvm.com debian repo for all our VM image builds - will this fix make it into the package there, or should we build the 3.6 / 3.7 branch from source to get it? Sorry if that's a dumb question, still learning my way round the HHVM world. Doesn't seem to be in the repo yet, but that shows a last update yesterday afternoon, so maybe it's on a schedule for later.

Anyway, thanks!


Edit - the repo now shows today's date, but a fresh VM with the hhvm 3.7.0 package installed from the repo still has the bug so I guess it hasn't made it through (yet?).

paulbiss commented 9 years ago

We have nightly packages for master that contain the fix. For 3.6 and 3.7, this fix will go out the next time we build a package (usually in response to a CVE). If you need it before then, building from source is also an option.
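
Until a fixed package is installed, one option is to gate the persistent attribute on the runtime version. The following is only a sketch under stated assumptions: the helper name `persistentConnectionsLookSafe` is hypothetical, and the `'3.8.0'` threshold is a guess at the first release carrying the fix, not something confirmed in this thread. `HHVM_VERSION` is the constant HHVM defines at runtime.

```php
<?php
// Hypothetical helper: decide whether persistent PDO connections look safe.
// $hhvmVersion is null when the script is not running under HHVM.
// $firstFixed is an ASSUMPTION supplied by the caller, not a confirmed
// release number.
function persistentConnectionsLookSafe($hhvmVersion, $firstFixed)
{
    if ($hhvmVersion === null) {
        // Not running under HHVM, so this particular crash does not apply.
        return true;
    }
    return version_compare($hhvmVersion, $firstFixed, '>=');
}

// HHVM_VERSION is only defined when running under HHVM.
$hhvmVersion = defined('HHVM_VERSION') ? HHVM_VERSION : null;

// '3.8.0' is an assumed threshold; adjust it to whatever build is known
// to contain the fix.
$usePersistent = persistentConnectionsLookSafe($hhvmVersion, '3.8.0');
$pdo_options = $usePersistent ? [PDO::ATTR_PERSISTENT => true] : [];
```

This check is deliberately conservative. A version string with a `-dev` suffix compares lower than the plain release, and a rebuilt 3.6 or 3.7 package would still report an older version, so builds that do contain the fix can be treated as unsafe until the threshold is adjusted.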

igorclark commented 9 years ago

Thanks Paul, that's great. Just tried the nightly build, which seems to have fixed the problem with persistent connections. Using non-persistent connections still seems to leave connections open, but as long as persistent connections are working, that's obviously less of an issue :-) Will do some thrashing of our app on top of the nightly build and make a call on using that vs. waiting for the v3.8.0 release in mid-June.

Appreciate your help!

igorclark commented 9 years ago

Hi again, as per my comment on #5132, I've been testing this in a staging environment where it seems to work for non-persistent connections as well. Doing more testing now. Thanks!