gramineproject / gramine

A library OS for Linux multi-process applications, with Intel SGX support

IPC leader does not wait for all processes to finish #21

Open boryspoplawski opened 3 years ago

boryspoplawski commented 3 years ago

Description of the problem

If the first process, which is the IPC leader, exits, all other processes cannot use IPC any more. We need to decide on a first-process exit policy and implement it. I see two options: either kill all remaining processes when the first process exits, or keep the first process alive until all of its descendant processes have finished.

We already have a TODO about it: https://github.com/oscarlab/graphene/blob/4f8d6fb8b323d62f00be2ac2c7239878b14afc1e/LibOS/shim/src/sys/shim_exit.c#L23-L27

mkow commented 3 years ago

The second option seems better to me (I think it's fine if the main process of a user app just spawns something and then exits right away leaving the child).

dimakuv commented 3 years ago

(I think it's fine if the main process of a user app just spawns something and then exits right away leaving the child)

But then Graphene will have unexpected behavior: the caller of this user app expects the main process to exit immediately, but instead the caller will hang because the main process is still alive. E.g., when we start Memcached in daemon mode, we expect memcached -d to return immediately, and we continue with our Bash session, but with this second option, the process will simply live forever (until kill on its child). This sounds like a surprising side effect for end users.

boryspoplawski commented 3 years ago

@dimakuv Well, that call would have graphene-something prepended so it wouldn't be that surprising. Also going with the first option would just... kill the memcached.

dimakuv commented 3 years ago

Yes, both options don't work for this daemon case. I was thinking about "electing a new IPC leader" third option, but this sounds... complex.

If choosing between these two options, I also vote for the second option.

boryspoplawski commented 3 years ago

Yes, both options don't work for this daemon case.

It does not work in the sense that it will stay up and not become a "real daemon". But I think if you want to have Graphene as a daemon process on the host, then you should wrap it in some script doing that. Or, at some point in the future, we could add a -d flag to the Graphene invocation which would daemonize Graphene as a whole.
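
As a purely illustrative sketch of such a host-side wrapper (a hypothetical script, not something Gramine ships; the script name and log path are made up), a small Python launcher could detach the Gramine invocation from the current session, so the calling shell gets its prompt back immediately while the first Gramine process, i.e. the IPC leader, keeps running:

```python
#!/usr/bin/env python3
# Hypothetical host-side "daemonizer" sketch: start a Gramine workload detached
# from the current terminal/session, so the invoking shell returns immediately
# while the first Gramine process (the IPC leader) keeps running in the background.
import subprocess
import sys

def main():
    with open("gramine-daemon.log", "ab") as log:
        proc = subprocess.Popen(
            ["gramine-sgx"] + sys.argv[1:],   # e.g. ["memcached"], run in foreground mode
            stdin=subprocess.DEVNULL,
            stdout=log,
            stderr=log,
            start_new_session=True,           # detach from our session / controlling tty
        )
    print(f"started gramine-sgx in the background, host pid {proc.pid}")

if __name__ == "__main__":
    main()
```

Inside Gramine the application itself would then run in foreground mode (e.g. memcached without -d), so the first Gramine process stays alive exactly as long as the workload does.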

dimakuv commented 3 years ago

A list of related issues that have this "IPC leader does not wait for all processes to finish" bug:

We closed those issues since they are tracked here; any new issues that boil down to this bug will also be closed and mentioned in this meta-issue. Closing the other issues is only meant to reduce the number of open issues for easier management. (In other words, it doesn't invalidate the closed issues; it's just very hard to navigate through a huge list of issues.)

boryspoplawski commented 3 years ago

@dimakuv Wait, most of the linked issues exhibit the bug described here, but this bug is not the main reason they fail (it just causes some error messages, but does not make the tests fail; the failure happens before these error messages). gramineproject/graphene#2541 is a good example.

anjalirai-intel commented 3 years ago

Hi @dimakuv

I am attaching the list of tests which fail with the IPC issue here for verification purposes. Some of these tests are not listed in the original bug.

boryspoplawski commented 3 years ago

What is worth noting is that these tests do not fail; they just print an error message at the end but still succeed.

boryspoplawski commented 3 years ago

I am attaching the list of tests which fail with the IPC issue here for verification purposes. Some of these tests are not listed in the original bug.

I've verified that list: none of these fail due to the problem described in this issue. Some of them do sometimes print the error message described here, but it happens only after the test has already finished and is unrelated to the actual result of the test (be it fail or success).

boryspoplawski commented 3 years ago

On the original topic: I don't think implementing the 2nd option will be that easy. Graphene does not bother cleaning most of the stuff up at exit (it depends on the host to clean it up, which is not a problem atm), so this would require a lot of changes (e.g. closing all files, releasing all handles, cleaning up all threads and all user memory, and a lot of other cleanups).

jinengandhi-intel commented 2 years ago

We continue to see the issue as of Feb 21st. Copied below are a few console links from opensource-CI where the errors were seen:
https://localhost:8080/job/graphene-20.04/2237/testReport/LibOS.shim.test.ltp/test_ltp/test_direct___test_ltp_waitpid04_/
https://localhost:8080/job/graphene-20.04/2256/testReport/LibOS.shim.test.ltp/test_ltp/test_direct___test_ltp_waitpid04_/
https://localhost:8080/job/graphene-20.04/2251/testReport/LibOS.shim.test.ltp/test_ltp/test_direct___test_ltp_kill09_/

marcelamelara commented 1 year ago

Hello! I wanted to bump this issue since an application I'm trying to run in Gramine (gramine-direct) is failing because the IPC leader exits before its child process is done (see log snippet below). Is there a workaround for this that I could test?

Gramine log snippet:

(libos_parser.c:1609:buf_write_all) [P2:T36:Listener] trace: ---- execve("/home/mmelara/app/bin/Worker", [/home/mmelara/app/bin/Worker,spawnclient,121,124,], [DOTNET_SYSTEM_NET_HTTP_USESOCKETSHTTPHANDLER=0,https_proxy=http://proxy:912/,]) ...
(libos_parser.c:1609:buf_write_all) [P1:T35:Listener] trace: ---- return from vfork(...) = 0x24
(libos_async.c:122:install_async_event) [P1:T35:Listener] debug: Installed async event at 1679016125655580
(libos_fs_lock.c:653:posix_lock_clear_pid) [P1:T23:Listener] debug: clearing POSIX locks for pid 1
(libos_async.c:327:libos_async_worker) [P1:libos] debug: Thread exited, cleaning up
(libos_init.c:303:init_stack) [P2:T36:Worker] debug: Allocated stack at 0x39f249e82000 (size = 0x100000)
(libos_sync_client.c:331:shutdown_sync_client) [P1:T23:Listener] debug: sync client shutdown: closing handles
(libos_sync_client.c:346:shutdown_sync_client) [P1:T23:Listener] debug: sync client shutdown: waiting for confirmation
(libos_sync_client.c:359:shutdown_sync_client) [P1:T23:Listener] debug: sync client shutdown: finished
(libos_ipc_worker.c:285:ipc_worker_main) [P1:libos] debug: IPC worker: exiting worker thread
(libos_async.c:353:libos_async_worker) [P1:libos] debug: Async worker thread terminated
(libos_exit.c:58:libos_clean_and_exit) [P1:T23:Listener] debug: process 1 exited with status 139
(libos_debug.c:94:remove_r_debug) [P2:T36:Worker] debug: removing file:/home/mmelara/app/bin/Listener at 0x2a9969884000
(libos_debug.c:94:remove_r_debug) [P2:T36:Worker] debug: removing file:/usr/local/lib/x86_64-linux-gnu/gramine/runtime/glibc/ld-linux-x86-64.so.2 at 0x2a996984e000
(libos_ipc_worker.c:72:ipc_leader_died_callback) [P2:libos] debug: IPC leader disconnected
(libos_ipc_child.c:23:ipc_child_disconnect_callback) [P2:libos] debug: Unknown process (vmid: 0x1) disconnected

...

(libos_parser.c:1609:buf_write_all) [P2:T36:Worker] trace: ---- writev(2, 0x39f249f80990, 10) = -32
(libos_exit.c:179:libos_syscall_exit_group) [P2:T36:Worker] debug: ---- exit_group (returning 127)
(libos_exit.c:105:thread_exit) [P2:T36:Worker] warning: error clearing POSIX locks: Connection reset by peer (ECONNRESET)
(libos_exit.c:112:thread_exit) [P2:T36:Worker] error: Sending IPC process-exit notification failed: Connection reset by peer (ECONNRESET)
(libos_sync_client.c:331:shutdown_sync_client) [P2:T36:Worker] debug: sync client shutdown: closing handles
(libos_sync_client.c:346:shutdown_sync_client) [P2:T36:Worker] debug: sync client shutdown: waiting for confirmation
(libos_sync_client.c:359:shutdown_sync_client) [P2:T36:Worker] debug: sync client shutdown: finished
(libos_ipc_pid.c:320:ipc_release_id_range) [P2:T36:Worker] debug: sending a request: [36..36]
(libos_ipc_pid.c:323:ipc_release_id_range) [P2:T36:Worker] debug: ipc_send_message: Connection reset by peer (ECONNRESET)
(libos_pid.c:177:release_id) [P2:T36:Worker] warning: IPC pid release failed

dimakuv commented 1 year ago

@marcelamelara I sketched a workaround, but there is no PR/commit/branch that you could actually test: https://github.com/gramineproject/gramine/issues/1168#issuecomment-1425620994

Unfortunately, this is a pretty hard problem. Well, maybe someone will have time to create at least a workaround per my suggestion, so that people can try it out. @marcelamelara I don't know if you'll have the time and will to write the code, but feel free :)

bronzeMe commented 1 year ago

@marcelamelara I sketched a workaround, but there is no PR/commit/branch that you could actually test: #1168 (comment)

Unfortunately, this is a pretty hard problem. Well, maybe someone will have time to create at least a workaround per my suggestion, so that people can try it out. @marcelamelara I don't know if you'll have the time and will to write the code, but feel free :)

@dimakuv Hi, has the Gramine community been trying to solve this issue? It seems like quite an important problem.

dimakuv commented 1 year ago

@bronzeMe No, we didn't work on solving this issue (it would involve a serious design change in Gramine). This has a low priority for us currently.

bronzeMe commented 1 year ago

@dimakuv Thank you, but in real-world enterprise applications there is often a shell script that triggers many fork operations. It is common for the IPC leader to terminate prematurely, causing the application to fail. So this issue is quite critical for enterprise users.

dimakuv commented 1 year ago

@bronzeMe But why would Gramine be used in such cases? Gramine is generally not suited to run shell scripts.

bronzeMe commented 1 year ago

@dimakuv For example, running the Hive CLI inside Gramine.

dimakuv commented 1 year ago

A quick look at https://www.devdoc.net/bigdata/hive-0.12.0/language_manual/cli.html shows me that Hive CLI is not typically used in background mode, so why would this GitHub issue affect your use cases?

bronzeMe commented 1 year ago

For example, when running the command gramine-sgx bash -c '$HIVE_HOME/bin/hive -f /home/my/hive-script.sql', the hive script is a shell script that includes multiple commands such as which, dirname, cat, and grep. This often leads to the following issue: if the IPC leader process exits, the child processes are not able to reconnect to the IPC leader process again.

dimakuv commented 1 year ago

@bronzeMe But what are you trying to achieve? Why do you need to execute which inside of the SGX enclave? What is your security model, and why is it important to execute these utilities inside SGX?

bronzeMe commented 1 year ago

What is the fundamental reason why it is not possible to execute the which command and shell scripts in Gramine?

woju commented 1 year ago

What is the fundamental reason why it is not possible to execute the which command and shell scripts in Gramine?

The usual reason for such a problem is a misconfiguration on the part of the user: probably the /bin/which executable is not included in trusted_files and/or the fs mounts. So if you think there's a bug in Gramine, please post your manifest plus the expected and actual results in a new issue. Otherwise, I suspect there's not actually a bug.

shell script that includes multiple commands such as which, dirname, cat, and grep. This often leads to the following issue: if the IPC leader process exits, the child processes are not able to reconnect to the IPC leader process again.

This is likely incorrect. If you run gramine-sgx sh -c '...', then the IPC leader is the shell process, so unless you fork out of this shell (&) and let the main process exit, IPC will work just fine. Again, if you think there's a bug, please post your script, manifest, and the expected and actual results, and we'll triage. Otherwise, I don't believe the assertion that "this often leads to the following issue".

Again, I suspect the problem is in the manifest, which makes this a support issue, not a bug.
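
For illustration only, here is a hypothetical helper sketch (not a Gramine tool; it assumes Python 3.11+ for tomllib, a rendered TOML manifest, and the list-style sgx.trusted_files / sgx.allowed_files syntax) that checks whether the host binaries a shell script relies on are listed in the manifest at all. Real manifests often use relative or templated paths, so this naive string match is only a rough first check:

```python
#!/usr/bin/env python3
# Hypothetical sketch: report which of the given programs are missing from
# sgx.trusted_files / sgx.allowed_files of a rendered Gramine manifest.
import shutil
import sys
import tomllib

def uris(entries):
    # Entries may be plain strings ("file:/usr/bin/which") or tables ({ uri = "..." }).
    return {e if isinstance(e, str) else e.get("uri", "") for e in (entries or [])}

def main(manifest_path, *programs):
    with open(manifest_path, "rb") as f:
        sgx = tomllib.load(f).get("sgx", {})
    known = uris(sgx.get("trusted_files")) | uris(sgx.get("allowed_files"))
    for prog in programs:
        host_path = shutil.which(prog)
        if host_path is None:
            print(f"{prog}: not found on the host")
        elif f"file:{host_path}" not in known:
            print(f"{prog}: file:{host_path} not listed in trusted/allowed files")

if __name__ == "__main__":
    main(sys.argv[1], *sys.argv[2:])
```

Usage would be something like python3 check_manifest.py app.manifest which dirname cat grep.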

dimakuv commented 4 months ago

Just an interesting find: the following article describes the same problem we have (the "Docker PID 1" zombie-reaping problem): https://blog.phusion.nl/2015/01/20/docker-and-the-pid-1-zombie-reaping-problem/

The authors solved it by prefixing such apps with their Python script, which serves as the init (PID 1) process: https://github.com/phusion/baseimage-docker/blob/master/image/bin/my_init

I think what we didn't consider in this GitHub issue is the ability to have a minimal (but properly functioning) init for the Gramine environment.
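
To make that idea concrete, here is a minimal sketch of such an init (assuming a Python interpreter is available inside the Gramine environment and included in the manifest; this is not an existing Gramine feature), in the spirit of phusion's my_init: started as the first process, it becomes the IPC leader, spawns the real application, forwards termination signals to it, and only exits once there are no children left to reap, including orphans that got reparented to it:

```python
#!/usr/bin/env python3
# Hypothetical minimal-init sketch for the Gramine environment: keep the first
# process (the IPC leader) alive until every descendant has been reaped.
import os
import signal
import subprocess
import sys

def main():
    child = subprocess.Popen(sys.argv[1:])      # the real application

    def forward(signum, _frame):
        try:
            child.send_signal(signum)           # pass SIGTERM/SIGINT on to the app
        except ProcessLookupError:
            pass
    for sig in (signal.SIGTERM, signal.SIGINT):
        signal.signal(sig, forward)

    app_status = 0
    while True:
        try:
            pid, status = os.wait()             # reaps any child, incl. reparented orphans
        except ChildProcessError:
            break                               # nothing left to wait for -> safe to exit
        if pid == child.pid:
            app_status = os.waitstatus_to_exitcode(status)
    sys.exit(app_status)

if __name__ == "__main__":
    main()
```

One would then arrange for this init script to be the entrypoint of the Gramine instance, so the IPC leader's lifetime is tied to the init rather than to the application's own first process.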