dunglas / frankenphp

🧟 The modern PHP app server
https://frankenphp.dev
MIT License
6.64k stars 218 forks source link

Question: How Frankenphp is designed to handle multiple apps and isolation? #583

Open dimkasta opened 6 months ago

dimkasta commented 6 months ago

Sorry if this is explained somewhere, but I could not find something in the documentation

In the context of caddy we can have multiple domains pointing to separate fast-cgi backends, and in the context of fpm, we can have multiple pools running in isolation behind separate ports/sockets and hopefully under different users

How does frankenphp handle this? Can we have things like separate worker pools running under different users?

Or are we supposed to run multiple different instances behind a separate frontend reverse proxy like vanilla caddy?

withinboredom commented 6 months ago

I believe the current documentation covers this to some degree: https://frankenphp.dev/docs/config/

If we are missing something, please let us know.

dimkasta commented 6 months ago

Yeah it covers the Caddyfile side partially. But it's not clear how frankenphp deals with fpm. How/if multiple fpm pools can/should be created and how/if you can point the php_server directive to a specific one.

In vanilla caddy, you can connect to a specific pool listening to a socket or port with something like

php_fastcgi unix//sockets/php.sock

Or you can even include templates with arguments defined in the include like

php_fastcgi unix//sockets/{args[1]}.sock

Is there something equivalent in frankenphp?

And assuming a worker setup, does frankenphp take over the worker spawning altogether? Or should we configure specific fpm pools with pm.static, pm.max_children and pm.max_requests ? Cause I believe frankenphp handles these already through things like MAX_REQUESTS

dunglas commented 6 months ago

FrankenPHP implements its own SAPI and doesn't use FPM at all. FPM options are irrelevant and no UNIX sockets are used (everything is handled by threads in the same process).

dimkasta commented 6 months ago

@dunglas thanks a lot for clarifying that. So in case we want isolation and separate users, is there a way to do it with one frankenphp instance? Or do we have to deploy separate frankenphp instances and put them behind a separate reverse proxy?

withinboredom commented 6 months ago

While its possible to run threads as a different user (via seteuid()), the memory is still shared with other threads running as other users (which can create interesting security issues). That being said, it might be interesting to have PHP running as a different user since it is relatively isolated from Go. If each worker pool could be started by a different user and the worker pools don't share memory, then I think we can make it happen... I think it would require Caddy to be run as root though (or set the setuuid bit on the executable). I might play with that this weekend if I have some time.

withinboredom commented 6 months ago

My initial approach was to just set the current user when sending a request to PHP ... oh boy, you don't want to do that:

At the Linux kernel level, credentials (user and group IDs) are a per-thread attribute. However, POSIX requires that all of the POSIX threads in a process have the same credentials. To accommodate this requirement, the NPTL implementation wraps all of the system calls that change process credentials with functions that, in addition to invoking the underlying system call, arrange for all other threads in the process to also change their credentials.

So, it appears that we can simply set the workers' effective user even after creating the pthreads. So, that's nice and can allow each worker pool to effectively have different users applied.

I'm pretty sure this won't affect Go, and if it does, we will just need to make the syscall directly instead of relying on this magic signal stuff, though that might break portability. So, I still need to finish experimentation.

How are we imagining the configuration to look like:

{
    frankenphp {
        num_threads <num_threads> # Sets the number of PHP threads to start. Default: 2x the number of available CPUs.
        worker {
            file <path> # Sets the path to the worker script.
            num <num> # Sets the number of PHP threads to start, defaults to 2x the number of available CPUs.
            env <key> <value> # Sets an extra environment variable to the given value. Can be specified more than once for multiple environment variables.
            user 1000 # set the user id for the worker pool?
            group 1000 # set the group id for the worker pool?
        }
    }
}

I'm not sure it will be possible to set the user/group for cgi-mode, yet. I'll have to give it a go once I get the basics working.

withinboredom commented 6 months ago

Yes, changing the thread pool user changes the entire proccess's user. I guess that is obvious in retrospect.

However, calling the syscall directly does, in fact, work perfectly. However, I'm not sure if it will be effective on OSX. @dunglas, can you apply this on a mac and see if it works:

Subject: [PATCH] Test for setting uid.
---
Index: testdata/index.php
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/testdata/index.php b/testdata/index.php
--- a/testdata/index.php    (revision a6fc22505cc5a42a0ab5f8beaec962760b1fea7e)
+++ b/testdata/index.php    (date 1708772090602)
@@ -4,4 +4,5 @@

 return function () {
     echo sprintf("I am by birth a Genevese (%s)", $_GET['i'] ?? 'i not set');
+    touch('test');
 };
Index: caddy/frankenphp/Caddyfile
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/caddy/frankenphp/Caddyfile b/caddy/frankenphp/Caddyfile
--- a/caddy/frankenphp/Caddyfile    (revision a6fc22505cc5a42a0ab5f8beaec962760b1fea7e)
+++ b/caddy/frankenphp/Caddyfile    (date 1708771942066)
@@ -28,7 +28,7 @@
    #   }
    #}

-   root * public/
+   root * testdata/
    encode zstd br gzip

    # Uncomment the following lines to enable Mercure and Vulcain modules
Index: frankenphp.c
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/frankenphp.c b/frankenphp.c
--- a/frankenphp.c  (revision a6fc22505cc5a42a0ab5f8beaec962760b1fea7e)
+++ b/frankenphp.c  (date 1708772592793)
@@ -13,6 +13,10 @@
 #include <php_variables.h>
 #include <sapi/embed/php_embed.h>
 #include <stdint.h>
+/* todo: check for os */
+#include <unistd.h>
+#include <sys/syscall.h>
+/* end todo */
 #include <stdio.h>
 #include <stdlib.h>
 #include <unistd.h>
@@ -722,6 +726,10 @@

 static void *manager_thread(void *arg) {
 #ifdef ZTS
+   // set the current user
+   syscall(SYS_setreuid, -1, 1000);
+   syscall(SYS_setregid, -1, 1000);
+
   // TODO: use tsrm_startup() directly as we know the number of expected threads
   php_tsrm_startup();
   /*tsrm_error_set(TSRM_ERROR_LEVEL_INFO, NULL);*/

Then simply build frankenphp and run from the root of the repo as root:

SERVER_NAME=:80 FRANKENPHP_CONFIG="worker /go/src/app/testdata/index.php" caddy/frankenphp/frankenphp run -c caddy/frankenphp/Caddyfile

With ps -T aux you should see the Go threads running as root, but worker threads running as user 1000 (or whatever user you set there, assuming the user exists).

When you call http://localhost, you should see the output "I am by birth a Genevese" and testdata/test should be owned by the appropriate user.

dunglas commented 6 months ago

@withinboredom the patch compiles but with warnings:

# github.com/dunglas/frankenphp
frankenphp.c:730:2: error: 'syscall' is deprecated: first deprecated in macOS 10.12 - syscall(2) is unsupported; please switch to a supported interface. For SYS_kdebug_trace use kdebug_signpost(). [-Werror,-Wdeprecated-declarations]
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/unistd.h:748:6: note: 'syscall' has been explicitly marked deprecated here
frankenphp.c:731:2: error: 'syscall' is deprecated: first deprecated in macOS 10.12 - syscall(2) is unsupported; please switch to a supported interface. For SYS_kdebug_trace use kdebug_signpost(). [-Werror,-Wdeprecated-declarations]
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/unistd.h:748:6: note: 'syscall' has been explicitly marked deprecated here

The Mac ps doesn't have a -T option.

withinboredom commented 6 months ago

interesting! I suspect if we use this "feature" it would probably only be supported in linux, simply because the linux kernel supports per-thread credentials. It's more of a hack than anything else.

Thinking about it a bit more, it's probably something we don't want to support for security reasons (for example, people might think they are safe, but due to some assumption in the kernel about posix standards, it allows "escaping the thread" and gaining root).

The Mac ps doesn't have a -T option.

Based on this page, it would be ps -M on mac.

dunglas commented 6 months ago

Here is the output of ps -M:

USER      PID   TT   %CPU STAT PRI     STIME     UTIME COMMAND
dunglas 21092 s005    0.0 S    31T   0:00.02   0:00.01 caddy/frankenphp/frankenphp run -c caddy/frankenphp/Caddyfile
        21092         0.0 S    31T   0:00.00   0:00.00 
        21092         0.0 S    31T   0:00.00   0:00.00 
        21092         0.0 S    31T   0:00.00   0:00.00 
        21092         0.0 S    31T   0:00.00   0:00.00 
        21092         0.0 S    31T   0:00.00   0:00.00 
        21092         0.0 S    31T   0:00.00   0:00.00 
        21092         0.0 S    31T   0:00.00   0:00.00 
        21092         0.0 S    31T   0:00.00   0:00.00 
        21092         0.0 S    31T   0:00.00   0:00.00 
        21092         0.0 S    31T   0:00.00   0:00.00 
        21092         0.0 S    31T   0:00.00   0:00.00 
        21092         0.0 S    31T   0:00.00   0:00.00 
        21092         0.0 S    31T   0:00.00   0:00.00 
        21092         0.0 S    31T   0:00.00   0:00.00 
        21092         0.0 S    31T   0:00.00   0:00.00 
        21092         0.0 S    31T   0:00.00   0:00.00 
        21092         0.0 S    31T   0:00.00   0:00.00 
        21092         0.0 S    31T   0:00.00   0:00.00 
        21092         0.0 S    31T   0:00.00   0:00.00 
        21092         0.0 S    31T   0:00.00   0:00.00 
        21092         0.0 S    31T   0:00.00   0:00.00 
        21092         0.0 S    31T   0:00.00   0:00.00 
        21092         0.0 S    31T   0:00.00   0:00.00 
        21092         0.0 S    31T   0:00.00   0:00.00 
        21092         0.0 S    31T   0:00.00   0:00.00 
        21092         0.0 S    31T   0:00.00   0:00.00 
        21092         0.0 S    31T   0:00.00   0:00.00 
        21092         0.0 S    31T   0:00.00   0:00.00 
        21092         0.0 S    31T   0:00.00   0:00.00 
        21092         0.0 S    31T   0:00.00   0:00.00 
        21092         0.0 S    31T   0:00.00   0:00.00 
        21092         0.0 S    31T   0:00.00   0:00.00 
        21092         0.0 S    31T   0:00.00   0:00.00 
        21092         0.0 S    31T   0:00.00   0:00.00 
        21092         0.0 S    31T   0:00.00   0:00.00 
        21092         0.0 S    31T   0:00.00   0:00.00 
        21092         0.0 S    31T   0:00.00   0:00.00 

I'm also in favor of not implementing this feature and encourage using containers instead.