kristapsdz / kcgi

minimal CGI and FastCGI library for C/C++
https://kristaps.bsd.lv/kcgi
ISC License
276 stars 40 forks source link

Troubles with kfcgi's childs ( setuid ? ) and 'make regress' #27

Closed xse closed 6 years ago

xse commented 6 years ago

Hi,

I have to mention first that this is compiled and executed on an old raspberry pi ( armv6l ) running archlinux (4.14.27-1-ARCH), i can provide details about the system and so on if needed.

At first I was getting this error when launching make regress, on the test-fcgi-abort-validator file, here is what it looked like :

[xse@rpi kcgi-0.10.1]$ ./regress/test-fcgi-abort-validator fcgi.c:685: application signalled read: Connection reset by peer fcgi_hdr_read: header wrappers.c:184: child signal 31 [xse@rpi kcgi-0.10.1]$

Since everything else was building fine i thought it must have been an issue with curl or something like that, but then i realized i couln't launch kfcgi neither :

[xse@rpi http]$ sudo kfcgi -d -v -U http -u http -s /srv/http/run/httpd.sock -p / -- /srv/http/kcgi fcgi.c:685: application signalled wrappers.c:184: child signal 31 kfcgi[9083]: worker unexpectedly exited wrappers.c:184: child signal 31 wrappers.c:184: child signal 31 wrappers.c:184: child signal 31 wrappers.c:184: child signal 31 [xse@rpi http]$

I'm a bit lost here maybe I'm doing something wrong and it's obvious but i don't see it. I tried a whole bunch of different parameters for kfcgi but nothing seems to do the trick.

Here is kfcgi's strace if it helps : https://gist.github.com/xse/2855184cafd3c03006aabe92ab91f55c

setgid32(33) = 0 setuid32(33) = 0 setuid32(0) = -1 EPERM (Operation not permitted)

I might be wrong and this is probably totally unrelated but maybe it's worth to mention it, i really don't know, could this be it ?

It might also be related to the fact that i didn't setup a proper chroot, but i just wanted to try and see if it worked, as the man says "use the root directory if you insist on being insecure." The thing is that if that error was related to that i don't get why i get the exact same error on test-fcgi-abort-validator, that makes no sense to me. By the way here is the result of "strace -ff on the test-fcgi-abort-validator".

The file /srv/http/kcgi is built as stated from here : https://kristaps.bsd.lv/kcgi/tutorial2.html minus a ")" on the while loop line that otherwise causes an error when compiling.

I'm hopping someone have an idea about how to solve this. Although i'm fine with basics like modifying Makefile.configure to make it fit archlinux and so on, i'm stuck there and i can't find any solutions by myself at the moment, i'm too confused it seems like there is so many possible reasons, I'll keep looking into it tho.

If there is anything i can provide to narrow down the issue I'll be more than happy to do so. Thank you for your time.

kristapsdz commented 6 years ago

Very interesting. Since it's the child that's dying, not kfcgi, I'm inclined to think it's something in the sandbox. Can you uncomment the -DSANDBOX_SECCOMP_DEBUG line in the GNUmakefile and see if that gives you any meaningful output in either of these tests? (The Linux sandbox mechanism is... challenging.)

xse commented 6 years ago

Alright i get a few more messages :

[xse@rpi kcgi-0.10.1]$ ./regress/test-fcgi-abort-validator
ssh_sandbox_violation: unexpected system call (arch:0x40000028,syscall:221 @ 0x76e0f0a4)
fcgi.c:685: application signalled
wrappers.c:181: child status 1
/tmp/kfcgi.XXXXDsT7be: Connection refused
[xse@rpi kcgi-0.10.1]$ 

and :

[xse@rpi kcgi-0.10.1]$ sudo kfcgi -d -v -s /run/httpd.sock -p / -- /srv/http/kcgi
ssh_sandbox_violation: unexpected system call (arch:0x40000028,syscall:221 @ 0x51d2c)
ssh_sandbox_violation: unexpected system call (arch:0x40000028,syscall:221 @ 0x51d2c)
fcgi.c:685: application signalled
fcgi.c:685: application signalled
wrappers.c:181: child status 1
kfcgi[355]: worker unexpectedly exited
wrappers.c:181: child status 1
ssh_sandbox_violation: unexpected system call (arch:0x40000028,syscall:221 @ 0x51d2c)
ssh_sandbox_violation: unexpected system call (arch:0x40000028,syscall:221 @ 0x51d2c)
ssh_sandbox_violation: unexpected system call (arch:0x40000028,syscall:221 @ 0x51d2c)
wrappers.c:181: child status 1
wrappers.c:181: child status 1
wrappers.c:181: child status 1
[xse@rpi kcgi-0.10.1]$

Even if I'm launching those command via an ssh shell, i tried doing it from a real keyboard plugged on the raspberry pi and I'm getting the exact same results.

Here is the output of : sudo strace -v -f kfcgi -d -v -s /run/httpd.sock -p / -- /srv/http/kcgi

221 seems to be : 221 common fcntl64 sys_fcntl64 sys_oabi_fcntl64 I don't really get the arm EABI vs OABI difference

I compiled twice the whole thing, both time with -DSANDBOX_SECCOMP_DEBUG, i tried adding : -march=armv6 -mfloat-abi=hard -mfpu=vfp -O2 -pipe -fstack-protector-strong -fno-plt to the cc line of Makefile.configure. Beside an impressive gain in speed when launching kfcgi there was no difference.

kristapsdz commented 6 years ago

You've done great! My hunch was right: the sandbox is puking. (I wrote an article why seccomp is such a PITA.) The correct system call needs to be added to sandbox-seccomp-filter.c. Try adding "fcntl64" to the array preauth_ctrl and preauth_work arrays and seeing if that fixes it. (I can cook you up a patch if you're not comfortable with C.)

xse commented 6 years ago

Awesome that's it ! Everything is working smoothly, thanks a lot.

xse commented 6 years ago

I don't know if this is worth making a new issue, but if i reload my web page quickly by keeping Ctrl+F5 pressed the program exits with that kind of output :

[xse@rpi kcgi]$ sudo kfcgi -d -U http -u http -s /run/httpd.sock -p / -- /srv/http/kcgi
wrappers.c:283: poll: POLLHUP
kcgi: output.c:342: khttp_write: Assertion `KSTATE_BODY == req->kdata->state' failed.
wrappers.c:283: poll: POLLHUP
kcgi: output.c:342: khttp_write: Assertion `KSTATE_BODY == req->kdata->state' failed.
fcgi.c:238: work request empty read
fcgi.c:685: application signalled
wrappers.c:283: poll: POLLHUP
kcgi: output.c:342: khttp_write: Assertion `KSTATE_BODY == req->kdata->state' failed.
child.c:1226: RFC violation: REMOTE_ADDR not set
wrappers.c:343: poll: POLLHUP
wrappers.c:181: child status 1
wrappers.c:181: child status 1
kfcgi[1305]: worker unexpectedly exited
wrappers.c:442: poll: Interrupted system call
parent.c:259: failed to read remote
wrappers.c:283: poll: POLLHUP
kcgi: output.c:342: khttp_write: Assertion `KSTATE_BODY == req->kdata->state' failed.
child.c:1286: RFC violation: SCRIPT_NAME not set
child.c:1252: RFC violation: SERVER_PORT not set
wrappers.c:467: read: unexpected eof: read 0 of 2 bytes
fcgi.c:289: failed to read FastCGI cookie
wrappers.c:467: read: unexpected eof: read 0 of 2 bytes
fcgi.c:289: failed to read FastCGI cookie
wrappers.c:467: read: unexpected eof: read 0 of 2 bytes
fcgi.c:289: failed to read FastCGI cookie
wrappers.c:467: read: unexpected eof: read 0 of 2 bytes
fcgi.c:289: failed to read FastCGI cookie
[xse@rpi kcgi]$

I tried with different parameters for -l -N and -n, it was giving me the same kind of results. I mean it's not that big of a deal but it might be worth to report that, maybe it's a known thing. I would have expected some kind of warning or something but not an exit.

kristapsdz commented 6 years ago

The KSTATE_BODY assertion is because khttp_body() hasn't been invoked before writing data to the output. Is this still with the tutorial2 example? The "unexpected eof" is simple the error from the child dying that bubbles back up.

kristapsdz commented 6 years ago

That's also weird that SCRIPT_NAME and SERVER_PORT aren't being set. Which server are you using for this?

xse commented 6 years ago

Oh that makes more sense, for SCRIPT_NAME and so on yeah those were not set i didn't add fastcgi.conf to nginx, if i do so that doesn't appears anymore.

Otherwise, looking at the tutorials so far, in tutorial0 i think you're missing a "-lkcgihtml" in the compilation and maybe a "0" in khtml_open(&req, r);

Also, make sample exits with the following errors :

[xse@rpi kcgi-0.10.1]$ make sample
cc -march=armv6 -mfloat-abi=hard -mfpu=vfp -O2 -pipe -fstack-protector-strong -fno-plt -g -W -Wall -Wextra -Wmissing-prototypes -Wstrict-prototypes -Wwrite-strings -Wno-unused-parameter    -c -o sample.o sample.c
sample.c: In function ‘template’:
sample.c:164:22: error: ‘struct khtmlreq’ has no member named ‘req’
   khtml_puts(req, req->req->remote);
                      ^~
sample.c: In function ‘sendtemplate’:
sample.c:186:3: error: ‘struct khtmlreq’ has no member named ‘req’
  r.req = req;
   ^
make: *** [<builtin>: sample.o] Error 1

I did not look into that one yet i'll have a look tomorrow.

kristapsdz commented 6 years ago

What's the code you're using for the output having the KSTATE_BODY assertions? (I'll fix the sample right now!)

xse commented 6 years ago

It was the code given in tutorial2, the one using a loop.

kristapsdz commented 6 years ago

On OpenBSD, I was unable to make the crashes---even when running with httperf. (Yes, I saw the extra right-paren in the tutorial2.c example.) I'll download nginx and see if I can reproduce the error.

% doas ./kfcgi -s /var/www/run/slowcgi.sock -d -v -U www -u www -- /cgi-bin/tutorial2
xse commented 6 years ago

Hm, it's occurring only if i keep Ctrl+F5 or just F5 pressed down on firefox, if i press it as fast as possible that doesn't crash. The nginx.conf used It tried setting keepalive_timeout to 0 and worker_processes to auto without success.

xse commented 6 years ago

Update : I've built httperf to try, so far : httperf --server 127.0.0.1 --port 80 --num-conns 1000 --rate 100 --timeout 1 run smoothly without errors.

However if i set --rate to 200 i get it to crash : httperf --server 127.0.0.1 --port 80 --num-conns 1000 --rate 200 --timeout 1

With the same kind of output for kfcgi :

[xse@rpi ~]$ sudo kfcgi -d -v -U http -u http -s /run/httpd.sock -p / -- /srv/http/kcgi
pwrappers.c:283: poll: POLLHUP
kcgi: output.c:342: khttp_write: Assertion `KSTATE_BODY == req->kdata->state' failed.
wrappers.c:283: poll: POLLHUP
kcgi: output.c:342: khttp_write: Assertion `KSTATE_BODY == req->kdata->state' failed.
wrappers.c:283: poll: POLLHUP
kcgi: output.c:342: khttp_write: Assertion `KSTATE_BODY == req->kdata->state' failed.
wrappers.c:283: poll: POLLHUP
kcgi: output.c:342: khttp_write: Assertion `KSTATE_BODY == req->kdata->state' failed.
wrappers.c:283: poll: POLLHUP
kcgi: output.c:342: khttp_write: Assertion `KSTATE_BODY == req->kdata->state' failed.
wrappers.c:467: read: unexpected eof: read 0 of 2 bytes
fcgi.c:289: failed to read FastCGI cookie
kfcgi[9274]: worker unexpectedly exited
wrappers.c:467: read: unexpected eof: read 0 of 2 bytes
fcgi.c:289: failed to read FastCGI cookie
wrappers.c:467: read: unexpected eof: read 0 of 2 bytes
wrappers.c:467: read: unexpected eof: read 0 of 2 bytes
fcgi.c:289: failed to read FastCGI cookie
fcgi.c:289: failed to read FastCGI cookie
wrappers.c:467: read: unexpected eof: read 0 of 2 bytes
fcgi.c:289: failed to read FastCGI cookie
[xse@rpi ~]$

EDIT : this seems to only appears when using kfcgi, if i use nginx+uwsgi to run a program i don't get any errors.

kristapsdz commented 6 years ago

I can't replicate this on OpenBSD/AMD64 with the same nginx configuration and all manner of httperf invocations. Can you please try this with tutorial2 asserting on its errors? It makes sense that something high-level is failing because the assertion would raise IFF khttp_body fails somehow.

#include <sys/types.h> /* size_t, ssize_t */
#include <assert.h>
#include <stdarg.h> /* va_list */
#include <stdio.h> /* va_list */
#include <stddef.h> /* NULL */
#include <stdint.h> /* int64_t */
#include <stdlib.h> /* EXIT_FAILURE, etc. */
#include <kcgi.h>

int
main(void)
{
  struct kreq req;
  struct kfcgi *fcgi;
  enum kcgi_err er;

  fprintf(stderr, "Worker starting...\n");
  if (KCGI_OK != khttp_fcgi_init(&fcgi, NULL, 0, NULL, 0, 0))
    return(EXIT_FAILURE);
  fprintf(stderr, "Worker started.\n");

  while (KCGI_OK == khttp_fcgi_parse(fcgi, &req)) {
    er = khttp_head(&req, kresps[KRESP_STATUS], 
      "%s", khttps[KHTTP_200]);
    assert(KCGI_OK == er);
    er = khttp_head(&req, kresps[KRESP_CONTENT_TYPE], 
      "%s", kmimetypes[req.mime]);
    assert(KCGI_OK == er);
    er = khttp_body(&req);
    assert(KCGI_OK == er);
    er = khttp_puts(&req, "Hello, world!\n");
    assert(KCGI_OK == er);
    khttp_free(&req);
  }

  fprintf(stderr, "Worker exiting...\n");
  er = khttp_fcgi_free(fcgi);
  assert(KCGI_OK == er);
  fprintf(stderr, "Worker exited.\n");
  return(EXIT_SUCCESS);
}
xse commented 6 years ago

Yep, i don't get any of those exit messages :

[xse@rpi http]$ sudo kfcgi -d -U http -u http -s /run/httpd.sock -p / -- /srv/http/kcgi
Worker starting...
Worker started.
Worker starting...
Worker starting...
Worker starting...
Worker started.
Worker starting...
Worker started.
Worker started.
Worker started.
wrappers.c:283: poll: POLLHUP
kcgi: kcgi.c:30: main: Assertion `KCGI_OK == er' failed.
wrappers.c:283: poll: POLLHUP
kcgi: kcgi.c:30: main: Assertion `KCGI_OK == er' failed.
wrappers.c:283: poll: POLLHUP
kcgi: kcgi.c:30: main: Assertion `KCGI_OK == er' failed.
wrappers.c:283: poll: POLLHUP
kcgi: kcgi.c:30: main: Assertion `KCGI_OK == er' failed.
wrappers.c:283: poll: POLLHUP
kcgi: kcgi.c:30: main: Assertion `KCGI_OK == er' failed.
wrappers.c:467: read: unexpected eof: read 0 of 2 bytes
fcgi.c:289: failed to read FastCGI cookie
kfcgi[565]: worker unexpectedly exited
wrappers.c:467: read: unexpected eof: read 0 of 2 bytes
fcgi.c:289: failed to read FastCGI cookie
wrappers.c:467: read: unexpected eof: read 0 of 2 bytes
fcgi.c:289: failed to read FastCGI cookie
wrappers.c:467: read: unexpected eof: read 0 of 2 bytes
fcgi.c:289: failed to read FastCGI cookie
wrappers.c:467: read: unexpected eof: read 0 of 2 bytes
fcgi.c:289: failed to read FastCGI cookie
[xse@rpi http]$

But something i really don't get is that if i run an strace on the exact same command i get those messages :

[xse@rpi http]$ sudo strace -f -o foo/kstrace.txt kfcgi -d -U http -u http -s /run/httpd.sock -p / -- /srv/http/kcgi
Worker starting...
Worker started.
Worker starting...
Worker started.
Worker starting...
Worker started.
Worker starting...
Worker started.
Worker starting...
Worker started.
fcgi.c:238: work request empty read
wrappers.c:343: poll: POLLHUP
fcgi.c:238: work request empty read
wrappers.c:343: poll: POLLHUP
fcgi.c:238: work request empty read
wrappers.c:343: poll: POLLHUP
fcgi.c:685: application signalled
Worker exiting...
wrappers.c:181: child status 1
wrappers.c:181: child status 1
Worker exited.
kfcgi[748]: worker unexpectedly exited
Worker exiting...
wrappers.c:181: child status 1
wrappers.c:181: child status 1
Worker exited.
Worker exiting...
wrappers.c:181: child status 1
wrappers.c:181: child status 1
Worker exited.
wrappers.c:283: poll: POLLHUP
kcgi: kcgi.c:30: main: Assertion `KCGI_OK == er' failed.
Worker exiting...
wrappers.c:458: read: 3, 2: Connection reset by peer
fcgi.c:289: failed to read FastCGI cookie
wrappers.c:343: poll: POLLHUP
wrappers.c:181: child status 1
wrappers.c:181: child status 1
Worker exited.
wrappers.c:467: read: unexpected eof: read 0 of 2 bytes
fcgi.c:289: failed to read FastCGI cookie
[xse@rpi http]$

I get that strace can modify some timing and so on but i wasn't expecting it to write those messages. What it look like in my logs :

Mar 24 13:42:07 rpi sudo[801]:      xse : TTY=pts/2 ; PWD=/srv/http ; USER=root ; COMMAND=/usr/bin/kfcgi -d -U http -u http -s /run/httpd.sock -p / -- /srv/http/kcgi
Mar 24 13:42:07 rpi sudo[801]: pam_unix(sudo:session): session opened for user root by (uid=0)
Mar 24 13:42:13 rpi kfcgi[802]: worker unexpectedly exited
Mar 24 13:42:13 rpi systemd[1]: Started Process Core Dump (PID 819/UID 0).
Mar 24 13:42:13 rpi systemd[1]: Started Process Core Dump (PID 818/UID 0).
Mar 24 13:42:14 rpi systemd[1]: Started Process Core Dump (PID 821/UID 0).
Mar 24 13:42:14 rpi systemd[1]: Started Process Core Dump (PID 820/UID 0).
Mar 24 13:42:15 rpi systemd-coredump[822]: Process 806 (kcgi) of user 33 dumped core.

                                           Stack trace of thread 806:
                                           #0  0x000000000002dd48 raise (/srv/http/kcgi)
Mar 24 13:42:15 rpi systemd-coredump[823]: Process 804 (kcgi) of user 33 dumped core.

                                           Stack trace of thread 804:
                                           #0  0x000000000002dd48 raise (/srv/http/kcgi)
Mar 24 13:42:16 rpi nginx[513]: 2018/03/24 13:42:16 [error] 515#515: *315 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 192.16>
Mar 24 13:42:16 rpi sudo[801]: pam_unix(sudo:session): session closed for user root
Mar 24 13:42:16 rpi systemd-coredump[824]: Process 803 (kcgi) of user 33 dumped core.

                                           Stack trace of thread 803:
                                           #0  0x000000000002dd48 raise (/srv/http/kcgi)
Mar 24 13:42:16 rpi systemd-coredump[825]: Process 807 (kcgi) of user 33 dumped core.

                                           Stack trace of thread 807:
                                           #0  0x000000000002dd48 raise (/srv/http/kcgi)
Mar 24 13:42:20 rpi sudo[838]:      xse : TTY=pts/2 ; PWD=/srv/http ; USER=root ; COMMAND=/usr/bin/strace -f -o foo/kstrace.txt kfcgi -d -U http -u http -s /run/httpd.sock -p / -->
Mar 24 13:42:20 rpi sudo[838]: pam_unix(sudo:session): session opened for user root by (uid=0)
Mar 24 13:42:23 rpi kfcgi[841]: worker unexpectedly exited
Mar 24 13:42:23 rpi nginx[513]: 2018/03/24 13:42:23 [error] 515#515: *325 connect() to unix:/run/httpd.sock failed (111: Connection refused) while connecting to upstream, client: >
Mar 24 13:42:23 rpi nginx[513]: 2018/03/24 13:42:23 [error] 515#515: *327 connect() to unix:/run/httpd.sock failed (111: Connection refused) while connecting to upstream, client: >
Mar 24 13:42:23 rpi sudo[838]: pam_unix(sudo:session): session closed for user root
kristapsdz commented 6 years ago

Is it possible for me to have access to this machine somehow?

xse commented 6 years ago

Yeah of course, I've sent you an email on the address displayed on your github page. I hope the connection won't be too slow. edit : i've opened port 22 & 80 on my router

kristapsdz commented 6 years ago

Mystery solved. I still need to fix it, but I now know what it is.

Short answer: nginx will immediately kill connections to the FastCGI socket once the upstream request is terminated. If kcgi is still trying to read or write the response, it exits out.

This happens on the browser (when continuously holding refresh) or httperf (which we were giving a timeout of one second) because nginx would queue connections until one of them took longer than a second, would kill the connection, and kcgi would try to handle the dead socket.

So in many ways, kcgi is doing exactly what it should do. Sees the the socket is dead and exits.

What I need to do is give kcgi a different error code for when the connection closes (a HUP, which is recoverable) and when the process is signaled by the manager to exit. This will allow the front-end to notice process KCGI_HUP in the top-level FastCGI loop by clearing the request and waiting for the next one. Right now, KCGI_HUP is the same code for when the manager tries to kill the process.

xse commented 6 years ago

Awesome !

kristapsdz commented 6 years ago

Ok, this required some serious retooling. Can you check out the latest HEAD and see if it works for you? I've been running the current build on nginx and httpd with all manner of httperf and haven't been able to break anything. Whew!

xse commented 6 years ago

Yep, using the source from git i'm not able to get it working. Here is the output of tutorial2 asserting on its errors :

[xse@rpi:/srv/http]$ sudo kfcgi -d -U http -u http -s /run/httpd.sock -p / -- /srv/http/kcgi
Worker starting...
 [ ... ]
Worker started.
wrappers.c:485: read: unexpected eof: read 0 of 4 bytes
child.c:1690: FastCGI: error reading frame size from control
child.c:2093: FastCGI: unrecoverable error at start sequence
fcgi.c:754: fullreadfd: hangup (terminating)
Worker exiting...
wrappers.c:185: child signal 31
Worker exited.
kfcgi[3810]: worker unexpectedly exited
fcgi.c:742: fcgi_waitread: exit request
 [ ... ]
child.c:2077: FastCGI: worker termination
Worker exited.
[xse@rpi:/srv/http]$

This is happening as soon as i load the webpage, so i can't try reloading it.

By the way I'm getting the same kind of errors on 'make regress' everything is building fine but test-fcgi-bigfiles gives the same output :

[xse@rpi:/home/xse/kcgi-git]$ ./src/kcgi/regress/test-fcgi-bigfile
wrappers.c:485: read: unexpected eof: read 0 of 4 bytes
child.c:1690: FastCGI: error reading frame size from control
child.c:2093: FastCGI: unrecoverable error at start sequence
fcgi.c:754: fullreadfd: hangup (terminating)
wrappers.c:185: child signal 31
[xse@rpi:/home/xse/kcgi-git]$
kristapsdz commented 6 years ago

It's the seccomp sandbox. (Looking with the SECCOMP debug stuff.) I'm looking into it now!

kristapsdz commented 6 years ago

Linux is such a nightmare... Ok, I've now tested this on the arm machine with both regress and also a series of httperf tests. The fixed sandbox did it. Thanks so much for taking the time with this! This coming release is basically all FastCGI due to this. Do you mind if I credit you by name, or would you rather by GitHub id?

xse commented 6 years ago

Awesome ! Indeed I'm not able to make it crash with httperf anymore, using the arguments that made it crash before. I was confused because the tutorial2 asserting on its errors still made crash kfcgi when holding F5, but i tried without the assertions and it's working great ! You can use my name it's fine, i'm not sure that i'm able to close this issue by myself maybe only you can do that. Anyway, thanks a lot :)

kristapsdz commented 6 years ago

Great! Yes, the assertions will cause it to fail---the new manpages will discuss exactly which errors can be returned, and why, in the FastCGI case. In general, now, the KCGI_HUP return value means that the connection has closed, and the FastCGI application should close the connection with khttp_free and then re-enter the khttp_fcgi_parse loop. Thanks again for all of your help!